Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Discussions
- :
- count(distinct val)

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 10, 2013 3:07 AM
(4887 views)

Hello All,

Any function in Jmp similar to count distinct in sql?

Examples:

How can I obtain the values of count distinct subjectsid in a new column in these tables with functions?

Thanks

TABLA 1

subjectid count distinct subjectid

a 5

b 5

c 5

a 5

b 5

d 5

e 5

TABLA 2

subjectid count distinct subjectid

a 3

b 3

c 3

a 3

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 10, 2013 7:42 AM
(7536 views)

Solution

I can't readily think of a way to have a formula accomplish a COUNT DISTINCT. However, you can get the value via a couple Table Summaries and a Table Join, which you can then code into JSL if you choose.

In Table A, Select Table --> Summary, Group By subjectid, select OK. In the summary table, select Table --> Summary, Select subjectid, and select "N" under Statistics, select OK. In the new table, delete the "N Rows" Column. Go back to the first table, and select Table --> Join. Then select to join with the last summary table with matching specification of Cartesian Join, then select OK. The distinct count will show up in your original table.

If you were to script it:

dt = Current Data Table**()**;

sumdt = dt << Summary**(** Group**(** :subjectid **)**, invisible **)**;

sumsumdt = sumdt << Summary**(** Group, N**(** :subjectid **)**, invisible **)**;

count_distinctdt = dt << Join**(**

With**(** sumsumdt **)**,

Cartesian Join

**)**;

count_distinctdt << delete columns**(**"N Rows"**)**;

column**(**"N(subjectid)"**)** << set name**(**"count distinct subjectid"**)**;

sumsumdt << close window;

sumdt << close window;

close**(**dt,no save**)**;

6 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 10, 2013 7:42 AM
(7537 views)

I can't readily think of a way to have a formula accomplish a COUNT DISTINCT. However, you can get the value via a couple Table Summaries and a Table Join, which you can then code into JSL if you choose.

In Table A, Select Table --> Summary, Group By subjectid, select OK. In the summary table, select Table --> Summary, Select subjectid, and select "N" under Statistics, select OK. In the new table, delete the "N Rows" Column. Go back to the first table, and select Table --> Join. Then select to join with the last summary table with matching specification of Cartesian Join, then select OK. The distinct count will show up in your original table.

If you were to script it:

dt = Current Data Table**()**;

sumdt = dt << Summary**(** Group**(** :subjectid **)**, invisible **)**;

sumsumdt = sumdt << Summary**(** Group, N**(** :subjectid **)**, invisible **)**;

count_distinctdt = dt << Join**(**

With**(** sumsumdt **)**,

Cartesian Join

**)**;

count_distinctdt << delete columns**(**"N Rows"**)**;

column**(**"N(subjectid)"**)** << set name**(**"count distinct subjectid"**)**;

sumsumdt << close window;

sumdt << close window;

close**(**dt,no save**)**;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 11, 2013 1:30 PM
(4398 views)

A simple though inefficient formula is:

Summation( i = 1, N Row(), :subjectid* == :subjectid)*

It takes advantage of the fact that true is 1 and false is 0, so summing up the comparisons gives a count.

Using the Table Summary method as TheChadd describes or using JSL will be faster if the table is really big.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 20, 2017 5:28 AM
(3595 views)

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 20, 2017 6:45 AM
(3588 views)

Sadly, I can't find the **N Categories()** function in the Scripting Index.

Luckily I would be satisfied with any of the previous solutions.

```
// Open an example
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
// Use the Tables > Summary command approach
dt << Summary(
Group( :age ),
Freq( "None" ),
Weight( "None" )
);
// Use the Analyze > Tabulate approach
dt << Tabulate(
Show Control Panel( 0 ),
Add Table( Row Table( Grouping Columns( :age ) ) )
);
// Use the JSL Summarize() function approach
Summarize(
group = By( :age ),
n = Count()
);
```

Learn it once, use it forever!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 20, 2017 11:29 AM
(3569 views)

I was puzzled by @Beaux's response as well but I think he's referring to the N Categories summary statistic in **Tables->Summary**.

-Jeff

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 20, 2017 1:16 PM
(3560 views)