cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
unbbmas
Level I

count(distinct val)

Hello All,

Any function in  Jmp similar to count distinct in sql?

Examples:

How can I obtain the values of count distinct subjectsid in a new column in these tables with functions?

Thanks

TABLA 1               

subjectid             count distinct subjectid

a                                              5

b                                             5

c                                              5

a                                              5

b                                             5

d                                             5

e                                             5

               

TABLA 2               

subjectid             count distinct subjectid

a                                              3

b                                             3

c                                              3

a                                              3

1 ACCEPTED SOLUTION

Accepted Solutions

Re: count(distinct val)

I can't readily think of a way to have a formula accomplish a COUNT DISTINCT.  However, you can get the value via a couple Table Summaries and a Table Join, which you can then code into JSL if you choose.

In Table A, Select Table --> Summary, Group By subjectid, select OK.  In the summary table, select Table --> Summary, Select subjectid, and select "N" under Statistics, select OK.  In the new table, delete the "N Rows" Column.  Go back to the first table, and select Table --> Join.  Then select to join with the last summary table with matching specification of Cartesian Join, then select OK.  The distinct count will show up in your original table.

If you were to script it:

dt = Current Data Table();

sumdt = dt << Summary( Group( :subjectid ), invisible );

sumsumdt = sumdt << Summary( Group, N( :subjectid ), invisible );

count_distinctdt = dt << Join(

     With( sumsumdt ),

     Cartesian Join

);

count_distinctdt << delete columns("N Rows");

column("N(subjectid)") << set name("count distinct subjectid");

sumsumdt << close window;

sumdt << close window;

close(dt,no save);

View solution in original post

9 REPLIES 9

Re: count(distinct val)

I can't readily think of a way to have a formula accomplish a COUNT DISTINCT.  However, you can get the value via a couple Table Summaries and a Table Join, which you can then code into JSL if you choose.

In Table A, Select Table --> Summary, Group By subjectid, select OK.  In the summary table, select Table --> Summary, Select subjectid, and select "N" under Statistics, select OK.  In the new table, delete the "N Rows" Column.  Go back to the first table, and select Table --> Join.  Then select to join with the last summary table with matching specification of Cartesian Join, then select OK.  The distinct count will show up in your original table.

If you were to script it:

dt = Current Data Table();

sumdt = dt << Summary( Group( :subjectid ), invisible );

sumsumdt = sumdt << Summary( Group, N( :subjectid ), invisible );

count_distinctdt = dt << Join(

     With( sumsumdt ),

     Cartesian Join

);

count_distinctdt << delete columns("N Rows");

column("N(subjectid)") << set name("count distinct subjectid");

sumsumdt << close window;

sumdt << close window;

close(dt,no save);

XanGregg
Staff

Re: count(distinct val)

A simple though inefficient formula is:

   Summation( i = 1, N Row(), :subjectid == :subjectid)

It takes advantage of the fact that true is 1 and false is 0, so summing up the comparisons gives a count.

Using the Table Summary method as TheChadd describes or using JSL will be faster if the table is really big.

Beaux
Level I

Re: count(distinct val)

I found this aggregate function useful:

 

N Categories()

Re: count(distinct val)

Sadly, I can't find the N Categories() function in the Scripting Index.

Luckily I would be satisfied with any of the previous solutions.

// Open an example
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Use the Tables > Summary command approach
dt << Summary(
	Group( :age ),
	Freq( "None" ),
	Weight( "None" )
);

// Use the Analyze > Tabulate approach
dt << Tabulate(
	Show Control Panel( 0 ),
	Add Table( Row Table( Grouping Columns( :age ) ) )
);

// Use the JSL Summarize() function approach
Summarize(
	group = By( :age ),
	n = Count()
);
Jeff_Perkinson
Community Manager Community Manager

Re: count(distinct val)

I was puzzled by @Beaux's response as well but I think he's referring to the N Categories summary statistic in Tables->Summary.

 

JMPScreenSnapz109.png

-Jeff
Beaux
Level I

Re: count(distinct val)

Strange that that function is not available in the formula editor (statistical) or JSL... Maybe next version.

hogi
Level XII

Re: count(distinct val)

As @Jeff_Perkinson says - really puzzling that there is a N categories functionality in Jmp - but that it's restricted to Tables/Summary ?!?

 

If other users are interested as well to have N Categories not just in the Tables/Summary Platform, but also via JSL ...
  - here is the wish:
new JSL function: Col N Categories 

hogi
Level XII

Re: count(distinct val)

Tables/ Summary: hogi_1-1720504221983.png

 

Tabulate: hogi_2-1720504251432.png

 

Header Statistics:  hogi_0-1720504202718.png

 

Graph Builder: NOPE

 

Col N Categories: NOPE

@Jeff_Perkinson , whom do we have to convince to get it implemented in JMP18.1?