katharina_l
Level III

Get distinct entries from a list?

Is there a simple way to obtain the distinct entries in a list (similar to summarizing a table by a grouping column to get the distinct entries in that column)?
The result could be either a reduced list containing only the distinct entries (repeats removed) or just the number of distinct entries.
Before writing a script, I would like to check whether there is a simple way, function, or formula. I couldn't find anything in the JMP help ...

1 ACCEPTED SOLUTION

txnelson
Super User

Re: Get distinct entries from a list?

One way is to use an Associative Array:

Names Default To Here( 1 );

// Create a list from the :sex column of the Big Class sample table
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
genderList = :sex << Get Values;

// Get the distinct list of values
distinctList = (Associative Array( genderList )) << Get Keys;
// Result: {"F", "M"}

 

Jim


14 REPLIES 14
katharina_l
Level III

Re: Get distinct entries from a list?

Awesome, this works! Thank you very much, Jim!

hogi
Level XIII

Re: Get distinct entries from a list?

The associative array approach is easy to script, remember, and apply.
Take care, though, if you want the distinct values of a list with millions of entries.

 

A workaround:
save the values to a table and use either Summarize or the Tables > Summary command that you mentioned above.
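
A minimal sketch of that workaround, assuming the values are strings. The list contents, the table name "tmp values", and the column name "value" are made up for illustration, and this has not been timed against the associative array approach:

```jsl
Names Default To Here( 1 );

// Hypothetical input list; any list of strings would do
valueList = {"a", "b", "a", "c", "b"};

// Save the values to a temporary, invisible table
dt = New Table( "tmp values",
	invisible,
	New Column( "value", Character, Set Values( valueList ) )
);

// Summarize collects the distinct levels of the By column into a list
Summarize( dt, uniqueValues = By( :value ) );
nUnique = N Items( uniqueValues );

// Discard the temporary table
Close( dt, NoSave );
```

After the script runs, uniqueValues holds the distinct entries and nUnique their count.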

 

Maybe, in the future, there will be a direct, fast way in JMP to calculate unique values?
I added this hope as a subtopic to the wish "Col N Categories - and all the others ...".
Please support the idea and vote :)

katharina_l
Level III

Re: Get distinct entries from a list?

Actually, my problem is that I want to detect the unique entries in a sequence of strings in a table. I attached an example: originally I have the column "sequence" (the delimiter is arbitrary; here it is "|"). My current workaround is to transform this into a list and then apply Jim's solution with the associative array. But an easy way to calculate unique values would be highly appreciated ...
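
The workaround described above could look roughly like this for a single value. The string is a made-up example, and the "|" delimiter is taken from the description:

```jsl
Names Default To Here( 1 );

// Hypothetical example value of the "sequence" column
sequence = "A|B|A|C|B";

// Split at the delimiter, then keep one key per distinct entry
entries = Words( sequence, "|" );
uniqueEntries = Associative Array( entries ) << Get Keys;

// Number of distinct entries (3 for this example)
nUnique = N Items( uniqueEntries );
```

As a column formula, the same idea would presumably reduce to N Items( Associative Array( Words( :sequence, "|" ) ) ), assuming the column is named "sequence".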

hogi
Level XIII

Re: Get distinct entries from a list?

Oh yeah, this is where the associative arrays hurt!
At the end you count the entries. Is this what you need - or do you also need the intermediate step (with the list of unique entries)?


A quick and easy way to handle unique values: let's hope that the JMP developers recognize this topic as something really useful, so that it gets implemented in the next release ...

katharina_l
Level III

Re: Get distinct entries from a list?

For my current application, I only need the number of distinct entries, not the values themselves. But I think there might be many more situations where it would be helpful to reduce a list to its unique entries. 


And yeah, let's hope for implementation in the next release :)

hogi
Level XIII

Re: Get distinct entries from a list?

Regarding speed, I just checked whether the intermediate step gets faster if I replace the associative array part with a simple

If(not(contains(), insert into(mylist ... - no benefit.

 

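New Column( "unique entries",
	Expression,
	Formula(
		Local( {nr = N Rows( Current Data Table() )},
			// Signal when the last row has been evaluated
			If( Row() == nr,
				Caption( "done" )
			);
			// :variant selects the method to be timed
			Match( :variant,
				// 1: associative array, the keys are the distinct entries
				1, Associative Array( :list from sequence[Empty()] ) << get keys,
				// 2: manual loop, append an entry only if myList does not contain it yet
				2,
					myList = {};
					For Each( {entry}, :list from sequence,
						If( !Contains( myList, entry ),
							Insert Into( myList, entry )
						)
					);
					myList;
			);
		)
	)
)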


For 1 million rows, I get:

hogi_2-1733302159526.png

So 1:0 for associative arrays. Any better idea?

 

[Another 2 + 2 seconds are needed for the columns :list from sequence and :N unique entries.
So there is no real benefit in getting the intermediate step into the millisecond regime.]

hogi
Level XIII

Re: Get distinct entries from a list?

- easy
- fast

and:
- not greedy

 

I just tried to check the timing with 10 million rows and got stuck.
JMP crashed, and the last 60 seconds looked like this:

hogi_0-1733308211187.png

hogi
Level XIII

Re: Get distinct entries from a list?

... so, I just tried it with 5 million rows and got an interesting insight:

hogi_1-1733308275918.png

When I save the table, close JMP and load the table again, the memory usage is significantly less than what is needed to calculate the values!

At first sight, I thought it was due to the Associative Array, but the same thing happens when I use myList.

How can I prevent the column formula from eating my memory?

TS-00177710
