Solved: Re: Get unique column values & use them for filtering / cleaning up in a differe...

Ressel · Jun 9, 2023 11:05 AM

Hi, what would be the approach for using, for example, the keys in an array for selecting or filtering rows with the same keys, but in a different JMP table. I have two tables. One of them maps sample IDs for chemical testing to different factor settings (temperature, time, study nr., etc.). I am successfully getting the sample IDs in an array. (Sorry, I'm a total newbie to this.)

smplID = Associative array(:sampleid) << Get Keys;

Now I want to use these keys to select rows in the second table (where my responses are), invert the row selection and delete the rows. This reference was, so far, getting me nowhere. Perhaps the solution below, as described here, is better suited to start with?

Summarize(dt, smpleID = by(:sampleid));

Or maybe this approach is preferable?

dt << Summary (Group ( :sampleid ), Frequ( "None" ), Weight( "None" );

Anyway, what I really want to know is, how I may use any of these approaches to effectively cleanse a data table for rows I don't need, before I proceed any further with data analysis. I guess I need to learn how to loop.

Yes, I am trying to read through the Scripting Guide whenever possible. I am just not able to prioritize it.

jthi · Nov 13, 2021 01:59 AM

To get unique values, all those are valid methods and their speeds will depend on the amount of data you have. One think to notice, is that if you want to take unique Numeric values and you use Summarize, you will have to convert them back to numeric (Summarize returns them as strings).

Some different ways to get unique values:

Names Default To Here(1);
dt = Open("$SAMPLE_DATA/Big Class.jmp");

aa_uniq = Associative Array(dt:age) << get keys;
Summarize(dt, summarize_uniq = by(:age));
dt_summary = dt << Summary(Group(:age), Freq("None"), Weight("None"), private);
summary_uniq = Column(dt_summary, 1) << get as matrix;
Close(dt_summary, no save);
Show(aa_uniq, summarize_uniq, summary_uniq);

To select values from datatable you can use Contains with select where. There might be no need to invert row selection as you can use ! (or Not()) to negate the result you get from functions. Waits are here just for demonstrations sake:

Names Default To Here(1);
dt = Open("$SAMPLE_DATA/Big Class.jmp");

names_of_interest = {"KATIE", "LILLIE", "JOHN"};

wait(2);
dt << Select Where(Contains(names_of_interest, :name)) << Invert Row Selection << Hide and Exclude(1);// << clear select;
//Remove selections
wait(2);
dt << Clear Row States;
wait(2);
dt << Select Where(!Contains(names_of_interest, :name)) << Hide and Exclude(1);// << clear select;

Also if you have a lot of data, this might not be the most efficient way to do this, but is most likely the simplest and I would start with it.

-Jarmo

View solution in original post

jthi · Nov 13, 2021 01:59 AM

To get unique values, all those are valid methods and their speeds will depend on the amount of data you have. One think to notice, is that if you want to take unique Numeric values and you use Summarize, you will have to convert them back to numeric (Summarize returns them as strings).

Some different ways to get unique values:

Names Default To Here(1);
dt = Open("$SAMPLE_DATA/Big Class.jmp");

aa_uniq = Associative Array(dt:age) << get keys;
Summarize(dt, summarize_uniq = by(:age));
dt_summary = dt << Summary(Group(:age), Freq("None"), Weight("None"), private);
summary_uniq = Column(dt_summary, 1) << get as matrix;
Close(dt_summary, no save);
Show(aa_uniq, summarize_uniq, summary_uniq);

To select values from datatable you can use Contains with select where. There might be no need to invert row selection as you can use ! (or Not()) to negate the result you get from functions. Waits are here just for demonstrations sake:

Names Default To Here(1);
dt = Open("$SAMPLE_DATA/Big Class.jmp");

names_of_interest = {"KATIE", "LILLIE", "JOHN"};

wait(2);
dt << Select Where(Contains(names_of_interest, :name)) << Invert Row Selection << Hide and Exclude(1);// << clear select;
//Remove selections
wait(2);
dt << Clear Row States;
wait(2);
dt << Select Where(!Contains(names_of_interest, :name)) << Hide and Exclude(1);// << clear select;

Also if you have a lot of data, this might not be the most efficient way to do this, but is most likely the simplest and I would start with it.

-Jarmo

Ressel · Nov 13, 2021 9:06 AM

Thank you, this helped me A LOT! I am not sure what "a lot of data" really means. The first table I am opening, today contains about 400 rows, but this number is bound to grow with the ongoing implementation of a laboratory information management system (LIMS) in our organization. I am guessing over time there'll be thousands, perhaps even as many as ten thousands of rows. The second table where I am "cleaning" away all, for me, irrelevant sample IDs already has some 50 000 rows.

As of now, having implemented your solution, I am working with an array of sample IDs. (Although I feel my skills are extremely limited, scripting still feels extremely valuable, and satisfying when finding a solution, even when it is "ripped" from the community. So thank you very much again.)

jthi · Nov 13, 2021 12:59 PM

These methods will be fast for that amount of rows!

There is one more solution that came to my mind, you can first create Summary table and then Join that to the table you want to remove found values from.

Using JMPs Cars.jmp and Cars 1993.jmp sample datas as example. Cars 1993 will be one data we will filter Cars with by using Model column.

1. Create Summary table from Cars 1993 with Model as Group

2. Join that new summary table to Cars table with following settings:

3. From new Match Flag column you can see two values: Both and Main. Both means that the Model can be found from both of datasets

I think you can do this directly with Cars 1993 but you have to be careful with duplicates (Drop multiples With Table might solve that, not sure). Also in this case you would want to choose columns you want to keep, as there will be duplicates (at least with this example).

These both can also be scripted fairly easily as JMP can do most of the scripting.

-Jarmo

Ressel · Nov 15, 2021 02:38 AM

That's the thing. I need to be very careful with the table manipulations. Otherwise I might misplace some replicate measurements. Hence I am trying to clean the tables in advance, since this helps maintaining an overview.

Get unique column values & use them for filtering / cleaning up in a different table

Re: Get unique column values & use them for filtering / cleaning up in a different table

Re: Get unique column values & use them for filtering / cleaning up in a different table

Re: Get unique column values & use them for filtering / cleaning up in a different table

Re: Get unique column values & use them for filtering / cleaning up in a different table

Re: Get unique column values & use them for filtering / cleaning up in a different table