Choose Language Hide Translation Bar
Highlighted
Level I

Row selection based on multiple columns and multiple criteria

Hey guys,

I read the available threads on row selection but still can't compile a formula to accomplish what I need (sorry - I am VERY inexperienced still..).

I have a table that looks like this:

 Person Compound Dose Units Smith Aspirin 200 mg Smith Aspirin 150 mg Jones Gabapentin 20 ug Jones Gabapentin 30 ug Jones NyQuil 60 mL Edison Motrin 600 mg Edison Motrin 600 mg

For each person, I need to look at column Compound and for each Compound, then look at doses and choose the row with higher dose. If one person had used two compounds I need to do dose evaluation for only a comp with multiple doses. So working my way through this table, I should get a new table as below:

 Person Compound Dose Units Smith Aspirin 200 mg Jones Gabapentin 30 ug Jones NyQuil 60 mL Edison Motrin 600 mg

2 ACCEPTED SOLUTIONS

Accepted Solutions
Highlighted
Super User

Re: Row selection based on multiple columns and multiple criteria

You can do this with a tabulation:

 Person Compound Max Edison Motrin 600 Jones Gabapentin 30 NyQuil 60 Smith Aspirin 200

Click on Analyze > Tabulate (JMP 11) or Tables > Tabulate (JMP 10, 9, 8)

Drag Person to the drop zone for rows.

Drag Compound just to the right of Person

Drag Dose on top of N

Drag Max on top of Sum

Click Done

Click the little red triangle and select Make into data table

You might have to normalize the units if that's a consideration.

Highlighted
Super User

Re: Row selection based on multiple columns and multiple criteria

I'm old fashioned and my first impulse is to use the older Tables > Summary approach which is very similar to using Tabulate.

1. Tables > Summary

2. Drag Person, then Compound into the Group box

3. Click Dose and then click the Statistics button and select Max

4. Click OK

or as a script:

dt = current data table();

dt << Summary( Group( :Person, :Compound ), Max( :Dose ) )

Michael

6 REPLIES 6
Highlighted
Super User

Re: Row selection based on multiple columns and multiple criteria

You can do this with a tabulation:

 Person Compound Max Edison Motrin 600 Jones Gabapentin 30 NyQuil 60 Smith Aspirin 200

Click on Analyze > Tabulate (JMP 11) or Tables > Tabulate (JMP 10, 9, 8)

Drag Person to the drop zone for rows.

Drag Compound just to the right of Person

Drag Dose on top of N

Drag Max on top of Sum

Click Done

Click the little red triangle and select Make into data table

You might have to normalize the units if that's a consideration.

Highlighted
Super User

Re: Row selection based on multiple columns and multiple criteria

I'm old fashioned and my first impulse is to use the older Tables > Summary approach which is very similar to using Tabulate.

1. Tables > Summary

2. Drag Person, then Compound into the Group box

3. Click Dose and then click the Statistics button and select Max

4. Click OK

or as a script:

dt = current data table();

dt << Summary( Group( :Person, :Compound ), Max( :Dose ) )

Michael

Highlighted
Community Manager

Re: Row selection based on multiple columns and multiple criteria

PMroz has a good solution.

Here's another.

Sort your table by Name, Compound, (ascending) Dose.

Then create a Row State column (Cols -> New Column).

The formula should be:

`Selected State( :Compound != :Compound[Row() + 1] )`

In the formula editor it looks like this:

This will select any row where the compound name is different from the compound name in the next row. Since the table is sorted by person, compound, and dose these rows will represent the maximum dose for each person.

You can then copy these selections to the active row state area using the pop-up menu next to the row state column name.

The rows will be selected and you can use Tables->Subset to get a new data table with just the selected rows.

I've attached a data table that shows this.

-Jeff

-Jeff
Highlighted
Staff

Re: Row selection based on multiple columns and multiple criteria

It looks like the logic for selecting the flagged rows might need to be a little more specific?

If( :Person == Lag( :Person, 1 ) & :Compound == Lag( :Compound, 1 ), 0, 1)

Try running this script against the dose table in Jeff's post, then run the sort script first, followed by the subset script.

dt=current data table();

dt<<New Script(  "Sort First",  dt = Current Data Table();  dt << Sort(  replace table,

By( :Person, :Compound, :Dose ),  Order( Ascending, Ascending, Ascending )););

dt<<New Script(  "subset flagged rows",  dt = Current Data Table();  dt << Select Where( :Flag == 1 );

dt << Subset(  Output Table( "Subset of dose bw" ),  Selected Rows( 0 ),  Rows( [1, 3, 5, 6] ),  Selected columns only( 0 )););

dt<<New Column( "Flag",

Numeric,  Nominal,  Format( "Best", 12 ),

Formula(  If( :Person == Lag( :Person, 1 ) & :Compound == Lag( :Compound, 1 ),

0,  1  )  ));

JMP Systems Engineer, Pharm and BioPharm Sciences
Highlighted
Community Manager

Re: Row selection based on multiple columns and multiple criteria

Byron is right. I was a little too quick and forgot that if two people in sequence are taking the same compound then my condition won't catch the first one.

Here's a corrected formula and data table.

`Selected State( :Compound != :Compound[Row() + 1] | :Person != :Person[Row() + 1] )`

This time it selects the row if the compound or the person doesn't match the next row.

-Jeff
Highlighted
Level I

Re: Row selection based on multiple columns and multiple criteria

Thank you, guys!

I tried all three solutions - and got the table I wanted.

It is way simpler than I thought - and I love summary and tabulate functions!!

Thanks again

Tania

Article Labels

There are no labels assigned to this post.