cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar

Row selection based on multiple columns and multiple criteria

Hey guys,

I read the available threads on row selection but still can't compile a formula to accomplish what I need (sorry - I am VERY inexperienced still..).

I have a table that looks like this:

   

PersonCompoundDoseUnits
SmithAspirin200mg
SmithAspirin150mg
JonesGabapentin20ug
JonesGabapentin30ug
JonesNyQuil60mL
EdisonMotrin600mg
EdisonMotrin600mg

For each person, I need to look at column Compound and for each Compound, then look at doses and choose the row with higher dose. If one person had used two compounds I need to do dose evaluation for only a comp with multiple doses. So working my way through this table, I should get a new table as below: 

PersonCompoundDoseUnits
SmithAspirin200mg
JonesGabapentin30ug
JonesNyQuil60mL
EdisonMotrin600mg

Can someone please help? Will be very much appreciated!!!!!

2 ACCEPTED SOLUTIONS

Accepted Solutions
pmroz
Super User

Re: Row selection based on multiple columns and multiple criteria

You can do this with a tabulation:

PersonCompoundMax
EdisonMotrin600
JonesGabapentin30
NyQuil60
SmithAspirin200

Click on Analyze > Tabulate (JMP 11) or Tables > Tabulate (JMP 10, 9, 8)

Drag Person to the drop zone for rows.

Drag Compound just to the right of Person

Drag Dose on top of N

Drag Max on top of Sum

Click Done

Click the little red triangle and select Make into data table

You might have to normalize the units if that's a consideration.

View solution in original post

mpb
mpb
Level VII

Re: Row selection based on multiple columns and multiple criteria

I'm old fashioned and my first impulse is to use the older Tables > Summary approach which is very similar to using Tabulate.

1. Tables > Summary

2. Drag Person, then Compound into the Group box

3. Click Dose and then click the Statistics button and select Max

4. Click OK

or as a script:

dt = current data table();

dt << Summary( Group( :Person, :Compound ), Max( :Dose ) )

Michael

View solution in original post

6 REPLIES 6
pmroz
Super User

Re: Row selection based on multiple columns and multiple criteria

You can do this with a tabulation:

PersonCompoundMax
EdisonMotrin600
JonesGabapentin30
NyQuil60
SmithAspirin200

Click on Analyze > Tabulate (JMP 11) or Tables > Tabulate (JMP 10, 9, 8)

Drag Person to the drop zone for rows.

Drag Compound just to the right of Person

Drag Dose on top of N

Drag Max on top of Sum

Click Done

Click the little red triangle and select Make into data table

You might have to normalize the units if that's a consideration.

mpb
mpb
Level VII

Re: Row selection based on multiple columns and multiple criteria

I'm old fashioned and my first impulse is to use the older Tables > Summary approach which is very similar to using Tabulate.

1. Tables > Summary

2. Drag Person, then Compound into the Group box

3. Click Dose and then click the Statistics button and select Max

4. Click OK

or as a script:

dt = current data table();

dt << Summary( Group( :Person, :Compound ), Max( :Dose ) )

Michael

Jeff_Perkinson
Community Manager Community Manager

Re: Row selection based on multiple columns and multiple criteria

PMroz has a good solution.

Here's another.

Sort your table by Name, Compound, (ascending) Dose.

Then create a Row State column (Cols -> New Column).

7620_JMPScreenSnapz002.png

And add a Formula.

7621_JMPScreenSnapz003.png

The formula should be:


Selected State( :Compound != :Compound[Row() + 1] )


In the formula editor it looks like this:

7622_JMPScreenSnapz004.png

This will select any row where the compound name is different from the compound name in the next row. Since the table is sorted by person, compound, and dose these rows will represent the maximum dose for each person.

You can then copy these selections to the active row state area using the pop-up menu next to the row state column name.

7623_JMPScreenSnapz001.png

The rows will be selected and you can use Tables->Subset to get a new data table with just the selected rows.

I've attached a data table that shows this.

-Jeff

-Jeff
Byron_JMP
Staff

Re: Row selection based on multiple columns and multiple criteria

It looks like the logic for selecting the flagged rows might need to be a little more specific?

If( :Person == Lag( :Person, 1 ) & :Compound == Lag( :Compound, 1 ), 0, 1)

Try running this script against the dose table in Jeff's post, then run the sort script first, followed by the subset script.

dt=current data table();

dt<<New Script(  "Sort First",  dt = Current Data Table();  dt << Sort(  replace table,

  By( :Person, :Compound, :Dose ),  Order( Ascending, Ascending, Ascending )););

dt<<New Script(  "subset flagged rows",  dt = Current Data Table();  dt << Select Where( :Flag == 1 );

  dt << Subset(  Output Table( "Subset of dose bw" ),  Selected Rows( 0 ),  Rows( [1, 3, 5, 6] ),  Selected columns only( 0 )););

dt<<New Column( "Flag",

  Numeric,  Nominal,  Format( "Best", 12 ),

  Formula(  If( :Person == Lag( :Person, 1 ) & :Compound == Lag( :Compound, 1 ),

  0,  1  )  ));

JMP Systems Engineer, Health and Life Sciences (Pharma)
Jeff_Perkinson
Community Manager Community Manager

Re: Row selection based on multiple columns and multiple criteria

Byron is right. I was a little too quick and forgot that if two people in sequence are taking the same compound then my condition won't catch the first one.

Here's a corrected formula and data table.


Selected State( :Compound != :Compound[Row() + 1] | :Person != :Person[Row() + 1] )



7624_JMPScreenSnapz005.png

This time it selects the row if the compound or the person doesn't match the next row.

-Jeff

Re: Row selection based on multiple columns and multiple criteria

Thank you, guys!

I tried all three solutions - and got the table I wanted.

It is way simpler than I thought - and I love summary and tabulate functions!!

Thanks again

Tania