cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Stokes
Level IV

Have difficulty to find 2nd max, 3rd max value in 1 column etc

Hi JSL experts,

 

I have a big trouble to find the 2nd max, 3rd max etc value in 1 column.

For a col below, I hope to identify the top 5 max value, and mark the value out in a sperate column.

It seems I can only find the col max, but have a lot of difficulty to find the 2nd max, 3rd max etc.

Is there any standard formula for this solution?
Thanks

 

(100,

232,

343,

121,

100,

132,

100,

100,

100,

100,)

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Have difficulty to find 2nd max, 3rd max value in 1 column etc

Hi,

 

This depends on what you mean by "max", and whether you are willing to sort the table. I'll address a few common cases and hope one of them covers your use case.

 

Case 1: "max" means position in sorted dataset of original values, including duplicates.

You can use a built-in transform for this. Right-click on data in the column of interest and select New Formula Column > Distributional > Rank (Reverse Order).

 

Clicking on the formula that is created, we have the following, which works whether or not the table is sorted:

(Col Number( :Col1 ) - Col Rank( :Col1 )) + 1

brady_brady_0-1626387795830.png

 

Case 2: "Max" means position in sorted set of UNIQUE values (no duplication). Unwilling to pre-sort data.

***Note: due to its reliance on associative arrays, this approach only works with numeric data involving integers.

 

We can use the convoluted (and expensive) formula:

Loc(
	Matrix(
		Reverse(
			Associative Array( (:Col1 << get data table) :: col1 << getvalues ) << get keys
		)
	),
	:Col1
)

This formula does the following:

1) Gets the values from the column of interest and passes them to an Associative array, which removes duplicates.

2) Gets the associative array keys (in order from low to high by default) and reverses their order.

3) Uses the Loc() function to report the position of the given row within the matrix of reversed associative array keys.

 

 

Case 3: As in case 2, but we are willing to pre-sort (descending) by the column of interest.

 

In this case we can use a simpler and more efficient, albeit self-referential formula:

If( Row() == 1, 
	1,
	Lag( :Formula3 ) + (Lag( :Col1 ) != :Col1)
)

Below, we see the difference between the first formula and the next two. Again, only the 3rd formula is dependent on table sorting.

 

brady_brady_4-1626389373876.png

 

If you want to try this out, run the script below. It first appears unsorted, but after a one-second pause, it sorts by col1, at which point the results of the 3rd formula will match the results of the second.

 

dt = New Table( "Example",
	Add Rows( 20 ),
	New Column( "Col1", <<set values(J(20,1,randominteger(1,12)))) ,
	New Column( "Reverse Rank[Col1]", formula((Col Number( :Col1 ) - Col Rank( :Col1 )) + 1)),
	New Column( "Formula2",
		Formula(
			Loc(
				Matrix(
					Reverse(
						Associative Array( (:Col1 << get data table):col1  << getvalues)
						 << get keys
					)
				),
				:Col1
			)
		)
	),
	New Column( "Formula3",
		Formula( If( Row() == 1, 1, Lag( :Formula3 ) + (Lag( :Col1 ) != :Col1) ) )
	)
);
dt<< Run Formulas;
wait(1);
dt << sort(by(:col1), order(descending), replacetable);

 

Cheers,

Brady

 

 

 

View solution in original post

3 REPLIES 3

Re: Have difficulty to find 2nd max, 3rd max value in 1 column etc

Hi,

 

This depends on what you mean by "max", and whether you are willing to sort the table. I'll address a few common cases and hope one of them covers your use case.

 

Case 1: "max" means position in sorted dataset of original values, including duplicates.

You can use a built-in transform for this. Right-click on data in the column of interest and select New Formula Column > Distributional > Rank (Reverse Order).

 

Clicking on the formula that is created, we have the following, which works whether or not the table is sorted:

(Col Number( :Col1 ) - Col Rank( :Col1 )) + 1

brady_brady_0-1626387795830.png

 

Case 2: "Max" means position in sorted set of UNIQUE values (no duplication). Unwilling to pre-sort data.

***Note: due to its reliance on associative arrays, this approach only works with numeric data involving integers.

 

We can use the convoluted (and expensive) formula:

Loc(
	Matrix(
		Reverse(
			Associative Array( (:Col1 << get data table) :: col1 << getvalues ) << get keys
		)
	),
	:Col1
)

This formula does the following:

1) Gets the values from the column of interest and passes them to an Associative array, which removes duplicates.

2) Gets the associative array keys (in order from low to high by default) and reverses their order.

3) Uses the Loc() function to report the position of the given row within the matrix of reversed associative array keys.

 

 

Case 3: As in case 2, but we are willing to pre-sort (descending) by the column of interest.

 

In this case we can use a simpler and more efficient, albeit self-referential formula:

If( Row() == 1, 
	1,
	Lag( :Formula3 ) + (Lag( :Col1 ) != :Col1)
)

Below, we see the difference between the first formula and the next two. Again, only the 3rd formula is dependent on table sorting.

 

brady_brady_4-1626389373876.png

 

If you want to try this out, run the script below. It first appears unsorted, but after a one-second pause, it sorts by col1, at which point the results of the 3rd formula will match the results of the second.

 

dt = New Table( "Example",
	Add Rows( 20 ),
	New Column( "Col1", <<set values(J(20,1,randominteger(1,12)))) ,
	New Column( "Reverse Rank[Col1]", formula((Col Number( :Col1 ) - Col Rank( :Col1 )) + 1)),
	New Column( "Formula2",
		Formula(
			Loc(
				Matrix(
					Reverse(
						Associative Array( (:Col1 << get data table):col1  << getvalues)
						 << get keys
					)
				),
				:Col1
			)
		)
	),
	New Column( "Formula3",
		Formula( If( Row() == 1, 1, Lag( :Formula3 ) + (Lag( :Col1 ) != :Col1) ) )
	)
);
dt<< Run Formulas;
wait(1);
dt << sort(by(:col1), order(descending), replacetable);

 

Cheers,

Brady

 

 

 

Stokes
Level IV

Re: Have difficulty to find 2nd max, 3rd max value in 1 column etc

Thanks a lot.

For case 1, It seems my JMP doesn't have Rank under Distributional, it only have the first 6 options as you showed in the pictures.

Formula Column > Distributional > Rank (Reverse Order).

 

I think the case 3 seems a better solution for my current issue, appreciate the help.

Re: Have difficulty to find 2nd max, 3rd max value in 1 column etc

You might be on an older version of JMP... does it work when you manually enter the formula I showed for that scenario? This may work for you--I can't recall when the by-variable support was added to the column functions... I think it may have been JMP 14 for a lot of them.

 

Cheers,

Brady