cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Submit your abstract to the call for content for Discovery Summit Americas by April 23. Selected abstracts will be presented at Discovery Summit, Oct. 21- 24.
Discovery is online this week, April 16 and 18. Join us for these exciting interactive sessions.
Choose Language Hide Translation Bar
amy
amy
Level II

Struggling with extracting the second largest values for each column

I have been trying to extract second largest numbers from JMP tables. So far, I have only been manually doing it using

Sort descending( Eval List( {:X1, :X2, :X3}));

How can I just pass the entire table rather than individual column names?

Capture.PNG

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: Struggling with extracting the second largest values for each column

Your comparison component in your IF() function has a logic issue.

If( (Abs( colMax )) > 300 | (Abs( colMax )) > 250

 any value that is > 250 will make the comparison true, therefore, the >300 is not needed.

The comparison only needs to be:

If( ( Abs( colMax ) > 250,

You ask why you are getting values of 193?  That is because the values being printed out are the second highest value for the column, and your filtering is based upon the maximum value in the column.  The second highest value can be of any value.  If you had the following values in a column

82
15
600
44
193

The maximum column value is 600, so it would pass your IF() clause of Max Col() > 250, and the second highest value in the colum is 193

Concerning your question about "how can I find the row number of the second highest value i.e colSecondMaxValues?", the way to do this is already in the script.  The statement 

maxRows = dt << get rows where( As Column( dt, colNamesList[i] ) == colMax );

returns a matrix of the row numbers for the rows that have the Maximum values in the column.  If you want the row numbers for the second highest value, you just need to get the rows where they equal the second max value.  So if you add this statement after the Insert Into() function, you will get the row(s) that have the second highest value

Show(
			dt << get rows where(
				As Column( dt, colNamesList[i] ) == Col Max( Column( dt, colNamesList[i] ) )
			)
		);

 Here is the complete code

dt = Open( "$SAMPLE_DATA/semiconductor capability.jmp" );

colNamesList = dt << get column names( string, continuous, numeric );
colSecondMaxValues = {};

// Loop across all continuous columns and find the Max value in
// a column, delete that value, and then find the second max value
// finally, replace the old values, and save the results
For( i = 1, i <= N Items( colNamesList ), i++,
	colMax = Col Max( Column( dt, colNamesList[i] ) );
	If( Abs( colMax ) > 250,
		maxRows = dt << get rows where( As Column( dt, colNamesList[i] ) == colMax );
		Show( maxRows );
		Column( dt, colNamesList[i] )[maxRows] = .;
		Insert Into( colSecondMaxValues, Col Max( Column( dt, colNamesList[i] ) ) );
		Show(
			dt << get rows where(
				As Column( dt, colNamesList[i] ) == Col Max( Column( dt, colNamesList[i] ) )
			)
		);
		Show( colSecondMaxValues );
		Column( dt, colNamesList[i] )[maxRows] = colMax;
	);
);
Jim

View solution in original post

5 REPLIES 5
txnelson
Super User

Re: Struggling with extracting the second largest values for each column

Here is a little script that I wrote that determines the second highest value across all continuous columns in a data table

names default to here(1);
dt=open("$SAMPLE_DATA/semiconductor capability.jmp");

colNamesList= dt<<get column names(string,continuous,numeric);
colSecondMaxValues = {};

// Loop across all continuous columns and find the Max value in
// a column, delete that value, and then find the second max value
// finally, replace the old values, and save the results
For(i=1,i<=nitems(colNamesList),i++,
	colMax=col max(column(dt,colNamesList[i]));
	maxRows = dt<<get rows where(as column(dt,colNamesList[i])==colMax);
	column(dt,colNamesList[i])[maxRows]=.;
	insert into(colSecondMaxValues, col max(column(dt,colNamesList[i])));
	column(dt,colNamesList[i])[maxRows]=colMax;
	);
	
nw = New Window( "Second Max Values",
	Outline Box( "Table of Second Max Values",
		Table Box(
			String Col Box( "Column", colNamesList ),
			Number Col Box( "Second Max Value", colSecondMaxValues )
		)
	)
);
Jim
amy
amy
Level II

Re: Struggling with extracting the second largest values for each column

@txnelson Thank you for the script. I am trying to make a greater than 300 or 250 condition work for this script and I am unable to understand why it still takes values like 193. Also, how can I find the row number of the second highest value i.e colSecondMaxValues?  Thank you very much for your help. 

 

dt=open("$SAMPLE_DATA/semiconductor capability.jmp");

colNamesList= dt<<get column names(string,continuous,numeric);
colSecondMaxValues = {};

// Loop across all continuous columns and find the Max value in
// a column, delete that value, and then find the second max value
// finally, replace the old values, and save the results
For(i=1,i<=nitems(colNamesList),i++,
	colMax=col max(column(dt,colNamesList[i]));
	If((abs(colMax)) > 300 | (abs(colMax)) > 250,
	maxRows = dt<<get rows where(as column(dt,colNamesList[i])==colMax);
	show(maxRows);
	column(dt,colNamesList[i])[maxRows]=.;
	insert into(colSecondMaxValues, col max(column(dt,colNamesList[i])));
	show(colSecondMaxValues);
	column(dt,colNamesList[i])[maxRows]=colMax;
	)
	);
txnelson
Super User

Re: Struggling with extracting the second largest values for each column

Your comparison component in your IF() function has a logic issue.

If( (Abs( colMax )) > 300 | (Abs( colMax )) > 250

 any value that is > 250 will make the comparison true, therefore, the >300 is not needed.

The comparison only needs to be:

If( ( Abs( colMax ) > 250,

You ask why you are getting values of 193?  That is because the values being printed out are the second highest value for the column, and your filtering is based upon the maximum value in the column.  The second highest value can be of any value.  If you had the following values in a column

82
15
600
44
193

The maximum column value is 600, so it would pass your IF() clause of Max Col() > 250, and the second highest value in the colum is 193

Concerning your question about "how can I find the row number of the second highest value i.e colSecondMaxValues?", the way to do this is already in the script.  The statement 

maxRows = dt << get rows where( As Column( dt, colNamesList[i] ) == colMax );

returns a matrix of the row numbers for the rows that have the Maximum values in the column.  If you want the row numbers for the second highest value, you just need to get the rows where they equal the second max value.  So if you add this statement after the Insert Into() function, you will get the row(s) that have the second highest value

Show(
			dt << get rows where(
				As Column( dt, colNamesList[i] ) == Col Max( Column( dt, colNamesList[i] ) )
			)
		);

 Here is the complete code

dt = Open( "$SAMPLE_DATA/semiconductor capability.jmp" );

colNamesList = dt << get column names( string, continuous, numeric );
colSecondMaxValues = {};

// Loop across all continuous columns and find the Max value in
// a column, delete that value, and then find the second max value
// finally, replace the old values, and save the results
For( i = 1, i <= N Items( colNamesList ), i++,
	colMax = Col Max( Column( dt, colNamesList[i] ) );
	If( Abs( colMax ) > 250,
		maxRows = dt << get rows where( As Column( dt, colNamesList[i] ) == colMax );
		Show( maxRows );
		Column( dt, colNamesList[i] )[maxRows] = .;
		Insert Into( colSecondMaxValues, Col Max( Column( dt, colNamesList[i] ) ) );
		Show(
			dt << get rows where(
				As Column( dt, colNamesList[i] ) == Col Max( Column( dt, colNamesList[i] ) )
			)
		);
		Show( colSecondMaxValues );
		Column( dt, colNamesList[i] )[maxRows] = colMax;
	);
);
Jim
ih
Super User (Alumni) ih
Super User (Alumni)

Re: Struggling with extracting the second largest values for each column

If you want the second largest value in the whole table, reguardless of which column it is in:

 

Names default to here(1);
dt = open( "$SAMPLE_DATA/semiconductor capability.jmp" );

//Find the two largest values in the entire table
Sort Descending(Shape(dt << Get all columns as matrix(),1))[{1, 2}];

//Or, to exclude character columns
Sort Descending(Shape(dt << Get as matrix(dt << get column names(numeric)),1))[{1, 2}];
amy
amy
Level II

Re: Struggling with extracting the second largest values for each column

@ih I am sorry, but I need the values from every column and not an entire table.