Share your ideas for the JMP Scripting Unsession at Discovery Summit by September 17th. We hope to see you there!
Choose Language Hide Translation Bar
Highlighted
joshua
Level II

How to get max values accross multiple columns with missing data in their rows!

Hi,

 

Lets say we have data set that we want to get max values across the selected columns 

 

dt = New Table( "Untitled",
	New Column( "a@b;2", Numeric, Set Values( [12, 33, 1] ) ),
	New Column( "b", Numeric, Set Values( [2, 11, 2] ) ),
	New Column( "c@d;3", Numeric, Set Values( [333, 1, 0] ) )
);

colnames = dt << getcolumnnames;

use_cols = {};

include = {"@"};

For( j = 1, j <= N Items( include ), j++,

	For( i = 1, i <= Nitems( colnames  ), i++,
	If(Contains(colnames[i], include[j]),  
	insert into (use_cols , colnames[i]))
		
));


dt << new column("max_col",numeric,formula(Maximum(use_cols)))

joshua_0-1593736661870.png

I don't understand why it returns missing value in max_col ? Should we loop through the use_cols ?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
txnelson
Super User

Re: How to get max values accross multiple columns with missing data in their rows!

Running on a Windows PC, running Windows 10, I went back several versions of JMP and ran your script, and it ran correctly on all versions.

version.PNG

Your second qualification, that sometimes you have missing values in your data should not be an issue.  The Maximum() function handles missing values.

Not knowing more about your script, I will point out a possible issue.  If you look at the formula that is created for the column max_col you will see:

version2.PNG

The formula relies on the list "use_cols", so every time the formula is run, or rerun, it go to the list and uses the value "at that point in time" of "use_cols" and places it into the formula.  If the value of "use_cols" is "{Name( "a@b;2" ), Name( "c@d;3" )}", everything will work fine.  If for some reason, the list "use_cols" has changed it's value, the formula may not work and you will get missing values.  What you really want is for the formula to be 

Maximum( {:Name( "a@b;2" ), :Name( "c@d;3" )} )

The easiest way to do this, in my experience, is to specify the new column and formula as:

Eval(Parse("dt << new column(\!"max_col\!",numeric,formula( Maximum(" || char(use_cols) || ")));"));

It results in the formula being specified as:

version3.PNG

Jim

View solution in original post

5 REPLIES 5
Highlighted
joshua
Level II

Re: How to get max values accross multiple columns with missing data in their rows!

Forgot to add sometimes columns may contain missing values as well. I could not figured out how to add them when I built the data table.
Highlighted
Thierry_S
Level VI

Re: How to get max values accross multiple columns with missing data in their rows!

Hi,

You just need to help JMP figure out that the variable "use_cols" is referring to columns by adding "as columns" to your last statement:

dt << New Column( "max_col", numeric, formula( Maximum( as columns( use_cols ) ) ) );

 

Thierry R. Sornasse
Highlighted
txnelson
Super User

Re: How to get max values accross multiple columns with missing data in their rows!

Running on a Windows PC, running Windows 10, I went back several versions of JMP and ran your script, and it ran correctly on all versions.

version.PNG

Your second qualification, that sometimes you have missing values in your data should not be an issue.  The Maximum() function handles missing values.

Not knowing more about your script, I will point out a possible issue.  If you look at the formula that is created for the column max_col you will see:

version2.PNG

The formula relies on the list "use_cols", so every time the formula is run, or rerun, it go to the list and uses the value "at that point in time" of "use_cols" and places it into the formula.  If the value of "use_cols" is "{Name( "a@b;2" ), Name( "c@d;3" )}", everything will work fine.  If for some reason, the list "use_cols" has changed it's value, the formula may not work and you will get missing values.  What you really want is for the formula to be 

Maximum( {:Name( "a@b;2" ), :Name( "c@d;3" )} )

The easiest way to do this, in my experience, is to specify the new column and formula as:

Eval(Parse("dt << new column(\!"max_col\!",numeric,formula( Maximum(" || char(use_cols) || ")));"));

It results in the formula being specified as:

version3.PNG

Jim

View solution in original post

Highlighted
joshua
Level II

Re: How to get max values accross multiple columns with missing data in their rows!

Thanks txnelson for your great solution! You always helpful to go detail and explain things very clear!
Highlighted
joshua
Level II

Re: How to get max values accross multiple columns with missing data in their rows!

Sorry! this did not work out!