cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
ILoveJMP
Level III

Max value of each row across multiple columns and the corresponding name of the source column

Hi, is there an easy way to find the max value of each row and the column name (or parameter) corresponding to the max value? Below is a simple example to illustrate the request:

 

T1T2T3T4T5T6T7T8T9T10Max_ValueCol_Name (Parameter)
87.283.681.48383838382.383.384.687.2T1
85.984828483.383.3848386.587.287.2T10
84.982.7868382.382.3838283.685.286T3
85.684.682.784.684.384.384.383.684.986.286.2T10
85.283.68283.3838383.382.384.385.685.6T10
86.884.382.784.383.683.684.383.385.986.286.8T1
86.28482.3848783.383.68384.985.987T5
88.184.682.384.383.68484.383.385.986.288.1T1
85.683.381.783.3838383.682.78485.985.9T10
37.436.136.436.436.136.136.435.436.737.437.4T1
36.734.534.834.533.834.134.533.536.136.136.7T1
34.832.933.232.932.232.53631.933.835.136T7
3836.436.436.436.436.136.7393838.339T8
3835.135.14034.134.534.833.836.43740T4

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Craige_Hales
Super User

Re: Max value of each row across multiple columns and the corresponding name of the source column

The data table subscripting is new in JMP 13.

You can try something like this idea to build an expression that will create a row vector matrix from the current row when evaluated. 

dt = New Table( "Untitled",
    Add Rows( 3 ),
    New Column( "first", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [1, 5, 9] ) ),
    New Column( "second", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [2, 6, 10] ) ),
    New Column( "skip", character, setvalues( {"a", "b", "c"} ) ),
    New Column( "third", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [3, 7, 11] ) ),
    New Column( "fourth", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [4, 8, 12] ) )
);

names = dt << getcolumnnames( "numeric" ); // returns a list of all numeric column names
Show( names ); // {first, second, third, fourth}
Substitute Into( names, Expr( {} ), Expr( Concat() ) );
Show( Name Expr( names ) ); // first || second || third || fourth

Row() = 2; // names will pull data from the current row
rowvec = (Eval( names )); // [5 6 7 8]

Given a list of row names, the Substitute Into() function replaces the curly bracket list operator with the concat function. Concat function is the same as the || concatenation operator and can take a lot of arguments. The concatenation evaluates each column reference in the list for the current row. The individual scalar values are concatenated into the row matrix. 

Craige

View solution in original post

21 REPLIES 21
DataMojo
Level II

Re: Max value of each row across multiple columns and the corresponding name of the source column

Looking for the exact same solution.
For each row find max value and return column name so i can run statistics on the resultant data.
Eg to find how many times a column name shows up as maximum, the number of columns that contribute to max value, the STD DEV etc.
Craige_Hales
Super User

Re: Max value of each row across multiple columns and the corresponding name of the source column

// this example uses JMP 13 data table indexing

dt = New Table( "Untitled",
    New Column( "a", Numeric, Set Values( [1, 33, 1] ) ),
    New Column( "b", Numeric, Set Values( [2, 2, 2] ) ),
    New Column( "c", Numeric, Set Values( [333, 1, 0] ) ),
    New Column( "maxval", numeric ),
    New Column( "maxcol", Numeric )
);

colnames = dt << getcolumnnames;

For Each Row(
    r = Row();
    // 1::3 means columns 1 through 3.  {a,b,c} would also work.
    location = Loc Max( dt[ r, 1 :: 3] ); // location of maximum value in array
    maxval = dt[r, location];
    maxcol = colnames[location] << getname;
);

Data table with maximum value from row and name of maximum column for rowData table with maximum value from row and name of maximum column for row

Earlier post on data table subscripting

Craige
DataMojo
Level II

Re: Max value of each row across multiple columns and the corresponding name of the source column

I have run into one issue here, if the columns I need to choose are between the data set it doesnt seem to work.

 

For Each Row(
    r = Row();
    // 1::3 means columns 1 through 3.  {a,b,c} would also work.
    location = Loc Max( dt[ r, 57 :: 113] ); // location of maximum value in array
    maxval = dt[r, location];
    maxcol = colnames[location] << getname;
);

 

I need to choose columns 57 through 113, but for some reason when I do this it through the script off and i get incorrect values. 

DataMojo
Level II

Re: Max value of each row across multiple columns and the corresponding name of the source column

Resolved i was missing the offset of columns that had to be added to the location variable.

Craige_Hales
Super User

Re: Max value of each row across multiple columns and the corresponding name of the source column

Good point. The example could be generalized like this (intentionally ugly column arrangment...) Pick one of the either/or that works best for you. The 2nd one has the advantage of isolating you from accidental rearrangements of the columns. The first one might be good for 100s of columns.

dt = New Table( "Untitled",
    New Column( "maxval", numeric ),
    New Column( "a", Numeric, Set Values( [1, 33, 1] ) ),
    New Column( "maxcol", Numeric ),
    New Column( "b", Numeric, Set Values( [2, 2, 2] ) ),
    New Column( "c", Numeric, Set Values( [333, 1, 0] ) ),
);


// either
vars = [2 4 5];
colnames = (dt << getcolumnnames)[vars];
// or
colnames={a,b,c}; // this might be best choice


For Each Row(
    row = Row();
    location = Loc Max( dt[ row, colnames] ); // location of maximum value in array
    col = colnames[location]<<getname;
    maxval = dt[row, col];
    maxcol = col;
);
Craige
ILoveJMP
Level III

Re: Max value of each row across multiple columns and the corresponding name of the source column

It still doesn't work properly for me. Attached please find both the script and the data files. Could you please let me know what I did wrong here? 

Thanks!

txnelson
Super User

Re: Max value of each row across multiple columns and the corresponding name of the source column

The script and data table worked great for me.  Now, I did change the

 

     dt=Open("........");

to

 

     dt=current data table();

Since the table was already open.

 

So, I would suggest that you try that, or that you place an additional statement after your dt=Open statement:

     wait(0);

There is a chance that your data table has not finished before the For Each Row starts.  The Wait(0); will force the open to complete before running the For Each Row

Jim
ILoveJMP
Level III

Re: Max value of each row across multiple columns and the corresponding name of the source column

Thanks much Jim for the reply! However, it still doesn't work for me with dt=current data table(). Not sure what is going on. BTW, could you send me the output file you generated?

Thanks!

txnelson
Super User

Re: Max value of each row across multiple columns and the corresponding name of the source column

Here is the data table after I ran the script

Jim