Solved: Sorting JMP data table and deleting rows in JMP Script

Report Inappropriate Content · Jul 10, 2014 06:18 PM

Hi:

I'm a rookie with JMP. I want to sort and format a data table based on conditions on first two columns. The first column is a string of text ("LaserID" ) and the second column is time when data is taken ("time"). My goal is to to derive another table with exact same number of columns as the original table so that there is only one row of data of a particular "LaserID" (i.e. no repeat). The elimination of multiple row(s) with same "LaserID" will be based on when the data is taken ("time").

In another word, the new table will consist of row with a unique "LaserID" with data taken at the latest time.

I am attaching the Original and Final table I want to achieve. Any help in writing a script for this task will be appreciated.

David_Burnham · Jul 11, 2014 03:19 PM

Something like this should do it:

Get Latest Row = Function({dt,ID},{Default Local},

// this can be much simpler if we assume datetime order!

//get only those rows matching the specified laser ID

lstRows = dt << Get Rows Where(:Laser ID == ID);

//get the corresponding times as a list

lstTimes = Column(dt,"Time")[lstRows];

// locate the maximum time

loc = Loc Max(lstTimes);

// return the row containing the maximum time

lstRows[loc]

);

//source data table

dt = Data Table("Initial");

// get list of unique values for Laser ID

Summarize(lstIDs = By(:Laser ID));

// create empty final table

newDt = dt << Subset(Copy Formulas(0));

newDt << Select All Rows;

newDt << Delete Rows;

newDt << Set Name("Final");

// create a row for each result

Column(newDt,"Laser ID") << Set Values(lstIDs);

//iterate through each ID to find the row with the latest date

For (i=1,i<=NItems(lstIDs),i++,

currentID = lstIDs[i];

r = Get Latest Row(dt,currentID);

show(r);

//place values from r'th row into new data table

For (c=1,c<=NCols(dt),c++,

Column(newDt,c)[i] = Column(dt,c)[r]

)

);

-Dave

View solution in original post

David_Burnham · Jul 11, 2014 03:19 PM

Something like this should do it:

Get Latest Row = Function({dt,ID},{Default Local},

// this can be much simpler if we assume datetime order!

//get only those rows matching the specified laser ID

lstRows = dt << Get Rows Where(:Laser ID == ID);

//get the corresponding times as a list

lstTimes = Column(dt,"Time")[lstRows];

// locate the maximum time

loc = Loc Max(lstTimes);

// return the row containing the maximum time

lstRows[loc]

);

//source data table

dt = Data Table("Initial");

// get list of unique values for Laser ID

Summarize(lstIDs = By(:Laser ID));

// create empty final table

newDt = dt << Subset(Copy Formulas(0));

newDt << Select All Rows;

newDt << Delete Rows;

newDt << Set Name("Final");

// create a row for each result

Column(newDt,"Laser ID") << Set Values(lstIDs);

//iterate through each ID to find the row with the latest date

For (i=1,i<=NItems(lstIDs),i++,

currentID = lstIDs[i];

r = Get Latest Row(dt,currentID);

show(r);

//place values from r'th row into new data table

For (c=1,c<=NCols(dt),c++,

Column(newDt,c)[i] = Column(dt,c)[r]

)

);

-Dave

syedhashmi · Jul 19, 2019 07:44 PM

Hi David, I am using your script to elimintae duplicates as well, but I have a question: what if my column (equivalent to "LaserID" column in the example) is numeric & continuous? somehow the script wouldnt work in my case....If i change the column properties to make it "charecter", then the script works.

thanks.

pmroz · Jul 11, 2014 04:02 PM

Here's an alternate approach that uses tabulate with max, and then update:

dt1 = Data Table( "Initial" );

tab1 = dt1 << Tabulate(

Show Control Panel( 0 ),

Add Table(

Column Table( Analysis Columns( :time ), Statistics( Max ) ),

Row Table( Grouping Columns( :LaserID ) )

), invisible

);

dt2 = tab1 << Make Into Data Table;

tab1 << close window();

dt2 << Update(

With( dt1 ),

Match Columns( :LaserID = :LaserID, :Name( "Max(time)" ) = :time ),

Add Columns from Update table(

:PRF,

:Name( "Power(W)" ),

:Name( "Pmon(W)" ),

:PRF2,

:Name( "Power2(W)" ),

:Name( "Pmon2(W)" )

)

);

Jeff_Perkinson · Oct 18, 2016 2:19 PM

You've gotten a pair of scripting responses. If you'd like to do this interactively, try these steps.

Sort the table by LaserID and Time.

Then create a new column with a formula to select the rows where the LaserID in the current row doesn't match the LaserID in the next row. Since the table is sorted by LaserID and Time you'll get the last row of each LaserID series.

The trick here is to use a column with a data type of Row State and the Selected State state function in the formula.

Choose Cols -> New Column... and set the data type to Row State.

Then add a Formula.

For the formula, start with Row State -> Selected State.

Then put the comparison inside the Selected State() function. This comparison uses the Lag() function with a -1 argument to compare with the value of LaserID in the next row.

Now you just need to copy the row states in your new column to the active row state area by clicking on the star icon in front of the column name.

Now you can use Tables->Subset to subset the selected rows.

I hope that helps!

-Jeff

Sorting JMP data table and deleting rows in JMP Script

Re: Sorting JMP data table and deleting rows in JMP Script

Re: Sorting JMP data table and deleting rows in JMP Script

Re: Sorting JMP data table and deleting rows in JMP Script

Re: Sorting JMP data table and deleting rows in JMP Script

Re: Sorting JMP data table and deleting rows in JMP Script