cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
evanchen870
Level I

How to sort columns and delete repeated rows

New to scripting in JMP and want to filter the table of tool errors. I have a set of data and want to filter by alarm date next delete the repeated rows of errors. For example, if an error occurs at 1:00 and it repeats itself 4 times within 1 hour I want to delete the repeats and only catches when the error first appears.

My script is below and I want to resort the script after I do the delete of repeated rows by alarm date (descending).

dt = open ("SR_Loop_LAM_002_Alarm_Log.csv");

dt << sort (by (:ENTITY), order (ascending), replace table (0));

for (i = nrows (dt), i>0, i--,

  if(:ENTITY[i-1]==:ENTITY & :ALARM_ID[i-1] == :ALARM_ID & date difference(:ALARM_DATE[i-1], :ALARM_DATE, "hour", "fractional") < 1,

  dt << delete rows (i))

);

dt << save ("SR_Loop_LAM_002_Alarm_Log.csv");

close (dt, no save)

1 ACCEPTED SOLUTION

Accepted Solutions
Nate_Riordan
Staff (Retired)

Re: How to sort columns and delete repeated rows

Hi Evan,

There are a number of ways you could go around it.  There is a specific challenge with your duplicate removal in that you want to remove 'duplicates' that occur within an hour of the same alarm.  This requires lag function logic, which in turn requires the data be appropriately ordered before execution.  Furthermore if you delete a row before execution of the next logic then you could end up with out of bounds errors in looping.  One workaround is to set a flag for duplicate removal, this can be done by creating a boolean column or by using some built in features.  Below I've modified your code to use row selection as that flag.  After the logic is performed to determine if the alarm is a duplicate, all rows are deleted at once.  Further speed is added by opening the table in invisible mode so that screen I/O is avoided.

/* data table made invisible to speed up execution */

dt = Open( "SR_Loop_LAM_002_Alarm_Log.csv" , invisible );


/* updated for multiple sort */

dt << Sort( By( :ENTITY, :ALARM_DATE ), Order( Ascending, Ascending ), Replace Table( 0 ) );

/* clear row and column selection first (just good coding habit) */

dt << Clear Select;

dt << Clear Column Selection;


/* select rows that meet the specified conditions

    because this is going from bottom to top the loop should stop at row 1 (there is no row 0)

*/

For( i = N Rows( dt ), i > 1, i--,

     If(

          :ENTITY[i - 1] == :ENTITY[i] &

          :ALARM_ID[i - 1] == :ALARM_ID[i] &

          Date Difference( :ALARM_DATE[i - 1], :ALARM_DATE[i], "hour", "fractional" ) < 1,

          dt << Select Rows( i )

     )

);

/* delete the selected rows */

dt << Delete Rows;


/* final sort */

dt << Sort( By( \:ALARM_DATE ), Order( Descending ), Replace Table( 0 ) );

dt << save( "SR_Loop_LAM_002_Alarm_Log_cleansed.csv" );

   

Close( dt, no save );

View solution in original post

7 REPLIES 7
evanchen870
Level I

Re: How to sort columns and delete repeated rows

current JMP error is "Invalid row number in access or evaluation of 'subscript', "ENTITY[i-1]"

ian_jmp
Level X

Re: How to sort columns and delete repeated rows

Looks like you are nearly there. Your 'for' loop terminates at the value of i=1, but you are indexing i-1=0. JMP subscripts always start from 1.

evanchen870
Level I

Re: How to sort columns and delete repeated rows

Thanks for the quick response Ian, but the script is still not doing what I want it to do.

Currently, I have columns ENTITY, ALARM_DATE, ENTITY_TYPE, ALARM_ID, and ALARM_DESCRIPTION

1. I first want my script to sort by either ENTITY, ALARM_DATE, or ENTITY_TYPE

2. then I want to use a if loop to delete repeated alarms (by ALARM_DESCRIPTION) within 1 hr of time difference of the ALARM_DATE

3. lastly I want to sort by ALARM_DATE (descending) to replace the original table

So, by your suggestions I will need to change to  (i = nrows (dt), i>1, i--)?

txnelson
Super User

Re: How to sort columns and delete repeated rows

If you use the Lag() function you will not get an error.

lat(ENTITY) will point to the save column(row) as ENTITY[i-1]

Jim
ron_horne
Super User (Alumni)

Re: How to sort columns and delete repeated rows

hi evanchen870

take a look at this discussion from the past the script is also very fast and robust for deleting since it uses "get rows where"

Re: Stop a loop

good luck!

Nate_Riordan
Staff (Retired)

Re: How to sort columns and delete repeated rows

Hi Evan,

There are a number of ways you could go around it.  There is a specific challenge with your duplicate removal in that you want to remove 'duplicates' that occur within an hour of the same alarm.  This requires lag function logic, which in turn requires the data be appropriately ordered before execution.  Furthermore if you delete a row before execution of the next logic then you could end up with out of bounds errors in looping.  One workaround is to set a flag for duplicate removal, this can be done by creating a boolean column or by using some built in features.  Below I've modified your code to use row selection as that flag.  After the logic is performed to determine if the alarm is a duplicate, all rows are deleted at once.  Further speed is added by opening the table in invisible mode so that screen I/O is avoided.

/* data table made invisible to speed up execution */

dt = Open( "SR_Loop_LAM_002_Alarm_Log.csv" , invisible );


/* updated for multiple sort */

dt << Sort( By( :ENTITY, :ALARM_DATE ), Order( Ascending, Ascending ), Replace Table( 0 ) );

/* clear row and column selection first (just good coding habit) */

dt << Clear Select;

dt << Clear Column Selection;


/* select rows that meet the specified conditions

    because this is going from bottom to top the loop should stop at row 1 (there is no row 0)

*/

For( i = N Rows( dt ), i > 1, i--,

     If(

          :ENTITY[i - 1] == :ENTITY[i] &

          :ALARM_ID[i - 1] == :ALARM_ID[i] &

          Date Difference( :ALARM_DATE[i - 1], :ALARM_DATE[i], "hour", "fractional" ) < 1,

          dt << Select Rows( i )

     )

);

/* delete the selected rows */

dt << Delete Rows;


/* final sort */

dt << Sort( By( \:ALARM_DATE ), Order( Descending ), Replace Table( 0 ) );

dt << save( "SR_Loop_LAM_002_Alarm_Log_cleansed.csv" );

   

Close( dt, no save );

evanchen870
Level I

Re: How to sort columns and delete repeated rows

Thank you all for the great response! I have edit my script and it working beautifully now.