I have a data table made up of rows of individual measurements taken in series. I trim the table before running a table split on it, but some redundant measurements need to be removed first, and the code I currently use for that is extremely slow and probably inefficient.
Background Information:
- Each row is the result of an individual automated measurement on a piece of material, performed in series. The resulting report has the following columns:
  - A unique part ID assigned to each individual piece of material [ PartID ]
  - The Lot ID associated with the raw material used to create the individual piece of material [ LotID ]
  - The actual measurement value for the measurement performed [ Value ]
  - The measurement result assigned based on the measurement value (Pass/Fail/Rework) [ Result ]
  - The measurement timestamp [ Date_Created ]
My equipment produces dozens of rows of measurements per piece of material and processes thousands of pieces of material per report. Those reports are then fed into JMP as my original data table. Each report also contains multiple lots, and PartID restarts at 1 within each lot, so the data table contains several different pieces with "PartID = 1". I therefore need both PartID and LotID to refer to a given piece of material. When the result is "Rework", the measurement equipment stops any further measurements on that piece of material, re-orients it, and returns it to the start of the line, where all previous tests are measured again. The piece keeps the same part ID throughout.
Desired Outcome:
I want to use a script to go through the data table and, for each reworked piece of material, eliminate all measurements up to (and including) the rework result, because all the measurements are repeated after rework and rework can affect the measurement values. I am currently using a JSL script that loops through each row and does the following:
- Identify whether the result is "Rework"; if so, do the following:
  - Store the PartID for the given piece of material as "xPartID"
  - Store the LotID for the given piece of material as "xLot"
  - Store the Date_Created for the given measurement as "xDate"
  - Select and delete all rows where (PartID = xPartID) and (LotID = xLot) and (Date_Created <= xDate)
- If the result is not "Rework", bail out of the loop. The table is sorted so that all rework rows come first, so there is no need to continue.
The script slows down drastically and ends up crashing JMP. If I run it step by step it seems to work, but if left to run on its own it eventually dies. Here's the actual code:
// Recoding the two potential "Rework" results into a shared value, "aRework".
// The leading "a" puts these rows at the start of the list when sorting ascending, to hopefully make them easier and faster to process.
dt << Recode Column(
    dt:result,
    {Map Value(
        _rcOrig,
        {"FailOnHighRework", "aRework",
        "FailOnLowRework", "aRework"},
        Unmatched( _rcNow )
    )},
    Update Properties( 1 ),
    Target Column( :result )
);
// Sorting the table by the result column after the recode, so all of the rework rows are at the start of the table
SortColumn = "result";
dt << Sort( By( :result ), Replace Table, Order( Ascending ), Suppress formula evaluation( 1 ) );
// Now run the row-by-row script
dt << Begin Data Update;
For Each Row(
    If( :result == "aRework",
        xRow = Row();
        xPartID = :PartID[Row()];
        xLot = :lot[Row()];
        xDate = :date_created[Row()];
        dt << Select Where( :partid == xPartID & :lot == xLot & :date_created <= xDate );
        dt << Delete Rows();
    );
    // If we run out of reworks, bail out - it's over, we're done here
    If( :result != "aRework",
        Break();
    );
);
dt << End Data Update;
//------------------
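For comparison, here is a rough sketch of the same deletion rule written without a row loop. It has not been run against my data and rests on a few assumptions: dt already points at the table, :result has already been recoded to "aRework" as above, :date_created is a numeric date-time column, and the lot column is named :lot as in my script. "LastRework" is a hypothetical helper column name that does not exist in my table. The idea is to compute the latest rework timestamp per PartID/lot pair in one pass, then delete every row at or before that timestamp in a single call.

```jsl
// Hypothetical helper column: latest rework timestamp for each (PartID, lot) pair.
// It stays missing for pieces of material that were never reworked.
dt << New Column( "LastRework", Numeric, "Continuous",
    Formula(
        Col Maximum( If( :result == "aRework", :date_created, . ), :PartID, :lot )
    )
);
dt << Run Formulas;

// Freeze the values so the formula is not re-evaluated while rows are deleted
Column( dt, "LastRework" ) << Delete Formula;

// Collect every row measured at or before the last rework of its own piece
rowsToDelete = dt << Get Rows Where(
    !Is Missing( :LastRework ) & :date_created <= :LastRework
);

// Delete all matching rows in one call instead of once per rework row
If( N Rows( rowsToDelete ) > 0,
    dt << Delete Rows( rowsToDelete )
);

// Remove the helper column again
dt << Delete Columns( "LastRework" );
```

If a piece of material is reworked more than once, this keeps only the measurements taken after its last rework, which is also what the loop ends up with once it has processed every rework row. The recode and sort steps would not be needed for the deletion itself in this version, apart from having a single "aRework" value to test against.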
I am on JMP 16.2.0 (570548).