need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation
Hi, I am trying to fit a regression equation to a random 80% of my data, then run the remaining 20% through the fitted equation and save the observed versus modeled results for those 20%. That is, the 20% of rows are used for validation.
My dataset has 227 rows. I want to train (i.e., develop a regression equation) on 80% of them, then apply that equation to the remaining 20% of rows to compute observed vs. modeled values for those rows.
My script is not working properly: it excludes 20% of the rows correctly, but then it computes predictions for all 100% of the rows and saves them all. The code is below; I have removed the portion that computes various metrics on the observed vs. modeled results.
Any help would be most appreciated! (I'm using JMP Pro 17.0.)
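// Open the source workbook as a JMP data table
dt = Open("C:\Users\trcampbell\Desktop\MASTER ECOLI\2025\2024.xlsx");
// Randomly select 20% of the rows, then hide and exclude them (validation set)
Random_rows = dt << Select Randomly(0.2) << Hide and Exclude();
// Flag the excluded rows: 1 = excluded (validation), 0 = used for fitting
New Column("Excluded", Numeric, ordinal);
:Excluded << Set Formula(If(Excluded(), 1, 0));
// Sort so that the excluded rows come first
dt << Sort(By(:Excluded), Replace Table, Order(Descending), Copy formula(0));
/////////////......LOG ECOLI VS LOG TURB....LINEAR FIT....../////////////////////
obj = Fit Model(
	Y(:LOGECOLI),
	Effects(:LOGTURB),
	Personality("Standard Least Squares"),
	Emphasis("Effect Leverage"),
	Run(:LOGECOLI)
);
// Save the prediction formula as a new column in the table
obj << Prediction Formula;
// (The code that computed the observed vs. modeled metrics was removed here.)
// Save the table; note that this writes all rows, not only the excluded 20%
dt << save(
	"C:\Users\trcampbell\Desktop\MASTER ECOLI\2025\validation results\test xxxxx1 with 126 and 886.xls"
);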
Accepted Solutions
Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation
There are different ways you could do this.
- Instead of using Select Randomly, you could use Subset and have it create an 80% random sample. Then run Fit Model on that subset and save the subsetted table.
- Alternatively, after running your script, subset the table based on your Excluded column and save the resulting subset.
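The second option could look roughly like the sketch below. It assumes that `dt`, the `Excluded` flag column, and the saved prediction column from the original script already exist. The output table name and the save location are placeholders, not part of the original script.

```jsl
// Sketch only: assumes dt already has the Excluded column (1 = held-out row)
// and a prediction formula column saved from Fit Model.
dt << Clear Select;
dt << Select Where(:Excluded == 1);

// Copy only the selected (validation) rows, with all columns, into a new table.
// Copy formula(0) stores the predicted values as plain numbers.
dtVal = dt << Subset(
	Selected Rows(1),
	Selected Columns(0),
	Copy formula(0),
	Output Table("Validation rows")  // hypothetical table name
);

// Placeholder location: replace with your own path.
dtVal << Save("$DOCUMENTS/validation rows.jmp");
```

Because the subset is a separate table, the original table keeps all rows and its row states, so the fit can be rerun or inspected afterwards.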
Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation
As you have JMP Pro, I would consider using a Validation column.
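A minimal sketch of that idea, under some assumptions: the column name `Validation` and the 0/1 coding (0 = training, 1 = validation) are illustrative choices, and the `Validation()` role in Fit Model is a JMP Pro feature. The column names `LOGECOLI` and `LOGTURB` come from the original script.

```jsl
// Sketch only: builds an exact 80/20 hold-out column by shuffling row numbers.
dt = Current Data Table();
n = N Rows(dt);
nVal = Round(0.2 * n);  // number of rows to hold out (about 20%)

// Shuffle the row numbers and mark the first nVal of them as validation.
shuffled = Random Shuffle(1 :: n);
flags = J(n, 1, 0);  // 0 = training
For(i = 1, i <= nVal, i++,
	flags[shuffled[i]] = 1  // 1 = validation
);

dt << New Column("Validation", Numeric, Nominal, Set Values(flags));

// JMP Pro: with a Validation role, the model is fit on the training rows only.
obj = dt << Fit Model(
	Y(:LOGECOLI),
	Effects(:LOGTURB),
	Validation(:Validation),
	Personality("Standard Least Squares"),
	Run
);
```

With this approach no rows need to be hidden or excluded, and the hold-out assignment stays in the table as an ordinary column that can be used for filtering later.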
Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation
Thank you jthi!
Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation
You are using JMP Pro. Are you aware of the built-in support for model selection using training, validation, and test hold-out sets?
Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation
I ran your JSL on JMP 17, using the Semiconductor Capability sample data table. The script selected 291 rows of the 1455 total number of rows. The 291 selected rows were hidden and excluded.
It then ran Fit Model, and the fit was based only on the non-excluded rows.
The saved prediction formula, however, calculates predicted values for all rows. It does not distinguish between excluded and non-excluded rows.
This may be the source of the issue you are seeing.
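To compare observed and predicted values for the held-out rows only, something like the sketch below might work. It assumes the 20% validation rows are still excluded in `dt` and that the saved column is named `Pred Formula LOGECOLI`; check the actual column name in your table. The RMSE is just one illustrative metric and is not taken from the original script.

```jsl
// Sketch only: restricts the comparison to the excluded (validation) rows.
valRows = dt << Get Excluded Rows;

// Pull observed and predicted values for those rows as matrices.
obs = Column(dt, "LOGECOLI")[valRows];
pred = Column(dt, "Pred Formula LOGECOLI")[valRows];  // assumed column name

resid = obs - pred;
rmse = Sqrt(Mean(resid :* resid));  // :* is elementwise multiplication
Show(N Rows(valRows), rmse);
```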
Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation
Thank you, Jim. Is there a way to revise the script to save only the 20% (validation set) of rows?
Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation
Thank you. I added the code below just before the line that saves the file. I assume this is correct (that is, the predictions still come from the formula that was fitted on the 80% of training rows).
// Delete the training rows so that only the validation rows (Excluded == 1) remain
dt << Delete Rows(dt << Get Rows Where(:Excluded == 0));
dt << Save("C:\\Users\\trcampbell\\Desktop\\MASTER ECOLI\\2025\\validation results\\excluded rows.xlsx");
Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation
Yes, thanks very much.