
need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation

learning_JSL
Level IV

Hi - I am trying to fit a regression equation using a random 80% of data and, using the resulting modeled equation, run the remaining 20% of data through the model and save the observed versus modeled results of those 20%.  That is, the 20% of rows are used for validation. 

 

My dataset has 227 rows.  I want to train (i.e. develop a regression equation) using 80% of those after which that regression equation would be run on the remaining 20% of rows to compute observed vs modeled for those 20% of rows. 

 

My script is not working properly - it excludes 20% of rows correctly but then runs all 100% of rows and saves those.   I show the code below but have removed the portion that computes various metrics about observed vs modeled results.  

 

Any help would be most appreciated!   (I'm using JMP Pro 17.0.)

 

 

dt = Open("C:\Users\trcampbell\Desktop\MASTER ECOLI\2025\2024.xlsx");
 
Random_rows = dt << Select Randomly(0.2) << Hide and Exclude();
 
New Column("Excluded", Numeric, ordinal);
 
:Excluded << Set Formula(If(Excluded(), 1, 0));
dt << Sort(By(:Excluded), Replace Table, Order(Descending), Copy formula(0));
 
/////////////......LOG ECOLI VS LOG TURB....LINEAR FIT....../////////////////////
 
obj = Fit Model(
	Y(:LOGECOLI),
	Effects(:LOGTURB),
	Personality("Standard Least Squares"),
	Emphasis("Effect Leverage"),
	Run(:LOGECOLI)
);
    
obj << Prediction Formula;
     
ref = obj << 
 
dt << save(
	"C:\Users\trcampbell\Desktop\MASTER ECOLI\2025\validation results\test xxxxx1 with 126 and 886.xls"
);
8 REPLIES
jthi
Super User


Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation

As you have JMP Pro I would consider using Validation Column
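
A minimal sketch of the validation-column idea, not a definitive implementation. It assumes the table is already open and contains the :LOGECOLI and :LOGTURB columns from the original script. The column name "Validation" and the 0/1 coding are illustrative choices, and the random assignment gives an approximately (not exactly) 80/20 split.

```jsl
// Sketch only: assumes the data table with :LOGECOLI and :LOGTURB is the current table.
dt = Current Data Table();

// Hypothetical column name "Validation": 0 = training row, 1 = validation row.
// Set Each Value stores fixed values, so the split does not change later.
dt << New Column( "Validation",
	Numeric,
	"Nominal",
	Set Each Value( If( Random Uniform() < 0.8, 0, 1 ) )
);

// The Validation role requires JMP Pro; the model is fit on the training rows only.
obj = dt << Fit Model(
	Y( :LOGECOLI ),
	Effects( :LOGTURB ),
	Validation( :Validation ),
	Personality( "Standard Least Squares" ),
	Emphasis( "Effect Leverage" ),
	Run()
);
```

With this approach no rows need to be hidden or excluded, and the report should summarize fit statistics for the training and validation rows separately.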


-Jarmo
learning_JSL
Level IV


Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation

Thank you jthi!


Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation

You are using JMP Pro. Are you aware of the built-in support for model selection using training, validation, and test hold-out sets?

txnelson
Super User


Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation

I ran your JSL on JMP 17, using the Semiconductor Capability sample data table.  The script selected 291 rows of the 1455 total number of rows.  The 291 selected rows were hidden and excluded.

It then ran the Fit Model, and the results of the fit were based only on the non-excluded rows.


The saved prediction formula, however, calculates predicted values for all rows.  It does not distinguish between excluded and non-excluded rows.

This may be the source of the issue you are seeing. 
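
One way to check this behavior is sketched below. It assumes the original script has already run, and that the saved column is named "Pred Formula LOGECOLI" (the usual JMP naming pattern for a saved prediction formula, but an assumption here; adjust it to match the actual column name).

```jsl
// Sketch only: run after the Fit Model and Prediction Formula steps.
dt = Current Data Table();

// Row numbers of the excluded (hold-out) rows, returned as a matrix.
heldOut = dt << Get Excluded Rows;
Show( N Rows( dt ), N Rows( heldOut ) );

// Assumed column name; the formula column holds a value for every row,
// including the rows that were excluded from the fit.
pred = Column( dt, "Pred Formula LOGECOLI" ) << Get Values;

// Predictions restricted to the hold-out rows only.
Show( pred[heldOut] );
```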

Jim
learning_JSL
Level IV


Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation

Thank you, Jim.  Is there a way to revise the script to save only the 20% (validation set) of rows?

txnelson
Super User


Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation

There are different ways you could do this.

  1. Instead of using Select Randomly, you could use Subset and have it create an 80% random sample.  Then run your Fit Model and save the subsetted table.
  2. Alternatively, after running your script, subset the table based on your Excluded column and then save the resulting subset table.
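
A minimal sketch of the second option, under stated assumptions: the original script has already run, so the table has the :Excluded indicator column (1 = held out) and the saved prediction formula. The variable name dtVal and the output path are hypothetical.

```jsl
// Sketch only: assumes the fitted table is the current data table.
dt = Current Data Table();

// Select the held-out rows, then copy them (with all columns) into a new table.
dt << Select Where( :Excluded == 1 );
dtVal = dt << Subset( Selected Rows( 1 ), Selected columns only( 0 ) );

// Hypothetical output location; replace with a real path.
dtVal << Save( "$DOCUMENTS/validation rows.jmp" );
```

Because the prediction formula was estimated before the subset is taken, the predicted values in the new table still come from the model fit to the 80% training rows.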
Jim
learning_JSL
Level IV


Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation

Thank you.  I added the code below just before the line that saves the file.  I assume this is correct (that is, it still uses the formula that was obtained from the 80% of training rows).

 

dt<< Delete Rows(dt << get rows Where( :Excluded == 0 ));

 

dt<< Save("C:\\Users\\trcampbell\\Desktop\\MASTER ECOLI\\2025\\validation results\\excluded rows.xlsx");

learning_JSL
Level IV


Re: need jsl script help - trying to remove random rows for model validation testing and use the remainder of rows for model creation

Yes, thanks very much.
