dale_lehman
Level VII

Monte Carlo simulation for an analysis (stepwise regression) platform

I have an analysis which uses several columns that have random elements in them.  I know how to manually update the random numbers in the columns, and I understand how to conduct a simulation for the output of an analysis.  But in this case, I want to run a stepwise regression each time the random numbers in the columns are updated, and I want to do this at least 1000 times.  I don't know how to automate this process: generate new random numbers in the formula columns, rerun the stepwise regression platform, run the chosen model, and collect the results.

7 REPLIES
dale_lehman
Level VII

Re: Monte Carlo simulation for an analysis (stepwise regression) platform

Let me be more specific in case any of the generous scripters out there can help.  I've attached a smaller version of what I am trying to do.  The attached file has 10 columns (all with names starting with "Column 2") that each involve a random normal distribution, and a first column, Y, whose formula uses 5 columns labeled parameter 1 through 5 plus one random normal term.  I wish to conduct the following steps:

1.  Run the stepwise regression stored in the file.  I don't need to make or run the model, I only need to execute the Go button that runs the stepwise procedure after Run from the Fit Model platform.

2.  Record how often each of the 10 columns that begin with Column 2 is selected by the stepwise procedure. (I don't know how to script collecting that particular information.)

3.  Rerun the formula for the 5 parameter columns so that they contain a new set of simulated values.  I know this can be done with the Rerun formula command found at the red arrow at the top left of the table (but I don't know how to script that).

4.  Repeat Steps 1 and 2. (I believe this can be done via a simple loop)

5.  Do this 1000 times and show the distribution of the number of times each of the 10 Column 2 columns was selected in the stepwise procedure.

 

I believe all this can be scripted but it is far beyond anything I am able to do.  If anyone can be of assistance, I'd appreciate that.  I am willing to coauthor a paper with someone as I have an outlet in mind for this.

dale_lehman
Level VII

Re: Monte Carlo simulation for an analysis (stepwise regression) platform

I have mostly solved my problem - but still have a question.  Attached is a small script that executes the stepwise regression as I want.  But to generate 1000 samples, I needed to create a "sample" column and use that as a "By" variable in the analysis.  This works, but is a cumbersome way to do simulation, since I need to make the file very long - e.g., for 1000 samples of 1000 observations each, I need 1,000,000 rows.  Does anyone know a more efficient way to conduct a simulation?  As far as I can tell, the "simulate" option that accompanies many JMP platforms can only simulate certain analysis results but not the analysis itself.

peng_liu
Staff

Re: Monte Carlo simulation for an analysis (stepwise regression) platform

What you have laid out in the previous post is a good plan. I would do that, too. I will try to give some recommendations.

  1. If you are new to scripting in JMP, a starting point is the Scripting Guide. Some important pieces, in this specific task, might be:
  2. Besides that, you need to be familiar with the Scripting Index under the Help menu. There are 3 groups of scripting objects: Functions, Objects, and Display Boxes. There is a combo box at the top left that lets you narrow down to a specific group.
  3. For this specific task, you may need to check out the Objects group. All platforms, including the data table, are listed here. You may want to check out the "Fit Stepwise" item under the "Fit Model" object. Click "Fit Stepwise", and look to the right for a "Finish" message among "Item Messages". That might be the key missing piece for you now.
  4. For this specific task, you also need to learn how to manipulate data tables, concatenation in particular. Go to the Objects list, click "Data Table", and look to the right for "Concatenate" among "Item Messages".
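
As a rough illustration of the two messages mentioned above, here is a minimal sketch. The demo table and its column names (X1, X2, Y) and the two small "run" tables are hypothetical and only there to make the snippet self-contained; it assumes the Stepwise object returned by Fit Model accepts the "Finish" message as listed in the Scripting Index.

```jsl
Names Default To Here( 1 );

// Hypothetical demo table; the column names X1, X2, Y are illustrative only.
dt = New Table( "Stepwise demo",
	Add Rows( 200 ),
	New Column( "X1", Numeric, Continuous, Formula( Random Normal() ) ),
	New Column( "X2", Numeric, Continuous, Formula( Random Normal() ) ),
	New Column( "Y", Numeric, Continuous, Formula( 2 * :X1 + Random Normal() ) )
);
dt << Run Formulas;	// make sure formula values are evaluated before fitting

// Launch Fit Model with the Stepwise personality and keep a handle to the platform.
sw = dt << Fit Model(
	Y( :Y ),
	Effects( :X1, :X2 ),
	Personality( "Stepwise" ),
	Run
);
sw << Finish;	// the "Finish" message: run the selection to completion

// Concatenate: stack two small (made-up) result tables into a new table.
r1 = New Table( "run 1", New Column( "Term", Character, Values( {"X1"} ) ) );
r2 = New Table( "run 2", New Column( "Term", Character, Values( {"X1", "X2"} ) ) );
all = r1 << Concatenate( r2, Output Table( "all runs" ) );
```

Sending a message to the platform object avoids depending on the position of a button in the report window, which can shift between JMP versions.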

Besides the above, here are other helpful tips that I learned from my colleagues.

  1. Post smaller, specific, questions to the discussion board, whenever possible. It will be more likely to generate help.
  2. Prefer screenshots to attachments. For JSL code, you can insert it directly in the post so readers can see it. Look for the <JSL> button in the toolbar when you edit a post. This also makes it more likely that you will get help; e.g., a reader on a mobile device is unlikely to open an attachment.

[Image: peng_liu_0-1644765195539.png]

 

frank_wang
Level IV

Re: Monte Carlo simulation for an analysis (stepwise regression) platform

Hi dale_lehman

My idea is to get the result from the 'Stepwise' platform on each run and save it in a folder, then use 'Multiple File Import' to combine all the results into one table.

Finally, remove rows based on the p-value and plot the distribution as you want.  The detailed JSL is below:

Clear Globals();
Clear Log();
Names Default To Here( 1 );

//Define path in and out
Path_in = Pick File( "Select file" );

If( Directory Exists( "$Desktop\Fit_Stepwise" ),
	,
	Create Directory( "$Desktop\Fit_Stepwise" )
);

Path_out = "$Desktop\Fit_Stepwise\";

//Run Fit model and stepwise after open datatable
dt = Open( Path_in );

For( i = 1, i <= 1000, i++,
	ow = dt << Fit Model(
		Y( :Y ),
		Effects(
			:Column 2 001,
			:Column 2 002,
			:Column 2 003,
			:Column 2 004,
			:Column 2 005,
			:Column 2 006,
			:Column 2 007,
			:Column 2 008,
			:Column 2 009,
			:Column 2 010
		),
		Personality( "Stepwise" ),
		Run,
		SendToReport( Dispatch( {}, "Step History", OutlineBox, {Close( 1 )} ) )
	);

	dt_result = Report( ow )[Outline Box( "Stepwise Fit for Y" )][Outline Box( "Current Estimates" )][Table Box( 1 )] << Make Into Data Table;
	dt_result << save( Path_out || Char( i ) || ".csv", "csv" );
	ow << close window;
	dt_result << Close Window;
);

dt << Close Window;

dt_result = Multiple File Import(
	<<Set Folder( Path_out ),
	<<Set Name Filter( "*.*;" ),
	<<Set Name Enable( 0 ),
	<<Set Size Filter( {589, 589} ),
	<<Set Size Enable( 0 ),
	<<Set Date Filter( {3727681987.773, 3727682566.676} ),
	<<Set Add File Name Column( 1 ),
	<<Set Import Mode( "CSVData" ),
	<<Set Charset( "Best Guess" ),
	<<Set Stack Mode( "Stack Similar" ),
	<<Set CSV Has Headers( 1 ),
	<<Set CSV Allow Numeric( 1 ),
	<<Set CSV First Header Line( 1 ),
	<<Set CSV Number Of Header Lines( 1 ),
	<<Set CSV First Data Line( 2 ),
	<<Set CSV EOF Comma( 1 ),
	<<Set CSV EOF Tab( 0 ),
	<<Set CSV EOF Space( 0 ),
	<<Set CSV EOF Spaces( 0 ),
	<<Set CSV EOF Other( "" ),
	<<Set CSV EOL CRLF( 1 ),
	<<Set CSV EOL CR( 1 ),
	<<Set CSV EOL LF( 1 ),
	<<Set CSV EOL Semicolon( 0 ),
	<<Set CSV EOL Other( "" ),
	<<Set CSV Quote( "\!"" ),
	<<Set CSV Escape( "" ),
	<<Set Import Callback( Empty() )
) << Import Data;

dt_result << Row Selection( Select where( :Name( "\!"Prob>F\!"" ) > 0.05 ) ) << Delete Rows;
dt_result << Distribution( Stack( 1 ), Nominal Distribution( Column( :Parameter ), Horizontal Layout( 1 ), Vertical( 0 ) ) );

 

 

A heart as calm as still water
dale_lehman
Level VII

Re: Monte Carlo simulation for an analysis (stepwise regression) platform

frank

Thanks, this is almost perfect.  However, I have random functions in my columns and I want them to update (choose new random draws) each time the loop is run.  The script you wrote appears to use the same random numbers for each run.  Is there a way to have the formulas updated each time?

dale_lehman
Level VII

Re: Monte Carlo simulation for an analysis (stepwise regression) platform

I found the missing instruction, but can't get it to work.  The command dt << Rerun Formulas; will generate a new set of random numbers in my formulas.  However, when I insert it in the script (inside the For loop but before Fit Model), it does not seem to change the random numbers.  My modified script is:

Clear Globals();
Clear Log();
Names Default To Here( 1 );

//Define path in and out
Path_in = Pick File( "Select file" );

If( Directory Exists( "$Desktop\Fit_Stepwise" ),
	,
	Create Directory( "$Desktop\Fit_Stepwise" )
);

Path_out = "$Desktop\Fit_Stepwise\";

//Run Fit model and stepwise after open datatable
dt = Open( Path_in );

For( i = 1, i <= 1000, i++,
	dt << Rerun Formulas;
	ow = dt << Fit Model(
		Y( :Y ),
		Effects(
			:Column 2 001,
			:Column 2 002,
			:Column 2 003,
			:Column 2 004,
			:Column 2 005,
			:Column 2 006,
			:Column 2 007,
			:Column 2 008,
			:Column 2 009,
			:Column 2 010
		),
		Personality( "Stepwise" ),
		Run,
		SendToReport( Dispatch( {}, "Step History", OutlineBox, {Close( 1 )} ) )
	);

	dt_result = Report( ow )[Outline Box( "Stepwise Fit for Y" )][Outline Box( "Current Estimates" )][Table Box( 1 )] << Make Into Data Table;
	dt_result << save( Path_out || Char( i ) || ".csv", "csv" );
	ow << close window;
	dt_result << Close Window;
);

dt << Close Window;

dt_result = Multiple File Import(
	<<Set Folder( Path_out ),
	<<Set Name Filter( "*.*;" ),
	<<Set Name Enable( 0 ),
	<<Set Size Filter( {589, 589} ),
	<<Set Size Enable( 0 ),
	<<Set Date Filter( {3727681987.773, 3727682566.676} ),
	<<Set Add File Name Column( 1 ),
	<<Set Import Mode( "CSVData" ),
	<<Set Charset( "Best Guess" ),
	<<Set Stack Mode( "Stack Similar" ),
	<<Set CSV Has Headers( 1 ),
	<<Set CSV Allow Numeric( 1 ),
	<<Set CSV First Header Line( 1 ),
	<<Set CSV Number Of Header Lines( 1 ),
	<<Set CSV First Data Line( 2 ),
	<<Set CSV EOF Comma( 1 ),
	<<Set CSV EOF Tab( 0 ),
	<<Set CSV EOF Space( 0 ),
	<<Set CSV EOF Spaces( 0 ),
	<<Set CSV EOF Other( "" ),
	<<Set CSV EOL CRLF( 1 ),
	<<Set CSV EOL CR( 1 ),
	<<Set CSV EOL LF( 1 ),
	<<Set CSV EOL Semicolon( 0 ),
	<<Set CSV EOL Other( "" ),
	<<Set CSV Quote( "\!"" ),
	<<Set CSV Escape( "" ),
	<<Set Import Callback( Empty() )
) << Import Data;

dt_result << Row Selection( Select where( :Name( "\!"Prob>F\!"" ) > 0.05 ) ) << Delete Rows;
dt_result << Distribution( Stack( 1 ), Nominal Distribution( Column( :Parameter ), Horizontal Layout( 1 ), Vertical( 0 ) ) );

 

Can someone help with my placement of the Rerun Formulas command or tell me why it doesn't work?
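
One way to narrow this down is to check, outside the loop, whether the column values actually change after Rerun Formulas. The sketch below uses a hypothetical one-column table (the name X is made up). It also sends Run Formulas after Rerun Formulas; that is an assumption about a possible cause (formula evaluation still pending when the next step runs), not a confirmed diagnosis of the script above.

```jsl
Names Default To Here( 1 );

// Hypothetical one-column table with a random formula.
dt = New Table( "Rerun demo",
	Add Rows( 100 ),
	New Column( "X", Numeric, Continuous, Formula( Random Normal() ) )
);
dt << Run Formulas;				// finish the initial evaluation
before = dt:X << Get Values;	// snapshot of the current draws

dt << Rerun Formulas;			// request a new set of random draws
dt << Run Formulas;				// assumption: flush any pending evaluation before reading or fitting
after = dt:X << Get Values;

// 1 if at least one value differs between the two snapshots, 0 otherwise.
changed = Sum( Abs( before - after ) ) > 0;
Show( changed );
```

If the values do change here but not inside the loop, the placement of the command is probably fine and the problem lies in how or when the results are read.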

ian_jmp
Level X

Re: Monte Carlo simulation for an analysis (stepwise regression) platform

(Looks like there is another, related thread that I didn't spot until after posting this.)

 

Anyway, please find an alternative approach (using the data table above) that may (or may not . . . ) be quicker:

NamesDefaultToHere(1);

dt = DataTable("Test Data.jmp");

n = 100;						// Number of simulations
resMat = [];					// Results matrix
for(i=1, i<=n, i++,
	fm = dt << Fit Model(
					Y( :Y ),
					Effects(
						:Column 2 001, :Column 2 002, :Column 2 003, :Column 2 004, :Column 2 005,
						:Column 2 006, :Column 2 007, :Column 2 008, :Column 2 009, :Column 2 010
					),
					Personality( "Stepwise" ),
					Run,
					Invisible
				);
	fmRep = Report(fm);
	fmRep[ButtonBox(7)] << click;													// Press the 'Go' button
	Wait(0);																		// Needed to let Stepwise do its thing before proceeding
	selectedItems = fmRep[CheckBoxBox(2)] << Get Selected Indices As Matrix;		// Retrieve which parameters were selected this time
	fmRep << closeWindow;
	resmat = VConcat(resmat, Transpose(selectedItems));								// Append current results row-wise to those we have already
	dt << reRunFormulas;															// Get a new random sample
);

dt2 = AsTable(resmat);																// Make a results table
dt2 << setName("Parameters selected by 'Stepwise'");
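
For the final step of the original question (how often each of the 10 columns is selected), one possible way to tally per-run selections is to convert each run's selected indices into a 0/1 indicator row and sum down the columns. This is only a sketch: the list named runs holds made-up selections for three runs and stands in for whatever the simulation loop actually returns.

```jsl
Names Default To Here( 1 );

nTerms = 10;	// ten candidate columns, as in the thread

// Made-up selections for three runs; each matrix lists the indices of the chosen terms.
runs = {[1, 3, 4], [1, 4], [1, 2, 3, 4, 9]};

indicators = [];	// one 0/1 row per run
For( i = 1, i <= N Items( runs ), i++,
	row = J( 1, nTerms, 0 );	// start with nothing selected
	idx = runs[i];
	row[idx] = 1;				// mark the terms selected in this run
	indicators = V Concat( indicators, row );
);

counts = V Sum( indicators );	// 1 x nTerms row vector: times each term was selected
Show( counts );
```

Because every indicator row has the same width, this works even when the number of selected terms differs from run to run.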