
bootstrap iterations are all identical???

dherring_wakehe

Community Trekker

Joined:

Jul 22, 2015

1 ACCEPTED SOLUTION

Accepted Solutions
michael_jmp

Staff

Joined:

Jun 23, 2011

Solution

After some investigation by JMP development, it appears that there is a bug for the case when the KFold validation sets are created and bootstrapping is done in the same Generalized Regression report. The random seed used for validation set creation is not getting properly reset for each bootstrap iteration. This bug should be fixed in time for JMP14.
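For intuition, here is a minimal Python sketch of the failure mode described above (this is not JMP code; the function and data are invented for illustration): if the random generator driving each bootstrap iteration is reset to the same state every pass, every resample, and therefore every estimate, comes out identical.

```python
import random
import statistics

def bootstrap_mean(data, n_iter, reset_seed=None):
    """Bootstrap the sample mean. If reset_seed is given, the RNG is
    (wrongly) reset to the same state before every iteration, mimicking
    a seed that never advances between bootstrap iterations."""
    rng = random.Random()
    estimates = []
    for _ in range(n_iter):
        if reset_seed is not None:
            rng.seed(reset_seed)  # bug analog: identical RNG state every pass
        resample = [rng.choice(data) for _ in data]
        estimates.append(statistics.mean(resample))
    return estimates

data = [3.1, 2.7, 4.0, 3.6, 2.9, 3.3]
buggy = bootstrap_mean(data, 5, reset_seed=42)  # every iteration identical
good = bootstrap_mean(data, 5)                  # iterations vary
```

With `reset_seed` set, all five estimates coincide, matching the symptom in this thread where bootstrap iterations 1 to n of the results table are identical.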

 

In the meantime, here is a workaround. You can fit your model in Generalized Regression using KFold cross-validation as you did before and then save the validation sets (Save Columns > Save Validation Column). Then launch Fit Model again, click Recall and use the validation column that you saved in the Validation role. Run the new model. If you bootstrap from this new report, you should bypass the bug.
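In the same illustrative Python (again, not JMP's API; names here are invented), the workaround amounts to freezing the fold assignments once, the analog of saving a validation column, so that the bootstrap resampling is the only remaining source of randomness:

```python
import random

random.seed(1)  # illustration only, to make the example reproducible
n_rows, k = 12, 3

# Step 1 (analog of Save Columns > Save Validation Column):
# assign each row to a fold once, up front, and keep that column fixed.
folds = [i % k for i in range(n_rows)]
random.shuffle(folds)

def bootstrap_iteration(data, folds):
    """One bootstrap draw: resample rows with replacement, carrying each
    row's saved fold label along instead of re-randomizing the folds."""
    idx = [random.randrange(len(data)) for _ in data]
    return [(data[i], folds[i]) for i in idx]

data = list(range(n_rows))  # stand-in for the real rows
draw1 = bootstrap_iteration(data, folds)
draw2 = bootstrap_iteration(data, folds)
```

Every draw reuses the same saved fold labels, so fold creation can no longer interact with the bootstrap's random state, which is what refitting with the saved validation column accomplishes in the new report.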

 

Hope that helps. Let us know if you have further questions.

-Michael

 

 

Michael Crotty
Sr Statistical Writer
JMP Development
9 REPLIES
michael_jmp

Staff

Joined:

Jun 23, 2011

Hello,

Please provide more information about the situation where you are seeing identical results for all bootstrap iterations:

  • What version of JMP and which platform are you using?
  • What part of the report did you try to bootstrap?
  • Did you set a random seed in the Bootstrapping options window?

Thanks,

Michael

Michael Crotty
Sr Statistical Writer
JMP Development
dherring_wakehe

Community Trekker

Joined:

Jul 22, 2015

JMP Pro 13.0.0 (64-bit)

Several different platforms, including Generalized Regression

I did not choose a seed

 

Also, I have noticed that if I launch an analysis with a stochastic feature such as k-fold cross-validation or bootstrapping, the first run looks fine, a redo of the analysis is slightly but appropriately different, and then every redo after that yields results identical to the second analysis. In both cases it looks like something is not getting reset to allow a truly random execution, whether it is bootstrapping or k-fold cross-validation.

michael_jmp

Staff

Joined:

Jun 23, 2011

I'm having trouble reproducing this in JMP Pro 13.0.0 on a 64-bit Windows machine.
Do you see this for many different data tables? Are you able to reproduce this behavior on one of the tables in JMP Sample Data?

Thanks,
Michael
Michael Crotty
Sr Statistical Writer
JMP Development
dherring_wakehe

Community Trekker

Joined:

Jul 22, 2015

Yes I can. I used the Nicardipine data set and attempted to model Sex as a function of ~10 continuous predictor variables. The issue appears to be linked to using cross-validation as the validation method. For both Adaptive Elastic Net and Adaptive Lasso, when I use CF validation and then attempt to bootstrap the parameter estimates, I get identical results for bootstrap iterations 1 to n (iteration 0 is different). I tried to model a continuous variable in the same way and found that the first few iterations were different, but eventually all the bootstrap iterations yielded identical results. It almost appears to converge on a solution and then never change. When I use AIC as the validation method, the bootstrap appears to work fine.

Jeff_Perkinson

Community Manager

Joined:

Jun 23, 2011


dherring_wakehe wrote:

Yes I can. I used the Nicardipine data set and attempted to model sex as a function of ~10 continuous predictor variables. The issue appears to be linked to using cross-validation as the validation method. For both Adaptive Elastic Net and Adaptive Lasso when I use CF validation...


I'm afraid I'm not able to reproduce it with this info.

 

  • Exactly which predictors are you using?
  • I don't understand "CF validation". Do you mean k-fold? 
  • Can you post your table of iterations?

 

 

-Jeff
dherring_wakehe

Community Trekker

Joined:

Jul 22, 2015

Run the script saved in the Nicardipine data set, then bootstrap the parameter estimates in the output table. I selected Fractional Weights (logistic regression and a small data set) and deselected "Split Selected Column" and "Discard Stacked Table if Split Works".
*Note* This is a very clumsy demonstration since only a few variables actually contribute to the model but it reproduces the same behavior I see in my own data set.
Jeff_Perkinson

Community Manager

Joined:

Jun 23, 2011

I don't see a Generalized Regression script in Nicardipine, but I think I've gotten pretty close to what you're seeing: I get very similar results in multiple (most?) bootstrap samples when running this script.

 

 

dt=open("$SAMPLE_DATA\Nicardipine.jmp");
foo=dt<<Fit Model(
	Y( :Sex ),
	Effects(
		:Datetime of First Exposure to Treatment,
		:Date of Last Exposure to Treatment,
		:Time of Last Exposure to Treatment,
		:Datetime of Last Exposure to Treatment,
		:Date of Death,
		:Completed,
		:Death,
		:Name( "Lost to Follow-Up" ),
		:Moderately Disabled,
		:Randomized,
		:Recovery,
		:Severely Disabled,
		:Vegetative Survival,
		:Responses,
		:Cases,
		:Study Day of Collection
	),
	Personality( "Generalized Regression" ),
	Set Alpha Level( 0.1 ),
	Generalized Distribution( "Binomial" ),
	Run(
		Fit( Estimation Method( Lasso( Adaptive ) ), Validation Method( KFold, 5 ) )
	)
);


(foo << Report)[Table Box( 4)] <<
Bootstrap(
	25,
	Fractional Weights( 1 ),
	Split Selected Column( 1 )
);

Setting Fractional Weights(0) instead of (1) above seems to make it behave better. 

 

We're continuing to look at it and will report back when we know more. 

 

-Jeff
dherring_wakehe

Community Trekker

Joined:

Jul 22, 2015

Thank you for the investigation and the reply! I'll use the workaround for this admittedly rather esoteric analytic sequence. - DH