User5555
Level I

Can I do a second pass Elastic Net only including the most significant predictors?

First off, I am new to predictive modeling and I appreciate any advice.

 

BACKGROUND:

I am fitting a binomial elastic net with n = 54 observations and p = 89 candidate predictors. The model is meant to predict drug effects in a clinical setting, so ideally it would rely on as few measured variables as possible.

 

After Elastic Net variable selection/shrinkage, 24 predictors are included in the model. Of the 24 predictors included in the model, 10 are significant predictors and contribute 96% of the total effects in the model (link to calculation below).

 

QUESTION: Can I generate a second predictive model using only the top 10 predictors from my first elastic net? I would like to demonstrate that as few as 10 variables could be used for prediction with high AUC. If so, would the second pass be better with elastic net or another model type?

 

CALCULATIONS: The total effect I'm referencing is described here https://www.jmp.com/support/help/en/16.2/#page/jmp/assess-variable-importance.shtml?os=win&source=ap...

and briefly here: "Independent Resampled Inputs: For each factor, Monte Carlo samples are obtained by resampling its set of observed values. Use this option when you believe that your factors are uncorrelated and that their likely values are not represented by a uniform distribution."

 

Please let me know if any additional information is required; I can never tell what information is relevant! 

THANK YOU in advance!!

1 REPLY

Re: Can I do a second pass Elastic Net only including the most significant predictors?

The choice of model is yours. The fitting routines and the myriad criteria are aids, not rules. You can drag the vertical line in the solution path diagram to the left to reduce the number of predictors in the model and immediately evaluate its performance.
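
For anyone who wants to see the same trade-off outside JMP, here is a rough Python sketch of what moving along the solution path amounts to: refit the elastic net at several penalty strengths and, for each one, record how many predictors stay in the model and how well it discriminates. Everything in it is an assumption made for illustration, not JMP's implementation: the data are synthetic (54 rows and 89 predictors to mirror the post, with only the first 5 predictors truly active), the fit is a plain proximal gradient loop, the penalty grid is arbitrary, and the function names (`fit_elastic_net_logistic`, `trace_path`, and so on) are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def predict(X, b0, b):
    """Predicted probability of class 1 for each row of X."""
    return [sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, row))) for row in X]


def fit_elastic_net_logistic(X, y, lam, alpha=0.5, step=0.2, n_iter=200):
    """Proximal gradient descent on
    mean log-loss + lam * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2).
    The intercept b0 is not penalized."""
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [pi - yi for pi, yi in zip(predict(X, b0, b), y)]
        b0 -= step * sum(resid) / n
        for j in range(p):
            # Gradient of the smooth part (log-loss plus ridge term)
            grad = sum(r * row[j] for r, row in zip(resid, X)) / n
            grad += lam * (1 - alpha) * b[j]
            # Gradient step, then soft-thresholding for the L1 term
            b[j] = soft_threshold(b[j] - step * grad, step * lam * alpha)
    return b0, b


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def make_data(n, p, rng):
    """Synthetic data: only the first 5 of p predictors affect the outcome."""
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [int(rng.random() < sigmoid(2 * sum(row[:5]))) for row in X]
    return X, y


def trace_path(X_tr, y_tr, X_va, y_va, lams):
    """For each penalty, report (lambda, active predictors, validation AUC)."""
    rows = []
    for lam in lams:
        b0, b = fit_elastic_net_logistic(X_tr, y_tr, lam)
        n_active = sum(1 for bj in b if bj != 0.0)
        rows.append((lam, n_active, auc(predict(X_va, b0, b), y_va)))
    return rows


rng = random.Random(0)
X_train, y_train = make_data(54, 89, rng)    # same shape as in the post
X_valid, y_valid = make_data(200, 89, rng)   # fresh rows, never used for fitting
path = trace_path(X_train, y_train, X_valid, y_valid, [0.4, 0.2, 0.1, 0.05, 0.02])
for lam, n_active, score in path:
    print(f"lambda={lam:<5} predictors={n_active:>2}  validation AUC={score:.3f}")
```

Each printed row corresponds to one position of the vertical line: a larger penalty means fewer predictors. The AUC here is computed on rows that were not used for fitting, which is only possible because the data are simulated; with 54 real observations, some form of cross-validation would have to play that role.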