MathStatChem
Level VI

Technical Details of Variable Importance Calculations

I am looking for more technical details (beyond what is stated in the JMP documentation) about how Variable Importance is calculated (available under the Prediction Profiler options). 

I have read the two reference papers listed in the documentation (Sobol (2001) and Saltelli (2002)). I think I get the general idea, but what I want to know is more of the gritty details of how the Monte Carlo simulation is done (just for independent uniform inputs):

  • How is the Monte Carlo simulation set up and executed? From what I see in the papers, you generate different kinds of random input combinations based on subsets of the input variables. This is the part I am most confused about, and it is the least explained in the documentation.
  • How many Monte Carlo runs are performed? The number appears to be based on the size of the data set, but this is unclear and not described in the documentation.
  • How are the Main Effect and Total Effect metrics for variable importance calculated from the simulation results?
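
For reference when comparing approaches, the "pick-freeze" sampling scheme from the variance-based sensitivity analysis literature can be sketched in a few lines of Python. This is only a minimal sketch, assuming independent uniform inputs on [0, 1]; it is not confirmed to be what JMP does internally. The names `sobol_indices` and `toy_model`, the sample size, and the particular estimators are all assumptions chosen for illustration (the cited papers give several variants).

```python
import random


def sobol_indices(f, k, n, seed=0):
    """Estimate main and total effect indices of f on [0, 1]^k.

    Illustrative sketch only: the estimators below are one common
    choice from the literature, not a documented JMP algorithm.
    """
    rng = random.Random(seed)
    # Two independent n-by-k matrices of uniform inputs.
    A = [[rng.random() for _ in range(k)] for _ in range(n)]
    B = [[rng.random() for _ in range(k)] for _ in range(n)]
    fA = [f(x) for x in A]
    fB = [f(x) for x in B]

    # Total output variance, estimated from the pooled A and B runs.
    pooled = fA + fB
    mean = sum(pooled) / len(pooled)
    var = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)

    main, total = [], []
    for i in range(k):
        # AB_i: each row of A with column i taken from B instead.
        fABi = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Main effect numerator: estimates Var(E[Y | X_i]).
        v_i = sum(yb * (yab - ya) for ya, yb, yab in zip(fA, fB, fABi)) / n
        # Total effect numerator: estimates E[Var(Y | all inputs but X_i)].
        t_i = sum((ya - yab) ** 2 for ya, yab in zip(fA, fABi)) / (2 * n)
        main.append(v_i / var)
        total.append(t_i / var)
    return main, total


def toy_model(x):
    # Hypothetical additive test function; the third input is inert.
    return x[0] + 2.0 * x[1]
```

In this scheme the simulation costs n × (k + 2) model evaluations: one pass each for A and B, plus one pass per input for the mixed matrices. For the additive `toy_model`, the analytic main and total effects are both 0.2, 0.8, and 0, so `sobol_indices(toy_model, 3, 50000)` should return estimates close to those values.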

This is more than just "wanting to know the math". What I need to do is compare this approach with other variable importance/impact assessments, which are becoming more and more popular in pharma QbD approaches for determining process parameter criticality.

 

1 REPLY
Kevin_Anderson
Level VI

Re: Technical Details of Variable Importance Calculations

Hi, MathStatChem!

 

This might not help you very much, since I don't know much about the nuts and bolts under the hood either (and I'm excited to hear from someone who does!). However, there is a JSL utility function, Sobol Quasi Random Sequence(nDim, nRow), that is described as generating "a sequence of space-filling quasi random numbers using the Sobol sequence in up to 4000 dimensions". I have always assumed that it supports the simulation in "The Sobolizer", which is what some insiders called the Variable Importance system.
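
To illustrate what "space-filling quasi random numbers" means, here is a minimal Python sketch of a low-discrepancy point generator. Note the substitution: it uses the simpler Halton sequence as a stand-in, not the Sobol sequence, and it says nothing about how the JSL function is implemented. The function names `van_der_corput` and `halton` are hypothetical, and the (n_dim, n_row) argument order is only chosen to mirror the signature quoted above.

```python
def van_der_corput(index, base):
    """Radical inverse of a positive integer index in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_dim, n_row):
    """Return n_row points in [0, 1)^n_dim from the Halton sequence.

    Stand-in for a Sobol generator: each dimension uses the radical
    inverse in a different prime base.
    """
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    if n_dim > len(primes):
        raise ValueError("this sketch supports at most 10 dimensions")
    return [
        [van_der_corput(i, primes[d]) for d in range(n_dim)]
        for i in range(1, n_row + 1)
    ]
```

Points like these could be used in place of pseudo-random draws as the inputs of a Monte Carlo sensitivity analysis; they cover the unit cube more evenly, which usually makes the estimates converge faster.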

 

There's a lot floating around on the internet regarding sensitivity analysis that I'm sure you've already researched. Sorry to be so unhelpful.

 

Good luck!

Kevin