MathStatChem
Level VI

Technical Details of Variable Importance Calculations

I am looking for more technical details (beyond what is stated in the JMP documentation) about how Variable Importance is calculated (available under the Prediction Profiler options). 

I have read the two reference papers listed in the documentation (Sobol (2001) and Saltelli (2002)). I think I get the general idea, but what I want to know is the gritty details of how the Monte Carlo simulation is done (just for independent uniform inputs):

  • How is the Monte Carlo simulation set up and executed? From what I see in the papers, different kinds of random input combinations are generated based on subsets of the input variables. This is the part I am most confused about and that is least explained in the documentation.
  • How many Monte Carlo runs are performed? It appears to be based on the size of the data set, but this is unclear and not described in the documentation.
  • How are the Main Effect and Total Effect metrics for variable importance calculated from the simulation results?
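For anyone comparing approaches, here is a minimal sketch of how the scheme in the Sobol/Saltelli papers is commonly set up for independent uniform inputs. It is not JMP's implementation, which is exactly what the questions above are asking about. The function name `sobol_indices`, the sample size `n`, and the particular estimators (standard ones from the sensitivity-analysis literature) are assumptions chosen for illustration. Two independent sample matrices A and B are drawn, and for each input a hybrid matrix is built from A with that one column taken from B, so the total cost is n * (k + 2) model evaluations.

```python
import random


def sobol_indices(f, k, n, seed=0):
    """Estimate main-effect and total-effect Sobol indices of f on [0, 1]^k.

    Illustrative sketch only (NOT JMP's algorithm): plain pseudo-random
    sampling with the A / B / hybrid-matrix layout.
    """
    rng = random.Random(seed)
    # Two independent n-by-k matrices of uniform inputs.
    A = [[rng.random() for _ in range(k)] for _ in range(n)]
    B = [[rng.random() for _ in range(k)] for _ in range(n)]
    fA = [f(x) for x in A]
    fB = [f(x) for x in B]

    # Overall mean and variance of the response from the 2n base runs.
    all_y = fA + fB
    mean = sum(all_y) / (2 * n)
    var = sum((y - mean) ** 2 for y in all_y) / (2 * n - 1)

    main, total = [], []
    for i in range(k):
        # Hybrid matrix: each row of A with column i replaced from B.
        fABi = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Main effect: covariance-style estimator; f(B) and the hybrid run
        # share only input i. Centering f(B) reduces Monte Carlo noise.
        vi = sum((yb - mean) * (yab - ya)
                 for ya, yb, yab in zip(fA, fB, fABi)) / n
        # Total effect: f(A) and the hybrid run differ only in input i.
        ti = sum((ya - yab) ** 2 for ya, yab in zip(fA, fABi)) / (2 * n)
        main.append(vi / var)
        total.append(ti / var)
    return main, total


if __name__ == "__main__":
    # Hypothetical test model: x[2] has no effect on the response.
    main, total = sobol_indices(lambda x: x[0] + 2 * x[1], k=3, n=20000)
    print("Main effects: ", [round(m, 3) for m in main])
    print("Total effects:", [round(t, 3) for t in total])
```

For the additive model in the demo, the analytical indices are 0.2, 0.8, and 0 for both Main Effect and Total Effect, which gives a quick sanity check on any implementation being compared.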

This is more than just "wanting to know the math". What I need to do is compare this approach to other variable importance/impact assessments, which are becoming more and more popular in pharma QbD approaches for determining process parameter criticality.

 

1 REPLY
Kevin_Anderson
Level VI

Re: Technical Details of Variable Importance Calculations

Hi, MathStatChem!

 

This might not help you very much, since I don't know much about the nuts and bolts under the hood either (and I'm excited to hear from someone who does know!). There is, however, a JSL utility function called Sobol Quasi Random Sequence(nDim, nRow), which is described as generating "a sequence of space-filling quasi random numbers using the Sobol sequence in up to 4000 dimensions". I always assumed this supports the simulation in "The Sobolizer", which is what some insiders called the Variable Importance system.

 

There's a lot floating around on the internet regarding Sensitivity Analysis that I'm sure you've researched too. Sorry to be so unhelpful.

 

Good luck!

Kevin