Exploring Variable Clustering and Importance in JMP®

 

Ryan J. Parker, JMP Technical Student, SAS

@RyanParker 

 

This presentation demonstrates two techniques built into JMP Pro for gaining insight into predictive models. The first technique, variable clustering, simplifies the model-building process by grouping similar predictors into clusters. Using a data set on wine quality, we illustrate how this reduction not only benefits predictive power but also improves the interpretability of the fitted model. Next, we show how the second technique, variable importance, can be used to assess the importance of input variables in complex statistical models. By decomposing the variability of a model's response according to the distribution of its input variables, this method lets the user examine how much each input contributes to that response. Knowing which variables have the most impact on response variability is important for understanding how these inputs affect predictions. Using complex statistical models, such as neural networks, we show how JMP provides estimates of main effects and total effects (which include all possible interaction effects) for each predictor in the model. This session will be presented by Chris Gotwalt, Senior Research Statistician and Director of Research and Development for JMP.
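To make the distinction between main effects and total effects concrete, here is a minimal Python sketch of a variance-based decomposition estimated by Monte Carlo sampling. It is not JMP's implementation. It assumes independent inputs drawn uniformly from [-1, 1], and the names `effect_indices` and `toy_model` are hypothetical, chosen only for illustration. The toy model `x1 + x2 * x3` stands in for a fitted model such as a neural network.

```python
import random


def effect_indices(model, n_inputs, n_samples=20000, seed=0):
    """Estimate main and total effect indices by Monte Carlo sampling.

    Assumption: inputs are independent and uniform on [-1, 1].
    """
    rng = random.Random(seed)

    def draw():
        return [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]

    # Two independent sets of sampled input rows.
    a = [draw() for _ in range(n_samples)]
    b = [draw() for _ in range(n_samples)]
    f_a = [model(x) for x in a]
    f_b = [model(x) for x in b]

    # Overall response variance, estimated from both sets.
    all_f = f_a + f_b
    mean = sum(all_f) / len(all_f)
    var = sum((y - mean) ** 2 for y in all_f) / len(all_f)

    main, total = [], []
    for i in range(n_inputs):
        # Rows of `a` with only input i replaced by the value from `b`.
        f_ab = [model(xa[:i] + [xb[i]] + xa[i + 1:]) for xa, xb in zip(a, b)]
        # Main effect: share of variance explained by input i alone.
        cov = sum(fb * (fab - fa) for fa, fb, fab in zip(f_a, f_b, f_ab))
        main.append(cov / n_samples / var)
        # Total effect: share of variance involving input i,
        # including every interaction it takes part in.
        sq = sum((fa - fab) ** 2 for fa, fab in zip(f_a, f_ab))
        total.append(sq / (2 * n_samples) / var)
    return main, total


def toy_model(x):
    # x[0] acts additively; x[1] and x[2] act only through their interaction.
    return x[0] + x[1] * x[2]


main, total = effect_indices(toy_model, n_inputs=3)
for i, (m, t) in enumerate(zip(main, total), start=1):
    print(f"x{i}: main effect {m:.2f}, total effect {t:.2f}")
```

For this toy model the exact values under the stated assumptions are main effects of 0.75, 0, and 0 and total effects of 0.75, 0.25, and 0.25, so the estimates should land close to those. The second and third inputs have no effect on their own but matter through their interaction, which is the situation a total effect is meant to reveal.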

Published on 03-24-2025 09:04 AM by Community Manager | Updated on 03-27-2025 09:54 AM




Start: Mon, Sep 9, 2013 09:00 AM EDT
End: Thu, Sep 12, 2013 05:00 PM EDT