Exploring Variable Clustering and Importance in JMP®
Ryan J. Parker, JMP Technical Student, SAS
This presentation demonstrates two techniques for gaining insight into predictive models built in JMP Pro. The first technique, variable clustering, simplifies the model-building process by grouping similar predictors into clusters. Using a data set on wine quality, we illustrate how this reduction not only benefits predictive power but also improves the interpretability of the fitted model. Next, we show how the second technique, variable importance, can be used to assess the importance of input variables in complex statistical models. By decomposing the response variability of a statistical model over the distribution of possible input variables, this method lets the user examine how much each input contributes to the model's response. Knowing which variables have the most impact on response variability is important for understanding how these inputs affect predictions. Using complex statistical models, such as neural networks, we show how JMP provides estimates of main effects and total effects (which include all possible interaction effects) for each predictor in the model. This session will be presented by Chris Gotwalt, Senior Research Statistician and Director of Research and Development for JMP.
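To make the idea of main effects and total effects concrete, the following is a minimal Python sketch of a variance-based decomposition estimated by Monte Carlo sampling. It is not JMP's implementation: the function name `variance_importance`, the toy model, and the choice of independent uniform inputs are all assumptions made for illustration, and the estimators used are the standard Saltelli (main effect) and Jansen (total effect) formulas.

```python
import numpy as np

def variance_importance(model, sample_inputs, n=20000, seed=0):
    """Estimate main-effect and total-effect indices for each input.

    Hypothetical helper for illustration only (not a JMP API).
    model:         maps an (n, d) array of inputs to n predictions
    sample_inputs: draws an (n, d) array of independent inputs
    """
    rng = np.random.default_rng(seed)
    A = sample_inputs(rng, n)          # first independent sample
    B = sample_inputs(rng, n)          # second independent sample
    fA, fB = model(A), model(B)
    var = np.var(np.concatenate([fA, fB]))  # total response variance

    d = A.shape[1]
    main = np.empty(d)
    total = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]            # A with only column i taken from B
        fABi = model(ABi)
        # Main effect: share of variance explained by input i alone
        main[i] = np.mean(fB * (fABi - fA)) / var
        # Total effect: input i plus every interaction involving it
        total[i] = 0.5 * np.mean((fA - fABi) ** 2) / var
    return main, total

# Toy model (assumed): x0 acts additively, x1 and x2 only through an interaction
def toy_model(X):
    return X[:, 0] + X[:, 1] * X[:, 2]

def uniform_inputs(rng, n):
    return rng.uniform(-1.0, 1.0, size=(n, 3))

main_effects, total_effects = variance_importance(toy_model, uniform_inputs)
print("main effects: ", np.round(main_effects, 2))
print("total effects:", np.round(total_effects, 2))
```

For this toy model the analytic main effects are 0.75, 0, and 0, and the analytic total effects are 0.75, 0.25, and 0.25, so the Monte Carlo estimates should land close to those values. The gap between the total and main effect for the second and third inputs shows why both quantities are worth reporting: an input can matter only through interactions and still contribute to response variability.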