JMP on Tuesday Session #12 Partition Tutorial

Thank you for your questions during the session. We're sharing the answers here so that everyone can improve their JMP skills together.

Recorded video of the session (if you cannot load the video in Chrome, kindly use Microsoft Edge)


Q1: How are R^2, G^2 and LogWorth related? Can you explain their relationship?

  • R^2 is a measure of the amount of variation explained by splitting the data into groups
  • G^2 is an indication of the variation in the data set (it is used for categorical responses)
  • LogWorth is a transformation of the p-value: LogWorth = -log10(p-value), e.g. -log10(0.01) = 2
  • In general, the higher the R^2, the smaller the p-value and the larger the LogWorth
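
As a small illustration of the LogWorth transformation above, here is a minimal Python sketch. The function name `logworth` is hypothetical and only for this example; JMP computes the p-value and the LogWorth for you in the Partition report.

```python
import math

def logworth(p_value: float) -> float:
    """Return LogWorth = -log10(p-value) (hypothetical helper, not a JMP function)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)

# Smaller p-values give larger LogWorth values.
for p in (0.05, 0.01, 0.0001):
    print(f"p-value = {p:<7} LogWorth = {logworth(p):.2f}")
```

A p-value of 0.01 corresponds to a LogWorth of 2, which is why a LogWorth above 2 is often read as a significant split.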


Q2: What is the maximum number of splits possible for categorical and continuous data?

  • There is no definite answer. The statistical goal of the partition process is to create groups that have the smallest possible G^2

Q3: Can we stop splitting before G^2 reaches zero?

  • Yes, of course. You have full control over the splitting

Q4: How do we use K-fold cross-validation in Partition?

  • From the “Partition” hot spot (red triangle), choose “K Fold Crossvalidation”
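
To show the general idea behind K-fold cross-validation, here is a minimal Python sketch. It is not JMP's internal implementation; the function name `k_fold_indices`, the row count and the seed are assumptions for illustration only.

```python
import random

def k_fold_indices(n_rows, k, seed=0):
    """Split row indices 0..n_rows-1 into k disjoint folds of near-equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)   # shuffle so folds are random
    return [indices[i::k] for i in range(k)]

folds = k_fold_indices(n_rows=10, k=5)

# Each fold is held out once for validation; the other folds are used for training.
for i, held_out in enumerate(folds):
    training = [row for j, fold in enumerate(folds) if j != i for row in fold]
    print(f"fold {i}: validate on rows {sorted(held_out)}, train on {len(training)} rows")
```

Every row is used for validation exactly once, so the model quality is assessed on all of the data without a separate validation column.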


Q5: Under Column Contributions, do we have a threshold to determine whether the factors are significant (something similar to a p-value)?

  • I understand what you are looking for, but as far as I know there is no significance indicator or threshold for this in Partition. In general, focusing on the top few columns is sufficient


Q6: Hi, new JMP user here. Do you have a recommended data size limit for using JMP? Can it handle medium and big data?

  • Technically there is no limit. JMP uses the memory that your computer allocates to it
  • We usually recommend a 64-bit installation (rather than 32-bit) and JMP Pro when dealing with big data


Q7: How is the validation column created?

  • From the main menu, choose “Analyze” -> “Predictive Modeling” -> “Make Validation Column”


Q8: Does validation signify the confirmation of poisonous and edible?

  • Validation measures the quality of your model. Your predictive model will be more robust with validation


Q9: Is G^2 the same as variation (sigma^2)?

  • G^2 is a fit statistic used for categorical responses (instead of the sum of squares that is used for continuous responses). Lower values indicate a better fit
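
To make the G^2 statistic more concrete, here is a minimal Python sketch. It assumes that G^2 for a node is the likelihood-ratio chi-square, i.e. twice the natural-log entropy, G^2 = 2 * sum(n_k * ln(N / n_k)), which is our reading of the JMP documentation. The function name `g_squared` and all counts are hypothetical.

```python
import math

def g_squared(counts):
    """G^2 of one node, assuming G^2 = 2 * sum(n_k * ln(N / n_k)) over response levels."""
    total = sum(counts)
    return 2.0 * sum(n * math.log(total / n) for n in counts if n > 0)

# Hypothetical counts of a two-level categorical response.
parent = [50, 50]                  # before the split
left, right = [45, 5], [5, 45]     # the two groups after the split

g2_parent = g_squared(parent)
g2_children = g_squared(left) + g_squared(right)
print(f"G^2 before split = {g2_parent:.2f}")
print(f"G^2 after split  = {g2_children:.2f}")
print(f"reduction        = {g2_parent - g2_children:.2f}")
```

Under this assumption, a node that contains only one response level has G^2 = 0, which is why a good split moves the total G^2 of the groups toward zero.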



