In this video, we show how to create a validation column in JMP Pro using the Make Validation Column utility.
We'll use the Impurity example and create a new column, Validation. We'll randomly assign 60% of the observations to the training set and the remaining 40% to the validation set.
To start, we select Make Validation Column from the Analyze menu under Predictive Modeling.
We can enter one or more stratification columns to balance the training and validation sets across the levels of those columns. This is often useful when you have categorical input or output variables.
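To make the idea of stratification concrete, here is a minimal Python sketch of a stratified split done outside of JMP. It is not JMP's algorithm. The function name `stratified_validation` and its parameters are hypothetical, chosen only for illustration. It splits each level of a categorical column separately, so every level ends up with roughly the same training fraction.

```python
import random
from collections import defaultdict

def stratified_validation(strata, train_frac=0.6, seed=123):
    """Return 0 (training) / 1 (validation) labels, split within each stratum.

    Illustrative only: this is an assumption about how a stratified split
    can be done, not a description of JMP's internal method.
    """
    rng = random.Random(seed)
    # Group row indices by the value of the stratification column.
    groups = defaultdict(list)
    for i, level in enumerate(strata):
        groups[level].append(i)

    labels = [None] * len(strata)
    for rows in groups.values():
        rng.shuffle(rows)  # random order within this stratum
        n_train = round(train_frac * len(rows))
        for pos, i in enumerate(rows):
            labels[i] = 0 if pos < n_train else 1
    return labels

# Toy stratification column (made up): 10 rows of level "A", 20 of level "B".
strata = ["A"] * 10 + ["B"] * 20
labels = stratified_validation(strata)
train_a = sum(1 for s, v in zip(strata, labels) if s == "A" and v == 0)
train_b = sum(1 for s, v in zip(strata, labels) if s == "B" and v == 0)
```

In this toy example, each level contributes 60% of its rows to the training set, which is the balance that stratification is meant to provide.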
We can also enter a grouping column, or enter a cutpoint column. Click the Help button to learn more about these options.
We’ll leave all of the fields blank, and click OK.
We enter 0.6 for Training Set and 0.4 for Validation Set. Note that we can also assign a portion of the observations to a test set, but we'll use just two partitions for this example.
The new column name will be Validation.
Validation Column Type lets you generate a validation column with fixed values, or with a formula that performs the random assignment to the training and validation sets.
We use Fixed to randomly assign data to the training and validation sets without storing a formula. With Fixed random assignment, we can enter a random seed, which is an integer that tells JMP how to start the random assignment. Setting the random seed makes the assignment repeatable, so we can re-create the same validation column again if needed.
We enter the random seed 123 for this video, and then click Go.
JMP adds a new column, Validation, to the data table. Behind the scenes, the values are stored as 0 (for training data) and 1 (for validation data).
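For readers who want to see the same idea outside of JMP, here is a minimal Python sketch of a seeded 60/40 random assignment that uses the same 0 and 1 coding. The function name `make_validation_column` is hypothetical. The sketch only illustrates why a seed makes the assignment repeatable. It does not reproduce JMP's random number generator, so the seed 123 here will not give the same rows that JMP assigns.

```python
import random

def make_validation_column(n_rows, train_frac=0.6, seed=123):
    """Return a list of 0 (training) and 1 (validation) labels.

    Illustrative only: an assumed approach, not JMP's implementation.
    """
    n_train = round(train_frac * n_rows)
    labels = [0] * n_train + [1] * (n_rows - n_train)
    # A generator seeded with the same integer always shuffles the same way.
    random.Random(seed).shuffle(labels)
    return labels

# Two calls with the same seed give the same column.
col_a = make_validation_column(100)
col_b = make_validation_column(100)

# The proportions match the requested 60/40 split.
train_share = col_a.count(0) / len(col_a)
valid_share = col_a.count(1) / len(col_a)
```

Calling the function with a different seed would generally give a different assignment with the same proportions.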
The Column Info window has a note telling us how the column was created.
As requested, 60% of the observations have been assigned to the training set and 40% of the observations have been assigned to the validation set.
When we build predictive models with this validation column, JMP Pro builds the model using the training data, and estimates the predictive performance of the model on new data using the validation data.
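The role of the two partitions can be sketched in plain Python. This is a toy illustration under stated assumptions, not how JMP Pro fits models. The data are made up, the helper names `fit_line` and `rmse` are hypothetical, and the model is a simple least squares line. The line is fit on the rows coded 0 and its error is measured on the rows coded 1.

```python
import math

def fit_line(xs, ys):
    """Least squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(xs, ys, slope, intercept):
    """Root mean squared prediction error on the given rows."""
    sq_errors = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Toy data, made up for illustration: y = 2x + 1 with no noise.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
validation = [0, 0, 1, 0, 1, 0, 0, 1, 0, 1]  # 0 = training, 1 = validation

train_x = [x for x, v in zip(xs, validation) if v == 0]
train_y = [y for y, v in zip(ys, validation) if v == 0]
valid_x = [x for x, v in zip(xs, validation) if v == 1]
valid_y = [y for y, v in zip(ys, validation) if v == 1]

# Fit on the training rows only, then score on the held-out validation rows.
slope, intercept = fit_line(train_x, train_y)
valid_rmse = rmse(valid_x, valid_y, slope, intercept)
```

Because the validation rows are not used to fit the line, the validation error gives an estimate of how the model would perform on new data.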
Note that this feature is available only in JMP Pro. For a complete listing of platforms and options available in JMP Pro, open the JMP Starter menu, and select the JMP Pro category. Platforms and features that are unique to JMP Pro are detailed under the tabs.