Solved: How to perform k-fold cross-validation on ML models (e.g. neural network and ran...

abhinavsharma91 · Jun 8, 2023 9:29 AM

I have a classification dataset on which I want to fit a neural network and a random forest model. I am using k-fold cross-validation to select the best model.

I also have separate, fixed test data on which I want to evaluate the model's performance. How can I pass that data to JMP and get a classification report?

Victor_G · Mar 12, 2023 06:36 AM

Hi @abhinavsharma91,

To run k-fold cross-validation on several machine learning models, you can use the "Model Screening" platform (under "Analyze" > "Predictive Modeling" > "Model Screening"). There you select the algorithms you would like to try (in your case Neural Networks and Random Forest) and the number of folds to use for cross-validation. As an example, the toy dataset "Diamonds Data" can be run with 5-fold cross-validation.
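To make the mechanics concrete, here is a minimal Python sketch of what k-fold model comparison does in general. It is not JMP's implementation: the two "models" are deliberately simple stand-ins (a majority-class predictor and a 1-nearest-neighbor predictor) rather than a neural network and a random forest, and the data, function names, and seeds are all made up for illustration.

```python
import random
from collections import Counter

def k_fold_indices(n_rows, k, seed=0):
    """Shuffle the row indices and deal them into k nearly equal folds."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def majority_fit(X, y):
    """Stand-in model 1: always predict the most common training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda row: label

def nearest_neighbor_fit(X, y):
    """Stand-in model 2: predict the label of the closest training row."""
    def predict(row):
        dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for x in X]
        return y[dists.index(min(dists))]
    return predict

def cv_accuracy(fit, X, y, folds):
    """Fit on k-1 folds, score on the remaining fold, average over all folds."""
    scores = []
    for fold in folds:
        train = [i for other in folds if other is not fold for i in other]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

# Made-up training data: two well-separated clusters, 30 rows per class.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
X += [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(30)]
y = ["A"] * 30 + ["B"] * 30

folds = k_fold_indices(len(X), k=5)
models = {"majority": majority_fit, "nearest_neighbor": nearest_neighbor_fit}
results = {name: cv_accuracy(fit, X, y, folds) for name, fit in models.items()}
best = max(results, key=results.get)
print(results, "best:", best)
```

Every candidate is scored on the same folds, so the comparison between models is not affected by a lucky or unlucky split.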

For your test data, there are several options:

If your data are split into two data tables (one for training, one for testing), you can run the models on your training table with cross-validation, save the formula of the best-performing model, and then copy and paste that formula column into your test table. Because the test rows never take part in training or cross-validation, there is no data leakage and the assessment of the model's performance on the test data is unbiased.
You can also keep all your data in the same data table but "hide and exclude" the rows that will be used for testing. The models then won't use your test data for training or cross-validation, but by saving the formulas you will still get predictions for the hidden and excluded rows, in the same table.
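Once predictions exist for the held-out rows, a classification report is just a per-class summary of how the predicted class compares with the actual one. The Python sketch below shows that calculation outside JMP. The `classification_report` helper and the two label lists are hypothetical; in practice the predicted values would come from the saved formula column.

```python
from collections import Counter

def classification_report(actual, predicted):
    """Per-class precision, recall, F1 and support, plus overall accuracy."""
    labels = sorted(set(actual) | set(predicted))
    pairs = Counter(zip(actual, predicted))  # confusion-matrix cell counts
    report = {}
    for label in labels:
        tp = pairs[(label, label)]
        predicted_as = sum(n for (a, p), n in pairs.items() if p == label)
        support = sum(n for (a, p), n in pairs.items() if a == label)
        precision = tp / predicted_as if predicted_as else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall,
                         "f1": f1, "support": support}
    accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
    return report, accuracy

# Made-up held-out rows: actual class vs. class predicted by the saved model.
actual = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]
predicted = ["yes", "no", "yes", "no", "no", "yes", "no", "yes"]

report, accuracy = classification_report(actual, predicted)
for label, m in report.items():
    print(f"{label}: precision={m['precision']:.2f} recall={m['recall']:.2f} "
          f"f1={m['f1']:.2f} support={m['support']}")
print(f"accuracy={accuracy:.2f}")
```

The key point is that these numbers are computed only on rows the model never saw during training or cross-validation.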

I hope this answer will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
