Level III

## The course test question: In predictive modeling, why should you use model validation?

Dear colleagues,

I am currently taking the "Statistical Thinking for Industrial Problem Solving" course, and one of the quiz questions offered four possible answers. I am afraid that the answer accepted by the quiz interface is not correct.

The question is:

In predictive modeling, why should you use model validation?

Select one:

1. Use it to make sure that you have fit the correct model.
2. Use it to make sure that your model generalizes well to new data.
3. Use it to make sure that you can identify the most significant variable.
4. You don’t need to use model validation in predictive modeling.

I selected 1, because validation is required to check whether a model is adequate (no overfitting, for example) before introducing new test data for the final decision. However, the answer marked as right is 2. That seems doubtful to me, because 2 relates more to test data, which is the next step after validation.

Please correct me if I am wrong, and give me a line of clarification.

Thanks!

Reaching New Frontiers
Staff

## Re: The course test question: In predictive modeling, why should you use model validation?

Answer 1 is about confirmation of the model with new empirical data. You make a prediction, usually where no prior observation was made. You then observe whether the future observation is consistent with the prediction. Answer 2 is about generalization, that is to say, checking that you have not overfit the training set. Honest assessment through cross-validation is an acceptable method to check generalization.
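To illustrate the generalization check described above, here is a minimal sketch in plain Python. It is not taken from the course or from JMP: the simulated data, the k-nearest-neighbour model, and the helper names `knn_predict` and `mse` are all assumptions chosen for illustration. It compares the error on the training rows with the error on held-out validation rows.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the responses of the k training points nearest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Mean squared prediction error on `data` for a model built from `train`."""
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in data) / len(data)


# Simulated data (an assumption for this sketch): a sine curve plus noise.
rng = random.Random(0)
rows = []
for _ in range(200):
    x = rng.uniform(0, 6)
    rows.append((x, math.sin(x) + rng.gauss(0, 0.3)))

# Hold out the last 50 rows; they are never used to build the model.
train, valid = rows[:150], rows[150:]

# Print k, training error, and validation error for a few model complexities.
for k in (1, 5, 25):
    print(k, round(mse(train, train, k), 3), round(mse(train, valid, k), 3))
```

With k = 1 the model reproduces every training row exactly, so its training error is zero, yet its validation error is not. That gap is what validation is meant to reveal: the training error alone says nothing about how the model behaves on new data.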

Learn it once, use it forever!
Level III

## Re: The course test question: In predictive modeling, why should you use model validation?

Dear Mr. Bailey,

Thank you so much for your dedicated attention.

However, options 1 and 2 do not give an obvious answer to the question. As a beginner with JMP and machine learning, I would suppose that both answers fit within the concept of model validation.

Do you agree, as a specialist?

Staff

## Re: The course test question: In predictive modeling, why should you use model validation?

It is difficult to write good examination questions and answers. This question, however, is straightforward.

Cross-validation cannot assure that you have the correct model. Only new empirical observations can be used to test correctness. In the broad scheme of honest assessment, many predictive modelers split the data into three sets: training, validation, and test. Only the training data are used to fit the model. Only the validation data are used to check the generalization of the model, and only the test data are used to confirm the model is correct. So answer 2 is the correct answer.
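As a rough sketch of the three-way split described above, the following plain-Python function partitions rows into training, validation, and test sets. The function name `three_way_split`, the 60/20/20 proportions, and the fixed seed are assumptions for illustration, not JMP's implementation or a recommendation from the course.

```python
import random


def three_way_split(rows, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle rows and partition them into training, validation, and test sets."""
    if train_frac <= 0 or valid_frac <= 0 or train_frac + valid_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]                   # used to fit the model
    valid = shuffled[n_train:n_train + n_valid]  # used to check generalization
    test = shuffled[n_train + n_valid:]          # set aside until the model is final
    return train, valid, test


train, valid, test = three_way_split(range(100))
print(len(train), len(valid), len(test))
```

Each row lands in exactly one of the three sets, so the validation and test rows play no part in fitting the model.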

Level III

## Re: The course test question: In predictive modeling, why should you use model validation?

Thank you very much!

With best regards,
Michael