
How are the cross-validation statistics defined for the Partition platform (decision trees)?

K-fold cross-validation randomly divides the non-excluded rows of a data table D into k subsets D1, D2, …, Dk.
The subsets are disjoint (no row belongs to more than one subset) and of approximately equal size.
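
The random division can be sketched in Python as follows. This is only an illustration, not JMP's implementation: the function name `k_fold_split` and the fixed seed are assumptions made for the example. When the number of rows is not divisible by k, the subset sizes differ by at most one row.

```python
import random

def k_fold_split(row_ids, k, seed=0):
    """Randomly divide row_ids into k disjoint subsets of near-equal size."""
    rows = list(row_ids)
    if not 1 <= k <= len(rows):
        raise ValueError("k must be between 1 and the number of rows")
    # Shuffle a copy so the caller's data is left untouched.
    random.Random(seed).shuffle(rows)
    # Deal the shuffled rows round-robin; sizes differ by at most one.
    return [rows[i::k] for i in range(k)]

# Ten non-excluded rows divided into three subsets D1, D2, D3.
folds = k_fold_split(range(10), k=3)
```

Every row lands in exactly one subset, so each row is held out exactly once.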


k distinct models are trained, each on D - Di, where Di is the subset that was held out.

For continuous responses, the prediction error for each observation in Di is calculated using the model trained on D - Di.
For nominal or ordinal responses, JMP calculates the G^2 contribution of each observation in Di using the model trained on D - Di.
This is repeated for each of the k subsets.
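
For a categorical response, one common definition of the per-observation G^2 contribution is -2 ln(p), where p is the probability that the model trained on D - Di assigns to the level actually observed. The sketch below assumes that definition; the helper name `g2_contribution` and the predicted probabilities are hypothetical, and the way JMP treats zero probabilities is not covered here.

```python
import math

def g2_contribution(predicted_probs, observed_level):
    """Return -2 * ln(p) for the observed level of one held-out row."""
    p = predicted_probs[observed_level]
    if p <= 0.0:
        raise ValueError("predicted probability must be positive")
    return -2.0 * math.log(p)

# Hypothetical predictions for two held-out rows, made by a model
# trained on D - Di, paired with the level actually observed.
held_out = [
    ({"yes": 0.8, "no": 0.2}, "yes"),
    ({"yes": 0.5, "no": 0.5}, "no"),
]
fold_g2 = sum(g2_contribution(probs, level) for probs, level in held_out)
```

A row predicted with probability 1 contributes 0, and the contribution grows as the predicted probability of the observed level shrinks.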



The resulting errors are squared and summed (for nominal and ordinal responses, the G^2 values are summed).
This sum is the cross-validated SSE (the cross-validated G^2 for nominal or ordinal responses).
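
The whole procedure for a continuous response can be sketched as below. To keep the example short, the model fitted on each D - Di is a stand-in that predicts the training mean; it is not a partition tree, and the function name `cross_validated_sse` is an assumption made for the illustration.

```python
import random

def cross_validated_sse(y, k, seed=0):
    """Sum squared held-out errors over k folds (mean-only stand-in model)."""
    if not 2 <= k <= len(y):
        raise ValueError("k must be between 2 and the number of rows")
    rows = list(range(len(y)))
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]  # disjoint subsets D1..Dk
    sse = 0.0
    for held_out in folds:
        held = set(held_out)
        # Train on D - Di: every row that is not in the held-out subset.
        train = [y[r] for r in rows if r not in held]
        prediction = sum(train) / len(train)  # stand-in for a fitted tree
        # Score only the held-out rows with the model that never saw them.
        sse += sum((y[r] - prediction) ** 2 for r in held_out)
    return sse

y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cv_sse = cross_validated_sse(y, k=3)
```

Because each row is scored by a model that was trained without it, the result is typically larger than the SSE computed on the training data.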


FAQ # 2097

This post was originally written in Japanese and has been translated for your convenience. When you reply, your reply will also be translated back to Japanese.
