sreekumarp
Level I

CROSS VALIDATION - VALIDATION COLUMN METHOD

When using the validation column method for cross validation, we split the data set into training, validation, and test sets. The user specifies the split ratio. Is there any guideline or reference for choosing this ratio (such as 60:20:20, 70:15:15, 50:25:25, or 80:10:10)? Should the choice also depend on the total number of observations, N?
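For readers unfamiliar with the idea, here is a minimal Python sketch of what a validation column amounts to: one label per row, assigned at random in the requested proportions. It is not JMP's implementation; the function name `make_validation_column`, its parameters, and the label strings are hypothetical and chosen only for illustration.

```python
import math
import random


def make_validation_column(n_rows, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return a list of n_rows labels split as training/validation/test.

    Hypothetical helper for illustration only, not a JMP API.
    fractions = (training, validation, test) and must sum to 1.
    """
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    if not math.isclose(sum(fractions), 1.0):
        raise ValueError("fractions must sum to 1")

    # Round the validation and test counts; training takes the remainder
    # so the three counts always add up to n_rows.
    n_validation = round(n_rows * fractions[1])
    n_test = round(n_rows * fractions[2])
    n_training = n_rows - n_validation - n_test

    labels = (
        ["Training"] * n_training
        + ["Validation"] * n_validation
        + ["Test"] * n_test
    )
    # Shuffle with a fixed seed so the assignment is random but repeatable.
    random.Random(seed).shuffle(labels)
    return labels


column = make_validation_column(100, fractions=(0.6, 0.2, 0.2))
```

Changing `fractions` to, say, `(0.7, 0.15, 0.15)` gives one of the other ratios listed above; the question is which ratio suits a given N.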

10 REPLIES

Re: CROSS VALIDATION - VALIDATION COLUMN METHOD

To your original question: no, there are no specific rules about how much data to hold out. In the JMP Education analytics courses, we advise holding out as much data as you are comfortable with, and at least 20%. If you feel the training set is too small to hold back that many rows, consider k-fold cross validation instead. Ask yourself how many rows you are willing to sacrifice to validation, then use k = n / (that many rows), where n is the total number of rows. If that formula gives k < 5, consider leave-one-out cross validation.
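The rule of thumb above can be written as a small Python sketch. It is only an illustration of the arithmetic in this reply, not a JMP feature; the function name `suggest_cv_scheme` and its return format are hypothetical, and rounding k down to a whole number of folds is an assumption.

```python
def suggest_cv_scheme(n_rows, rows_to_hold_out):
    """Apply the rule of thumb: k = n / rows held out; if k < 5, use leave-one-out.

    Hypothetical helper for illustration only.
    Returns a (scheme, number_of_folds) tuple.
    """
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    if not 0 < rows_to_hold_out <= n_rows:
        raise ValueError("rows_to_hold_out must be between 1 and n_rows")

    # Assumption: round k down so each fold holds out at least the
    # number of rows the analyst was willing to give up.
    k = n_rows // rows_to_hold_out

    if k < 5:
        # Leave-one-out is k-fold with one row per fold, so n_rows folds.
        return ("leave-one-out", n_rows)
    return ("k-fold", k)


print(suggest_cv_scheme(60, 10))  # 60 / 10 = 6 folds
print(suggest_cv_scheme(40, 10))  # 40 / 10 = 4, below 5
```

With 60 rows and 10 rows to spare, the rule gives 6-fold cross validation; with 40 rows and the same 10, k would be 4, so the rule points to leave-one-out instead.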