Marimo
Level I

leave-one-out or k-fold cross-validation methods??

We received the following reviewer comment:

To confirm the utility of the cut-off value, validation would be needed. However, in this cohort, validation analysis was not performed. As the sample size was small, it would be difficult to perform an external validation study. However, internal validation (or cross-validation) analysis might be possible with statistical methods such as leave-one-out or k-fold cross-validation.

 

I could not understand the statistical method the reviewer intends.

Can this be done in JMP?

3 REPLIES
Victor_G
Super User

Re: leave-one-out or k-fold cross-validation methods??

Hi @Marimo,

 

Welcome to the Community!

 

Here is some information about validation, when to use specific validation methods, and how to do it in JMP, from a previous post of mine: https://community.jmp.com/t5/Discussions/CROSS-VALIDATION-VALIDATION-COLUMN-METHOD/m-p/588349/highli...

 

I hope this first answer helps you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

Re: leave-one-out or k-fold cross-validation methods??

@Marimo,

Leave-one-out is a version of k-fold cross-validation in which each row in your data set is used as the validation "set" one at a time.
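If it helps to see the mechanics spelled out, here is a minimal leave-one-out sketch in Python with scikit-learn. It is purely illustrative: the 20-row toy data is made up, and this is not how JMP builds its validation column.

# Leave-one-out sketch (illustrative, not JMP):
# each of the 20 rows is held out as the validation "set" exactly once.
import numpy as np
from sklearn.model_selection import LeaveOneOut

X = np.arange(20).reshape(-1, 1)  # toy predictor, 20 rows
loo = LeaveOneOut()

for train_idx, val_idx in loo.split(X):
    # 19 rows train the model, 1 row validates it, repeated 20 times
    print(f"train rows: {len(train_idx)}, validation row index: {val_idx[0]}")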

 

As an example, if you have 20 rows in your data set, you could employ 5-fold cross-validation, which uses 20% of your data as the validation set five times. In this case that would be 4 rows for validation and 16 rows for training in each fold.

 

Each 20% is drawn using something called sampling with replacement, to try to ensure all of the data is included in one of the five folds. The other option in this example is to use 20 folds, which then uses each row of the data set as a validation set. The higher the number of folds, the lower the percentage of the data used for validation in each fold. K-fold cross-validation can be found in most modeling platforms in JMP.
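A companion sketch of the 5-fold versus 20-fold comparison, again in Python/scikit-learn as an illustration rather than JMP. Note that scikit-learn's KFold partitions the rows into disjoint folds, so each row lands in exactly one validation fold.

# 20 rows split into 5 folds vs 20 folds (illustrative, not JMP):
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(-1, 1)  # toy data, 20 rows

for k in (5, 20):
    kf = KFold(n_splits=k, shuffle=True, random_state=1)
    sizes = [len(val_idx) for _, val_idx in kf.split(X)]
    # k=5  -> each fold holds 4 validation rows (16 training rows)
    # k=20 -> each fold holds 1 validation row, i.e. leave-one-out
    print(f"{k}-fold validation set sizes: {sizes}")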

 

Another option would be to use a method like Bootstrap Forest, which has sampling with replacement built into the algorithm. You would need JMP Pro to take full advantage of this platform. Bootstrap Forest uses model averaging to find the most important variables in your data set by building many smaller models (trees) on different subsets of the data.
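For the flavor of the idea, here is a rough scikit-learn analog of a bootstrap forest: many trees, each grown on a bootstrap sample drawn with replacement, then averaged, with variable importances summarizing the predictors. This is only a sketch with made-up data and hypothetical predictor names, not the JMP Pro Bootstrap Forest platform itself.

# Bootstrap-forest-style sketch (illustrative analog, not the JMP Pro platform):
# each tree is fit on a bootstrap sample drawn with replacement,
# and the averaged forest reports which predictors mattered most.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))           # 20 rows, 3 hypothetical predictors
y = 2 * X[:, 0] + rng.normal(size=20)  # response driven mostly by the first predictor

forest = RandomForestRegressor(n_estimators=200, bootstrap=True, random_state=0)
forest.fit(X, y)

for name, importance in zip(["x1", "x2", "x3"], forest.feature_importances_):
    print(f"{name}: importance {importance:.2f}")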

 

HTH

Bill

dlehman1
Level V

Re: leave-one-out or k-fold cross-validation methods??

Question: if I have 20 rows and use K = 20 folds, is that exactly the same thing as "leave-one-out" validation? I've always thought the latter uses every observation as a validation set (rotating through the entire data set), whereas K-fold samples with replacement, so it isn't clear to me that every observation gets used for validation. Can you clarify this?