Adias
Level II

LDA cross-validation

Hello,

I'm a new user of JMP (JMP 13.1.0). I have a question, please: is it possible to perform a linear discriminant analysis with cross-validation (leave-one-out method) in this version of the software?

Thank you,

Adias 

13 REPLIES

Re: LDA cross-validation

It is similar to excluding rows, but you can also specify a hold out set for testing the selected model.

 

You can create multiple data columns for the validation role, but you can only use one at a time. The idea is that once you decide on the size of the hold out sets, any of them are equally useful. Also, you can use the same validation column in more than one modeling platform for a fair and valid comparison of models. Why would you need more than one validation column? How would you use multiple validation columns?

 

I do not know what you mean by "option to select the validation method." The Validation analysis role supports only one cross-validation method: hold out sets.

 

 

mkachl01
Level II

Re: LDA cross-validation

I see, thanks a lot for explaining this!

 

By 'validation options' I meant different methods of validating the models like k-fold or Monte Carlo cross-validation explained here: https://www.statisticshowto.com/cross-validation-statistics/

 

As far as I understand, to run 5-fold cross-validation, I'd need to have five validation columns. Each column would represent the 80-20 split, but where the 20% dedicated for the test in each of the 5 folds is different every time. Is there a way I can do this in JMP?

Re: LDA cross-validation

JMP provides K-fold cross-validation in some platforms but not all of them. Where it is not built in, you would have to implement your own version of it. Here is one way to do it:

 

  1. Create a new data column called Fold that uses the Nominal modeling type and with a formula of Random Integer( 1, 5 ). This column will identify the five folds for you.
  2. Create a series of five more data columns called Validation i, also using the Nominal modeling type, and each with a formula of Fold == i, where i changes from 1 to 5 as you go from the first to the last of these new columns.
  3. Launch the Discriminant platform five times using the succession of Validation i columns in the Validation analysis role.
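For readers who want to see the logic of steps 1 and 2 outside of JMP, here is a minimal sketch in Python. It is an illustration only, not JMP code: the function name `make_validation_columns` and the seed are hypothetical, and `random.randint` stands in for the Random Integer( 1, 5 ) formula.

```python
import random

def make_validation_columns(n_rows, k=5, seed=0):
    """Assign each row to one of k folds, then build k indicator columns."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    # Step 1: the Fold column, one random integer from 1 to k per row.
    fold = [rng.randint(1, k) for _ in range(n_rows)]
    # Step 2: Validation i is 1 where Fold == i (held out), otherwise 0.
    validation = {i: [int(f == i) for f in fold] for i in range(1, k + 1)}
    return fold, validation

fold, validation = make_validation_columns(n_rows=20)
for i in sorted(validation):
    print(f"Validation {i}: {sum(validation[i])} held-out rows")
```

Each row is held out in exactly one of the five columns. Because the fold labels are drawn independently, the folds will generally not be the same size; if equal sizes matter, shuffling a balanced list of labels is one alternative.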

You should have new columns for 5-fold cross-validation in your data table like this.

 

Capture 1.JPG

 

This first iteration of your model fitting might look like this:

 

Capture 2.JPG

 

Use the cross-validation information for each of the folds, such as:

 

Capture 3.JPG

 

You can then combine the five sets of results into the overall training and validation results as you see fit, for example as described in your cited reference.
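As one possible way to combine the folds, the sketch below pools the validation misclassification counts from the five runs and also averages the per-fold rates. It is written in Python for illustration only; the function name `combine_folds` is hypothetical and the counts are made-up placeholders, not results from this thread. You would substitute the counts reported by each Discriminant run.

```python
def combine_folds(fold_results):
    """fold_results: one (n_validation_rows, n_misclassified) pair per fold."""
    total_rows = sum(n for n, _ in fold_results)
    total_errors = sum(e for _, e in fold_results)
    # Pooled rate: every validation row counts equally.
    pooled_rate = total_errors / total_rows
    # Mean of per-fold rates: every fold counts equally.
    mean_rate = sum(e / n for n, e in fold_results) / len(fold_results)
    return pooled_rate, mean_rate

# Hypothetical counts for five folds (placeholders, not real results).
results = [(22, 3), (18, 2), (20, 1), (19, 4), (21, 2)]
pooled_rate, mean_rate = combine_folds(results)
print(f"Pooled validation misclassification rate: {pooled_rate:.3f}")
print(f"Mean of per-fold rates: {mean_rate:.3f}")
```

The two summaries differ slightly when the folds are unequal in size, which is typical when the folds come from a random integer formula.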

mkachl01
Level II

Re: LDA cross-validation

That's exactly what I was looking for, thank you so much for this!