Hello,
I'm a new user of JMP (JMP 13.1.0). I have a question, please: Is it possible to perform a linear discriminant analysis with cross-validation (leave-one-out method) in this version of the software?
Thank you,
Adias
It is similar to excluding rows, but you can also specify a hold-out set for testing the selected model.
You can create multiple data columns for the validation role, but you can only use one at a time. The idea is that once you decide on the size of the hold-out sets, any of them is equally useful. Also, you can use the same validation column in more than one modeling platform for a fair and valid comparison of models. Why would you need more than one validation column? How would you use multiple validation columns?
I do not know what you mean by "option to select the validation method." There is only one cross-validation method, based on hold-out sets, represented by the Validation analysis role.
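To make the idea of a single validation column concrete, here is a minimal sketch written in Python with pandas rather than in JMP; the table size, the column name, and the 60/20/20 split are made-up illustration values. Each row gets one of the labels Training, Validation, or Test, and that one column is what plays the validation role.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical data table with 20 rows.
dt = pd.DataFrame({"Row": np.arange(1, 21)})

# A single validation column: most rows train the model, one hold-out set
# validates model selection, and an optional second hold-out set tests the
# final model.
labels = np.repeat(["Training", "Validation", "Test"], [12, 4, 4])
dt["Validation"] = rng.permutation(labels)

print(dt["Validation"].value_counts())
```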
I see, thanks a lot for explaining this!
By 'validation options' I meant different methods of validating the models, like k-fold or Monte Carlo cross-validation, as explained here: https://www.statisticshowto.com/cross-validation-statistics/
As far as I understand, to run 5-fold cross-validation I'd need five validation columns. Each column would represent an 80-20 split, with a different 20% held out for testing in each of the 5 folds. Is there a way to do this in JMP?
JMP provides K-fold cross-validation in some platforms but not all of them, so in the others you would have to implement your own version of it. Here is one way to do it:
You should have new columns for 5-fold cross-validation in your data table, one per fold, like the sketch below.
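Here is one way those five columns could be generated, sketched in Python with pandas rather than in JMP (the table size and the column names Fold1 through Fold5 are assumptions for illustration): each row is assigned to exactly one of the five folds, and column k marks that fold's rows as the 20% validation hold-out and every other row as the 80% training set.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n_rows, k = 100, 5

dt = pd.DataFrame({"Row": np.arange(1, n_rows + 1)})

# Assign every row to exactly one of the 5 folds, in random order and with
# roughly equal fold sizes.
fold_of_row = rng.permutation(np.arange(n_rows) % k)

# Fold1 ... Fold5: in column i, the rows of fold i form the 20% validation
# hold-out and all other rows form the 80% training set.
for i in range(k):
    dt[f"Fold{i + 1}"] = np.where(fold_of_row == i, "Validation", "Training")

print(dt.head())
```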
The first iteration of your model fitting might then look like the sketch below, using the first fold's column as the validation column.
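Since the fitting itself happens interactively in JMP, the following is only a conceptual stand-in in Python, using scikit-learn's LinearDiscriminantAnalysis and made-up column names X1, X2, and Class: fit the model on the rows that Fold1 labels Training, then score the rows it labels Validation.

```python
import numpy as np
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical predictors X1, X2 and a class label to be discriminated.
rng = np.random.default_rng(3)
dt = pd.DataFrame({
    "X1": rng.normal(size=100),
    "X2": rng.normal(size=100),
    "Class": rng.choice(["A", "B"], size=100),
})
dt["Fold1"] = np.where(np.arange(len(dt)) % 5 == 0, "Validation", "Training")

train = dt[dt["Fold1"] == "Training"]
valid = dt[dt["Fold1"] == "Validation"]

# First iteration: fit the discriminant model on the 80% training rows of
# Fold1, then score the 20% held-out rows.
lda = LinearDiscriminantAnalysis().fit(train[["X1", "X2"]], train["Class"])
print("Fold 1 validation accuracy:", lda.score(valid[["X1", "X2"]], valid["Class"]))
```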
Use the cross-validation information reported for each of the folds.
You can then combine the five sets of results into the overall training and validation results as you see fit, for example as described in the reference you cited. A sketch of these last two steps follows below.
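Continuing the same stand-in sketch (Python with scikit-learn, made-up column names, and a simple average across folds as the combining rule, which is one common choice rather than the only one), here is how the per-fold information could be collected and combined into overall training and validation misclassification rates:

```python
import numpy as np
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical data table; replace with your own predictors and class column.
rng = np.random.default_rng(4)
dt = pd.DataFrame({
    "X1": rng.normal(size=100),
    "X2": rng.normal(size=100),
    "Class": rng.choice(["A", "B"], size=100),
})
fold_of_row = rng.permutation(np.arange(len(dt)) % 5)

train_rates, valid_rates = [], []
for i in range(5):
    is_valid = fold_of_row == i
    train, valid = dt[~is_valid], dt[is_valid]
    lda = LinearDiscriminantAnalysis().fit(train[["X1", "X2"]], train["Class"])
    # Per-fold cross-validation information: misclassification rates on the
    # training rows and on the held-out validation rows.
    train_rates.append(1 - lda.score(train[["X1", "X2"]], train["Class"]))
    valid_rates.append(1 - lda.score(valid[["X1", "X2"]], valid["Class"]))

# Combine the five folds, e.g. by averaging, into overall estimates.
print("Overall training misclassification:", np.mean(train_rates))
print("Overall validation misclassification:", np.mean(valid_rates))
```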