How to establish testing portion of data for multivariate DA?

ajgarnello

Community Trekker

Joined:

Feb 10, 2015

Hello,

I have been using JMP 11 for a little over a year now and haven't figured out how to partition my data into training/validation sets for a multivariate discriminant analysis (i.e., train my model on a specific 2/3 of the rows, then validate on the remaining 1/3). I also haven't been able to find help in other online resources. Can anyone help me locate this feature?

5 REPLIES
louv

Staff

Joined:

Jun 23, 2011

Have you tried making a new column in your data set and initializing it by choosing Random followed by Random Indicator? There you can specify the proportion that you want for your split.
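For readers who want to see the idea outside of JMP, here is a minimal sketch in Python (not JSL) of a random indicator column with a 2/3 training, 1/3 validation split. The function name `random_indicator`, the labels, the row count, and the seed are all hypothetical, and this is not a description of how JMP's Random Indicator works internally; it only illustrates assigning each row to one of two sets at random in a fixed proportion.

```python
import random

def random_indicator(n_rows, validation_fraction=1/3, seed=None):
    """Return one label per row, with a fixed share marked "Validation".

    Hypothetical helper for illustration only: it fixes the number of
    validation rows and then shuffles the labels, so the split proportion
    is exact rather than approximate.
    """
    rng = random.Random(seed)
    n_validation = round(n_rows * validation_fraction)
    labels = ["Validation"] * n_validation + ["Training"] * (n_rows - n_validation)
    rng.shuffle(labels)  # random assignment of labels to row positions
    return labels

# Example: 150 rows (an assumed table size), one third held out for validation.
labels = random_indicator(150, seed=2015)
n_training = labels.count("Training")
n_validation = labels.count("Validation")
print(n_training, n_validation)
```

Fixing the seed makes the split reproducible, which is useful if you want to rerun the analysis on the same training and validation rows.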

8041_Screen Shot 2015-02-10 at 4.41.59 PM.png

8042_Screen Shot 2015-02-10 at 4.41.16 PM.png

ajgarnello

Community Trekker

Joined:

Feb 10, 2015

Hello LouV,

Thank you for the response, though it seems I haven't made my issue clear:

I already have my data partitioned into a training set and a validation set, but I don't know the steps required to build the DA model with the training data and then apply it to my validation set.

louv

Staff

Joined:

Jun 23, 2011

Sorry for my misunderstanding.

Perhaps this blog post submitted by Jeff Perkinson might help:

http://blogs.sas.com/content/jmp/2010/07/06/train-validate-and-test-for-data-mining-in-jmp

And this from the JMP help:

Validation

ajgarnello

Community Trekker

Joined:

Feb 10, 2015

Thank you, that helped me understand!

julian

Staff

Joined:

Jun 25, 2014

Hi ajgarnello,

It sounds like you have found the answer, but in case not, I wanted to point out that discriminant analysis in JMP uses excluded rows as the validation set. So, once you've made your column to identify rows for training and validation, select all the validation rows and exclude them (you can select them all with Rows > Data Filter, or right-click one validation cell and use "Select Matching," then Rows > Exclude). Now, when you run DA, JMP will automatically fit for both the training and validation sets, and give you classification statistics for your validation set (the excluded rows); see the screenshot below.

Here's a link to the help page for validation in DA: Validation in Discriminant Analysis
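To make the "excluded rows act as the validation set" idea concrete, here is a rough sketch in Python (not JSL, and not JMP's implementation). It assumes two covariates, equal prior probabilities, and a pooled within-group covariance matrix (a plain linear discriminant rule). The data values and the names `fit_lda`, `predict`, and `misclassification_rate` are made up for illustration. The model is fit only on rows that are not excluded, then scored on both sets.

```python
from collections import defaultdict

# Toy table (made-up values): (x1, x2, group, excluded).
# Rows with excluded=True play the role of the validation set.
table = [
    (1.0, 1.2, "A", False), (1.5, 0.8, "A", False),
    (0.8, 1.9, "A", False), (2.0, 1.5, "A", False),
    (5.0, 5.1, "B", False), (5.5, 4.6, "B", False),
    (4.7, 5.8, "B", False), (6.0, 5.2, "B", False),
    (1.2, 1.0, "A", True), (1.8, 1.7, "A", True),
    (5.2, 5.0, "B", True), (4.9, 5.5, "B", True),
]

def fit_lda(rows):
    """Estimate group means and the inverse pooled covariance (2 covariates)."""
    groups = defaultdict(list)
    for x1, x2, group in rows:
        groups[group].append((x1, x2))
    means = {g: (sum(p[0] for p in pts) / len(pts),
                 sum(p[1] for p in pts) / len(pts))
             for g, pts in groups.items()}
    sxx = sxy = syy = 0.0
    for g, pts in groups.items():
        mx, my = means[g]
        for x, y in pts:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(rows) - len(groups)  # pooled degrees of freedom
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy ** 2
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return means, inverse

def predict(model, x1, x2):
    """Assign the group whose mean is closest in Mahalanobis distance."""
    means, inv = model
    def distance(g):
        dx, dy = x1 - means[g][0], x2 - means[g][1]
        return (dx * (inv[0][0] * dx + inv[0][1] * dy)
                + dy * (inv[1][0] * dx + inv[1][1] * dy))
    return min(means, key=distance)

def misclassification_rate(model, rows):
    wrong = sum(1 for x1, x2, group in rows if predict(model, x1, x2) != group)
    return wrong / len(rows)

training = [(x1, x2, g) for x1, x2, g, excluded in table if not excluded]
validation = [(x1, x2, g) for x1, x2, g, excluded in table if excluded]

model = fit_lda(training)  # the excluded rows never influence the fit
training_rate = misclassification_rate(model, training)
validation_rate = misclassification_rate(model, validation)
print(training_rate, validation_rate)
```

The key design point is that the validation rows are only ever passed to `misclassification_rate`, never to `fit_lda`, so the validation error is an honest check on rows the model has not seen.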

I hope this helps!

Julian

8044_Screen Shot 2015-02-12 at 2.23.31 PM.png