
How to establish testing portion of data for multivariate DA?

ajgarnello

Community Trekker

Joined:

Feb 10, 2015

Hello,

I have been using JMP 11 for a little over a year now and haven't figured out how to partition my data into training/validation sets for a multivariate discriminant analysis (i.e., train my model using a specific 2/3, then validate using the final 1/3). I also haven't been able to find help via other online resources. Can anyone help me locate this feature?

7 REPLIES
louv

Staff

Joined:

Jun 23, 2011

Have you tried making a new column in your data set, initializing it, and choosing Random followed by Random Indicator? There you can specify the proportion that you want for your split.

8041_Screen Shot 2015-02-10 at 4.41.59 PM.png

8042_Screen Shot 2015-02-10 at 4.41.16 PM.png
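
For readers who want to see the idea behind a random indicator column outside JMP, here is a minimal Python sketch. It is only an illustration under assumptions: the function name `random_indicator`, the seed, and the row count of 150 are hypothetical, and JMP's own Random Indicator initialization may behave differently (for example, in whether it fixes exact counts per group).

```python
import random


def random_indicator(n_rows, validation_fraction=1 / 3, seed=2015):
    """Return one 0/1 flag per row: 0 = training row, 1 = validation row.

    Hypothetical helper for illustration; not a JMP function.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    # Each row is assigned independently, so the realized proportion
    # is close to validation_fraction but not exactly equal to it.
    return [1 if rng.random() < validation_fraction else 0 for _ in range(n_rows)]


flags = random_indicator(150)  # 150 rows is an arbitrary example size
n_validation = sum(flags)
n_training = len(flags) - n_validation
print(f"training rows: {n_training}, validation rows: {n_validation}")
```

Because each row is drawn independently, the split is only approximately 2/3 to 1/3; shuffling the row indices and cutting at a fixed position would give exact counts instead.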

ajgarnello

Community Trekker

Joined:

Feb 10, 2015

Hello LouV,

Thank you for the response, though it seems I haven't made my issue clear:

I already have my data partitioned into a training and a validation set, but I don't know the steps required to create the DA model with the training data and then apply it to my validation set.

louv

Staff

Joined:

Jun 23, 2011

Solution

Sorry for my misunderstanding.

Perhaps this blog post submitted by Jeff Perkinson might help

http://blogs.sas.com/content/jmp/2010/07/06/train-validate-and-test-for-data-mining-in-jmp

And this from JMP help

Validation

ajgarnello

Community Trekker

Joined:

Feb 10, 2015

Thank you, that helped me understand!

julian

Staff

Joined:

Jun 25, 2014

Hi ajgarnello,

It sounds like you have found the answer, but in case not, I wanted to point out that discriminant analysis in JMP uses excluded rows as the validation set. So, once you've made your column to identify rows for training and validation, select all the validation rows and exclude them (you can select them all with Rows > Data Filter, or right-click one validation cell and use "Select Matching," then Rows > Exclude). Now, when you run DA, JMP will automatically fit the model using the training rows and give you classification statistics for both the training set and the validation set (the excluded rows); see the screenshot below.

Here's a link to the help page for validation in DA: Validation in Discriminant Analysis

I hope this helps!

Julian

8044_Screen Shot 2015-02-12 at 2.23.31 PM.png
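
As a rough illustration of the workflow described above (fit on the rows that are not excluded, then report misclassifications separately for the excluded rows), here is a small Python sketch. It is not JMP's implementation: the data are made up, the function names are hypothetical, and it assumes two features, a pooled covariance matrix, and equal prior probabilities.

```python
def mean_vector(points):
    """Column means of a list of (x1, x2) points."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(2)]


def fit_lda(X, y):
    """Fit a two-feature linear discriminant with a pooled covariance matrix."""
    classes = sorted(set(y))
    means = {c: mean_vector([x for x, lab in zip(X, y) if lab == c]) for c in classes}
    # Accumulate within-class sums of squares and cross-products.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x, lab in zip(X, y):
        d = [x[0] - means[lab][0], x[1] - means[lab][1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    dof = len(X) - len(classes)
    s = [[v / dof for v in row] for row in s]
    # Invert the 2x2 pooled covariance matrix directly.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    return means, inv


def predict(model, x):
    """Assign x to the class whose mean is nearest in Mahalanobis distance."""
    means, inv = model

    def dist(m):
        d = [x[0] - m[0], x[1] - m[1]]
        return sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))

    return min(means, key=lambda c: dist(means[c]))


def misclassified(model, X, y):
    """Count rows whose predicted class differs from the actual class."""
    return sum(predict(model, x) != lab for x, lab in zip(X, y))


# Made-up rows: (x1, x2, class, excluded). Excluded rows act as the validation set.
rows = [
    (1.0, 2.0, "A", False), (1.5, 1.8, "A", False), (2.0, 2.4, "A", True),
    (1.2, 2.6, "A", False), (1.8, 1.6, "A", True), (2.2, 2.1, "A", False),
    (5.0, 6.0, "B", False), (5.5, 5.7, "B", True), (6.0, 6.3, "B", False),
    (5.2, 6.5, "B", False), (5.8, 5.5, "B", True), (6.1, 6.0, "B", False),
]
train = [r for r in rows if not r[3]]
valid = [r for r in rows if r[3]]

# The model is estimated from the training rows only.
model = fit_lda([(r[0], r[1]) for r in train], [r[2] for r in train])
train_errors = misclassified(model, [(r[0], r[1]) for r in train], [r[2] for r in train])
valid_errors = misclassified(model, [(r[0], r[1]) for r in valid], [r[2] for r in valid])
print(f"training misclassified: {train_errors} of {len(train)}")
print(f"validation misclassified: {valid_errors} of {len(valid)}")
```

The point of the sketch is the separation of roles: the excluded rows never influence the class means or the covariance estimate, so their misclassification count is an honest check on the model.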

ArnoG

New Contributor

Joined:

Aug 29, 2017

Hello,

I understand this topic is outdated by now, but I want to share this information in case someone is looking for it:

I noticed that selecting rows (observations) to form a validation dataset does not work in JMP 10 and 11. In linear discriminant analysis, for example, if I select 10 observations to be used as a "test" dataset, the number of excluded observations changes, for some reason unknown to me, depending on the number of variables entered in the model, regardless of whether specific rows are hidden and/or excluded from the dataset. For these earlier versions, I concluded that the misclassification results are obtained from the "training" dataset, and I gave up on the test dataset.

I installed the trial version of JMP Pro 13.2, however, and it works perfectly. Being able to perform a quick DA with a few clicks is very nice. In contrast to the earlier versions of JMP, this version also calculates R² values and the associated number of misclassified observations for both the training and validation datasets.

Best,

Arno

Peter_Bartell

Joined:

Jun 5, 2014

ArnoG: You have found one of the main capability differences between JMP and JMP Pro. The model validation capabilities in JMP Pro are far more flexible and adapt to a much wider variety of modeling and data constructs than those in JMP. I often recommend JMP Pro to users for whom predictive modeling is a core use case and efficient cross-validation is essential to building those models.