ajgarnello
Level I

How to establish testing portion of data for multivariate DA?

Hello,

I have been using JMP 11 for a little over a year now and haven't figured out how to partition my data into training/validation sets for a multivariate discriminant analysis (i.e., train my model using a specific 2/3, then validate using the final 1/3). I also haven't been able to find help via other online resources. Can anyone help me locate this feature?


7 REPLIES
louv
Staff (Retired)

Re: How to establish testing portion of data for multivariate DA?

Have you tried making a new column in your data set, initializing it, and choosing Random followed by Random Indicator? There you can specify the proportions you want for your split.

8041_Screen Shot 2015-02-10 at 4.41.59 PM.png

8042_Screen Shot 2015-02-10 at 4.41.16 PM.png
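
For readers who want to see the idea outside the JMP dialog, here is a minimal Python sketch of a random indicator column. It is not JMP's implementation: the function name `random_indicator`, the labels, and the seed are all hypothetical, and it assumes each row is labeled independently, so the realized split is only approximately 2/3 to 1/3.

```python
import random


def random_indicator(n_rows, proportions, seed=None):
    """Return one label per row, each drawn independently with the given proportions.

    Hypothetical helper for illustration only; it does not call JMP.
    """
    rng = random.Random(seed)
    labels = list(proportions)
    weights = [proportions[label] for label in labels]
    return rng.choices(labels, weights=weights, k=n_rows)


# Assumed example: 150 rows, roughly 2/3 for training and 1/3 for validation.
split = random_indicator(150, {"Training": 2 / 3, "Validation": 1 / 3}, seed=1)
print(split.count("Training"), "training rows,", split.count("Validation"), "validation rows")
```

Because every row is drawn independently, the counts vary from seed to seed. If an exact 100/50 split matters, shuffling the row indices and cutting the list at the desired size would be one alternative.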

ajgarnello
Level I

Re: How to establish testing portion of data for multivariate DA?

Hello LouV,

Thank you for the response, though it seems I haven't made my issue clear:

I have my data partitioned into a training and validation set already, though I am unaware of the steps required to create the DA model with the training data, and then apply it to my validation set.

louv
Staff (Retired)

Re: How to establish testing portion of data for multivariate DA?

Sorry for my misunderstanding.

Perhaps this blog post submitted by Jeff Perkinson might help

http://blogs.sas.com/content/jmp/2010/07/06/train-validate-and-test-for-data-mining-in-jmp

And this from JMP help

Validation

ajgarnello
Level I

Re: How to establish testing portion of data for multivariate DA?

Thank you, that helped me understand!

julian
Community Manager

Re: How to establish testing portion of data for multivariate DA?

Hi ajgarnello,

It sounds like you have found the answer, but in case not, I wanted to point out that discriminant analysis in JMP uses excluded rows as the validation set. So, once you've made your column to identify rows for training and validation, select all the validation rows and exclude them (you can select them all with Rows > Data Filter, or right-click one validation cell and use "Select Matching," then Rows > Exclude). Now, when you run DA, JMP will automatically fit the model on the training rows and give you classification statistics for both the training set and the validation (excluded rows) set (see screenshot below).

Here's a link to the help page for validation in DA: Validation in Discriminant Analysis

I hope this helps!

Julian

8044_Screen Shot 2015-02-12 at 2.23.31 PM.png
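
The workflow described in this reply (fit on the non-excluded rows, then score the excluded rows) can be sketched in plain Python. This is only an illustration under stated assumptions, not JMP's algorithm: it assumes a linear discriminant with a pooled covariance matrix, equal priors, and two predictors, and the data, the `excluded` flag, and the function names `fit_lda`, `classify`, and `misclassified` are all made up for the example.

```python
def fit_lda(rows):
    """Fit a two-predictor linear discriminant: class means plus inverse pooled covariance."""
    groups = {}
    for x1, x2, label in rows:
        groups.setdefault(label, []).append((x1, x2))
    means = {}
    sxx = sxy = syy = 0.0
    for label, pts in groups.items():
        mx = sum(p[0] for p in pts) / len(pts)
        my = sum(p[1] for p in pts) / len(pts)
        means[label] = (mx, my)
        # Accumulate the within-class scatter around each class mean.
        for x, y in pts:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(rows) - len(groups)  # pooled degrees of freedom
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy ** 2
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return means, inv


def classify(model, x1, x2):
    """Assign the class whose mean is nearest in Mahalanobis distance (equal priors assumed)."""
    means, inv = model

    def distance(label):
        dx, dy = x1 - means[label][0], x2 - means[label][1]
        return (dx * (inv[0][0] * dx + inv[0][1] * dy)
                + dy * (inv[1][0] * dx + inv[1][1] * dy))

    return min(means, key=distance)


def misclassified(model, rows):
    """Count rows whose predicted class differs from the recorded class."""
    return sum(classify(model, x1, x2) != label for x1, x2, label in rows)


# Made-up table: (x1, x2, class, excluded). Excluded rows play the role of the validation set.
table = [
    (1.0, 1.2, "A", False), (1.5, 0.8, "A", False), (0.8, 1.9, "A", False), (2.0, 1.5, "A", False),
    (4.0, 4.2, "B", False), (4.5, 3.6, "B", False), (3.8, 4.9, "B", False), (5.0, 4.4, "B", False),
    (1.2, 1.0, "A", True), (1.8, 1.7, "A", True), (4.2, 4.0, "B", True), (4.6, 4.8, "B", True),
]
training = [(x1, x2, label) for x1, x2, label, excluded in table if not excluded]
validation = [(x1, x2, label) for x1, x2, label, excluded in table if excluded]

model = fit_lda(training)  # only the non-excluded rows inform the fit
print("Training misclassified:", misclassified(model, training), "of", len(training))
print("Validation misclassified:", misclassified(model, validation), "of", len(validation))
```

The point of the sketch is the separation of roles: the excluded rows never enter `fit_lda`, so their misclassification count is an out-of-sample check rather than a measure of fit.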

ArnoG
Level II

Re: How to establish testing portion of data for multivariate DA?

Hello,

 

I understand this topic is outdated by now, but I ought to share this information in case someone is looking for it:

I noticed that selecting rows (observations) to form a validation dataset does not work in JMP 10 and 11. In linear discriminant analysis, for example, if I select 10 observations to use as a "test" dataset, the number of excluded observations changes, for some reason unknown to me, depending on the number of variables entered in the model, regardless of whether specific rows are hidden and/or excluded from the dataset. For these earlier versions, I concluded that the misclassification results are obtained from the "training" dataset and gave up on the test dataset.

 

I installed the trial version of JMP Pro 13.2, however, and it works perfectly. The ability to perform a quick DA with a few clicks is very nice. In contrast to the previous versions of JMP, this version also calculates R² values for both the training and validation datasets, along with the associated numbers of misclassified observations.

 

Best,

 

Arno

Peter_Bartell
Level VIII

Re: How to establish testing portion of data for multivariate DA?

ArnoG: You have found one of the main feature/capability differences between JMP and JMP Pro. The model validation capabilities in JMP Pro are far more flexible and adaptable to a wide variety of modeling and data constructs than those in JMP. I often recommend JMP Pro to those for whom predictive modeling is a core use case and efficient cross-validation is paramount to building those predictive models.