
Creating a validation column in JMP PRO

agneshb

Community Trekker

Joined:

Jan 19, 2014

Is there a quick way, or an add-in, to create the validation column when using JMP Pro (without having to go through the steps in New Column > Missing Data > Random...)?

I'm aware of Save Validation in the Neural platform, but it only creates training and validation categories (I would also like to have a test category).

Thanks!

Agnès

1 ACCEPTED SOLUTION


Hi,

There should be no distinguishable pattern in the selection criterion. One of the simplest formulas uses JMP's Random Category() function, like this:

7476_RandCat.PNG

(Sorry for the decimal commas, but I work in France.) You will have to insert a new line in the function definition in the dialog box because, by default, the function offers only two.

Yves
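The formula in the screenshot is not reproduced in the text. As a rough sketch, a column formula along these lines would give the 0.6 / 0.2 / 0.2 split discussed later in the thread. The 0/1/2 coding (training/validation/test) is borrowed from Julian's reply and is an assumption about what the screenshot showed. Random Category() takes alternating probability and result arguments, with a final result that is used for the remaining probability.

```jsl
// Column formula (sketch): 60% -> 0 (training), 20% -> 1 (validation),
// and the remaining 20% -> 2 (test). Written with decimal points;
// a French locale would display decimal commas instead.
Random Category( 0.6, 0, 0.2, 1, 2 )
```

Because the draw is random for each row, the realized proportions will only be approximately 60/20/20, especially in small tables.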

6 REPLIES
markbailey

Staff

Joined:

Jun 23, 2011

When you create the new column for the validation status, you can use the Initialize Data option near the bottom of the dialog to select Random > Random Indicator. From there, I would enter the larger proportion for 0 (training) and one minus that proportion for 1 (validation). The validation proportion is up to you, perhaps 0.25 to 0.5.

agneshb

Thanks, Mark. That is how I'm doing it now, usually using 0.6 for training, 0.2 for validation, and 0.2 for test. But I have to recreate the column each time I create a new subset of data, and I was wondering if there is a simpler way to generate the column, or a script that can be saved.
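For the "script that can be saved" part of the question, something like the following might serve as a starting point. It is only a sketch: the column name "Validation", the 0/1/2 coding, and the choice to delete the formula afterwards are illustrative assumptions, not something confirmed in this thread.

```jsl
// Sketch of a reusable script: adds a validation column to the active data table.
// Assumed coding: 0 = training, 1 = validation, 2 = test.
dt = Current Data Table();

col = dt << New Column( "Validation",
	Numeric,
	"Nominal",
	// 60% -> 0, 20% -> 1, remaining 20% -> 2
	Formula( Random Category( 0.6, 0, 0.2, 1, 2 ) )
);

// Optional: evaluate the formula, then remove it so the random
// assignment is frozen and not redrawn if formulas are re-evaluated.
dt << Run Formulas;
col << Delete Formula;
```

Saved as a .jsl file, a script like this could be rerun on each new subset instead of rebuilding the column through the dialog.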

julian

Staff

Joined:

Jun 25, 2014

Hi AgnesHB,

Maybe this will help? I used Sequence() to count from 1 to 5 (with step size of 1, and repeating each value 1 time), and placed the sequence in a match function to recode 1, 2, and 3 as 0 (for training), 4 as 1 (for validation) and 5 as 2 (for test), which will fit your .6, .2, .2 setup.

JSL:    Match(Sequence(1, 5, 1, 1), 1, 0, 2, 0, 3, 0, 4, 1, 5, 2)

7454_Screen Shot 2014-10-20 at 11.38.48 PM.png

I hope this helps!

Julian

julian

Alternatively, you could use a random integer function rather than Sequence(), but using Sequence() will ensure proportions closer to what you define (since a sequence is predictable and a random integer, well, isn't). However, the above could be seriously problematic if, for some reason, there is something systematically biased across the sequence (like every 5th observation being different in some way due to the measurement system, which would mean ALL your "test" data would be of that type). Probably unlikely, but not great. A quick random shuffle of the rows would solve that problem.

Julian
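One way to combine the two ideas, near-exact proportions without a systematic pattern, might be to drive the Match() from a random permutation of the row numbers instead of from Sequence(). The sketch below assumes Col Shuffle() is available as a formula function in your JMP version; the 0/1/2 coding is the same as in the Sequence() formula.

```jsl
// Column formula (sketch): Col Shuffle() gives each row a row number drawn
// from a random permutation, so Mod( ..., 5 ) spreads the rows evenly over
// the remainders 0..4 without following the original row order.
// Remainders 1, 2, 3 -> 0 (training), 4 -> 1 (validation), 0 -> 2 (test).
Match( Mod( Col Shuffle(), 5 ), 1, 0, 2, 0, 3, 0, 4, 1, 0, 2 )
```

This avoids physically reordering the table, at the cost of a slightly less readable formula.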

julian

I wasn't aware of Random Category() until now; that's a much more elegant solution!

Julian