Emma1
Level III

Cut dataset

Hello,
I would like to split my dataset 70% / 30% into a training dataset and a test dataset for my statistical models.
I use this method:

 

[Image: Emma1_0-1628866834288.png]

 

Then I change the 0.7 to 0.3 to get the 30%.
However, when I do that, the dataset is not partitioned into two parts (70% vs 30%); it draws 70%, and then separately draws 30%.
This means the 30% dataset may contain rows that are also in the 70% dataset.

Is there a way to split the dataset into two parts, 70% and 30%, without the same rows appearing in both?


Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
ZF
Level III

Re: Cut dataset

Create a new column using the formula:

Random Binomial(1, 0.3)

It will give you approximately 30% "1" and 70% "0". Each row gets exactly one value, so the two groups cannot overlap.
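For readers who want to see the idea outside JMP, here is a minimal Python sketch of the same technique: draw one 0/1 flag per row with probability 0.3 of a 1, then subset on the flag. The function name `flag_rows` and the 1000-row stand-in table are assumptions made for illustration; they are not part of JMP.

```python
import random

def flag_rows(n_rows, p_test=0.3, seed=None):
    """Return one 0/1 flag per row; each row is flagged 1 with probability p_test."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_test else 0 for _ in range(n_rows)]

rows = list(range(1000))  # stand-in for the row numbers of a data table
flags = flag_rows(len(rows), p_test=0.3, seed=42)

# Every row has exactly one flag, so the two subsets are disjoint by construction.
train = [r for r, f in zip(rows, flags) if f == 0]
test = [r for r, f in zip(rows, flags) if f == 1]

print(len(train), len(test))  # close to 700 and 300, but usually not exact
```

Because each flag is an independent random draw, the proportions are only approximately 70/30, and the deviation is more noticeable on small tables.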

4 REPLIES 4
txnelson
Super User

Re: Cut dataset

You need to use the "Make Validation Column" utility:

     Analyze=>Predictive Modeling=>Make Validation Column

This will give you a new column that you can use, if needed, to subset the data table into 2 separate tables.
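How the JMP utility assigns rows is not described in this thread. As a rough illustration of the general idea (one assignment per row, with fixed proportions), the following Python sketch shuffles the row numbers once and cuts the shuffled list at the 30% mark. The helper `split_exact` and the 10-row example are hypothetical.

```python
import random

def split_exact(rows, test_fraction=0.3, seed=None):
    """Shuffle the rows once, then cut: the first part is test, the rest is training."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

train_rows, test_rows = split_exact(range(10), test_fraction=0.3, seed=1)

print(len(train_rows), len(test_rows))  # 7 3
```

Unlike a per-row random flag, this gives the requested sizes exactly (up to rounding), and no row can land in both parts.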

Jim
Emma1
Level III

Re: Cut dataset

Hello,

I use JMP version 16.1.0 and I can't find "Make Validation Column" in the "Analyze" menu.

 

[Image: Emma1_0-1629100101451.png]

 

Thank you

Emma1
Level III

Re: Cut dataset

It works very well !!

Thank you