
Stratified Data Partitioning (with balancing options) add-in.

This add-in allows the user to split a dataset into train/validate/test partitions. It includes options for rebalancing the proportions of the output data set's strata variable levels in relation to a focal group. This feature is useful, for example, in oversampling an event that is rare in the original data.
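
For readers who want to see the idea outside of JMP, here is a minimal Python sketch of a stratified train/validate/test split followed by oversampling of a focal level. It is not the add-in's code; the add-in's internal method is not described here. The function names (`stratified_partition`, `oversample_focal`), the 60/20/20 split, the 50% target share, and the toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_partition(rows, strata_key, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split rows into train/validate/test so that each level of
    strata_key keeps roughly the same share in every partition."""
    rng = random.Random(seed)
    by_level = defaultdict(list)
    for row in rows:
        by_level[row[strata_key]].append(row)

    parts = {"train": [], "validate": [], "test": []}
    for level_rows in by_level.values():
        rng.shuffle(level_rows)
        n = len(level_rows)
        n_train = round(n * fractions[0])
        n_valid = round(n * fractions[1])
        parts["train"] += level_rows[:n_train]
        parts["validate"] += level_rows[n_train:n_train + n_valid]
        # Whatever is left over goes to the test partition.
        parts["test"] += level_rows[n_train + n_valid:]
    return parts


def oversample_focal(rows, strata_key, focal_level, target_share, seed=0):
    """Resample focal-level rows with replacement until they make up
    roughly target_share of the returned rows."""
    rng = random.Random(seed)
    focal = [r for r in rows if r[strata_key] == focal_level]
    others = [r for r in rows if r[strata_key] != focal_level]
    if not focal or not 0 < target_share < 1:
        return list(rows)
    # Solve n_focal / (n_focal + len(others)) = target_share for n_focal.
    n_focal = round(target_share * len(others) / (1 - target_share))
    extra = max(0, n_focal - len(focal))
    return others + focal + rng.choices(focal, k=extra)


# Hypothetical toy data: the event "yes" occurs in 5% of rows.
data = [{"id": i, "event": "yes" if i < 50 else "no"} for i in range(1000)]
parts = stratified_partition(data, "event")
balanced_train = oversample_focal(parts["train"], "event", "yes", 0.5)
```

In this sketch only the training partition is rebalanced, so the validation and test partitions keep the original event rate. That is a common choice, not necessarily what the add-in does by default.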


Instructions for using the add-in are attached.


Updated 3/23/2016:  Includes additional balancing options.

Updated 9/1/2016:  Bug fixes (related to an error when running the add-in)

Updated 9/2/2016:  Added instructions (attached pdf)

Updated 11/27/2017: Uploaded revised instructions (attached pdf)




Comments or suggestions? Please contact mia.stephens of JMP's Academic team.


This add-in is going to be a great help in teaching Data Mining using JMP!

Thanks for this useful JMP add-in. Could you provide instructions on how to use it properly and get the most benefit from it, especially the new options for rebalancing the proportions of the output data set's strata variable levels in relation to a focal group? Video instructions would be preferred; if that is not possible, a PDF would at least be very helpful.

I agree with tajrida. :-)

Thanks for the comments! We will work on writing instructions for using this add-in. In the meantime, if you have a copy of Data Mining for Business Analytics with JMP Pro, this add-in was designed based on materials covered in Chapter 5 (pages 123-126).


Instructions for using the add-in have been added (as an attachment).  Please let us know if you have any questions.



Hello Brady and Mia, 
This is a great add-in. I have a similar situation: I am trying to do stratified sampling (splitting a dataset into two groups, test and control) using a numerical variable to balance the data. I want to ensure that YTD sales and QTD sales are balanced in both groups. How do I use this add-in to achieve that? It seems the add-in only accepts categorical data for stratification.