
Validation for Continuous Process Data

What inspired this wish list request? 

@FN linked to a scikit-learn article in a comment on the Validation for Continuous Processing Data add-in page. The article describes a better way to do cross-validation with continuous process data, and it reminded me that the techniques used in that add-in would be much more effective, and used more often, if they were built into JMP. This situation applies to many analyses my colleagues and I do regularly with manufacturing data.

 

What is the improvement you would like to see? 

- Incorporate grouping by time into the Make Validation Column dialog box, including functionality similar to what is in the add-in linked above to guide the user to an appropriate group size.

- Add an option for an individual table so that cross-validation everywhere in JMP uses the time-based method referenced in the link below instead of randomly assigning rows.
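To make the grouping-by-time idea concrete, here is a minimal Python sketch of what a time-grouped validation column could look like. It is only an illustration, not the add-in's code or anything JMP does today: the function name `time_block_folds` and its parameters are hypothetical, and it assumes the rows are already sorted by time.

```python
def time_block_folds(n_rows, block_size, n_folds):
    """Assign each row (assumed sorted by time) to a validation fold.

    Rows are cut into contiguous blocks of `block_size` rows, and whole
    blocks are dealt out to the folds in rotation, so neighboring rows
    stay together instead of being split up at random.
    """
    if block_size < 1 or n_folds < 2:
        raise ValueError("block_size must be >= 1 and n_folds >= 2")
    return [(row // block_size) % n_folds for row in range(n_rows)]


# Hypothetical example: 12 rows, blocks of 3 rows, 2 folds.
folds = time_block_folds(n_rows=12, block_size=3, n_folds=2)
print(folds)  # [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
```

The group size (`block_size` here) is the setting the dialog would need to guide the user on: it should be long enough that rows in different blocks are no longer strongly correlated.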

 

Why is this idea important?

Although @DrewLuebe and I created the validation add-in linked above specifically to help users better understand the predictability of models in data sets that have correlation between rows, including autocorrelation, the built-in cross-validation behavior and default validation still use per-row splits. This means that many reported validation fit metrics are artificially better than the actual performance will be with new data. By baking these techniques into JMP itself, users will have a much better understanding of their data and models no matter what tools they use.
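A small synthetic experiment can illustrate why per-row splits look too optimistic. The Python sketch below is an assumption-laden toy, not a benchmark of JMP or of the add-in: the data are a made-up random walk, the "model" is a stand-in that predicts each held-out row from the nearest training row in time, and the names `one_nn_mae`, `random_folds`, and `block_folds` are hypothetical.

```python
import random


def one_nn_mae(values, folds):
    """Cross-validated mean absolute error of a toy predictor that copies
    the value of the training row closest in time to each held-out row."""
    n = len(values)
    errors = []
    for k in sorted(set(folds)):
        train = [i for i in range(n) if folds[i] != k]
        for i in range(n):
            if folds[i] == k:
                nearest = min(train, key=lambda j: abs(j - i))
                errors.append(abs(values[i] - values[nearest]))
    return sum(errors) / len(errors)


rng = random.Random(0)

# Synthetic autocorrelated series (assumption): a 200-row random walk.
values = [0.0]
for _ in range(199):
    values.append(values[-1] + rng.gauss(0, 1))

n_folds = 5

# Per-row split: rows are assigned to folds at random.
random_folds = [i % n_folds for i in range(len(values))]
rng.shuffle(random_folds)

# Time-based split: five contiguous blocks of 40 rows each.
block_folds = [i // 40 for i in range(len(values))]

random_mae = one_nn_mae(values, random_folds)
block_mae = one_nn_mae(values, block_folds)
print(f"per-row split MAE:    {random_mae:.2f}")
print(f"time-block split MAE: {block_mae:.2f}")
```

With a per-row split, almost every held-out row has an immediate neighbor in the training set, so the error looks small. With contiguous blocks, the model has to predict rows that are far in time from anything it has seen, which is closer to how it will be used on new data.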