What inspired this wish list request?
@FN linked to a scikit-learn article in a comment on the Validation for Continuous Processing Data add-in page. The article describes a better way to do cross-validation with continuous process data, and it reminded me that the techniques in that add-in would be much more effective, and used more often, if they were built into JMP. This situation applies to many analyses my colleagues and I regularly run on manufacturing data.
What is the improvement you would like to see?
- Incorporate grouping by time into the Make Validation Column dialog box, including functionality similar to what is in the add-in linked above to guide the user to an appropriate group size.
- Add a table-level option so that cross-validation everywhere in JMP uses the time-based method referenced in the link below instead of randomly assigning rows.
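
To make the request concrete, here is a minimal Python sketch of what grouping by time could look like. It is only an illustration, not how JMP or the add-in works: the function name `assign_time_block_folds` and its parameters are hypothetical, and it assumes the rows are already sorted by time. Rows are cut into contiguous blocks, and whole blocks (never individual rows) are assigned to folds.

```python
import random

def assign_time_block_folds(n_rows, block_size, n_folds, seed=0):
    """Return a fold id for each row; rows are assumed to be sorted by time.

    Hypothetical helper for illustration only, not a JMP or add-in API.
    """
    if block_size < 1 or n_folds < 1:
        raise ValueError("block_size and n_folds must be positive")
    n_blocks = (n_rows + block_size - 1) // block_size  # ceiling division
    # Shuffle the order of blocks, not rows, so each block stays intact.
    block_order = list(range(n_blocks))
    random.Random(seed).shuffle(block_order)
    # Deal the shuffled blocks out to folds in round-robin fashion.
    block_to_fold = {block: rank % n_folds
                     for rank, block in enumerate(block_order)}
    return [block_to_fold[row // block_size] for row in range(n_rows)]

# 20 time-ordered rows, blocks of 5 consecutive rows, 2 folds.
folds = assign_time_block_folds(n_rows=20, block_size=5, n_folds=2)
print(folds)
```

The block size plays the role of the group size mentioned above: it should be large enough that rows in different blocks are only weakly correlated.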
Why is this idea important?
Although @DrewLuebe and I created the validation add-in linked above specifically to help users better understand how well models predict on data sets that have correlation between rows, including autocorrelation, JMP's built-in cross-validation and default validation still split the data row by row. With correlated rows, a per-row split puts near-duplicates of each validation row in the training set, so many reported validation fit metrics are artificially better than the actual performance will be with new data. Baking these techniques into JMP itself would give users a much better understanding of their data and models no matter what tools they use.
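
A toy simulation can show the effect. This is a sketch under stated assumptions, not a reproduction of JMP's behavior: the data is a simulated random walk (strongly autocorrelated), and the "model" is a stand-in that predicts each held-out row from the nearest training row in time. The names `nearest_train_mse`, `row_test`, and `block_test` are made up for this example.

```python
import random

def nearest_train_mse(y, test_idx):
    """MSE when each held-out row is predicted by the nearest-in-time training row."""
    test = set(test_idx)
    train = [i for i in range(len(y)) if i not in test]
    total = 0.0
    for i in test_idx:
        nearest = min(train, key=lambda j: abs(j - i))
        total += (y[i] - y[nearest]) ** 2
    return total / len(test_idx)

rng = random.Random(0)

# Simulated autocorrelated process: a random walk with unit-variance steps.
y = [0.0]
for _ in range(999):
    y.append(y[-1] + rng.gauss(0.0, 1.0))

# Per-row split: hold out 200 rows chosen at random.
row_test = rng.sample(range(1000), 200)

# Time-based split: hold out every fifth block of 50 consecutive rows (200 rows).
block_test = [i for i in range(1000) if (i // 50) % 5 == 2]

row_mse = nearest_train_mse(y, row_test)
block_mse = nearest_train_mse(y, block_test)
print(f"per-row split MSE:    {row_mse:.2f}")
print(f"time-block split MSE: {block_mse:.2f}")
```

In the per-row split, almost every held-out row has a training row one step away, so the error looks small. In the time-block split, the nearest training row can be up to 25 steps away, which is closer to the situation the model faces with genuinely new data.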