Hi @madhu,
Happy New Year 2025!
There is no single right or wrong answer to your question.
Since you're building predictive models in a Machine Learning (i.e. data-driven) way for different responses, here is how the different sets are used (a small illustration follows the list):
- Training set: Used for the actual training of the model(s),
- Validation set: Used for model optimization (e.g. hyperparameter fine-tuning, feature/threshold selection) and model selection,
- Test set: Used to assess the generalization and predictive performance of the selected model on new/unseen data. At this stage, you should no longer have several models in competition.
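To make the three roles concrete outside of JMP, here is a minimal Python/scikit-learn sketch of a three-way split; the data, proportions and random seeds are purely illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for your data table
rng = np.random.default_rng(2025)
X = rng.normal(size=(300, 5))   # 300 rows, 5 features
y = rng.normal(size=300)        # one response

# 1) Carve out the test set first and keep it untouched until the final assessment
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.20, random_state=1)

# 2) Split the remainder into training and validation sets
#    (0.25 of the remaining 80 % gives roughly a 60/20/20 split overall)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=1)

print(len(X_train), len(X_val), len(X_test))   # -> 180 60 60
```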
For "debugging" purposes, I would find it more useful to use one common validation column for all three responses: it helps you understand how much information can actually be extracted from the data, and gives a fairer assessment/comparison of how difficult each response is to predict, since the same training and validation data are used for every response.
If different validation columns change how observations are allocated to the training, validation and test sets from one response to another, it becomes harder to figure out why a certain response is not predicted correctly/precisely: is it because of the particular stratification/allocation of points used to fit the model, or because this response is intrinsically harder to predict?
On a side note, even if you use a single validation column, I would recommend creating it as a stratified validation column formula: stratifying on the features helps ensure the distributions of your training, validation and test sets are similar, preventing data distribution shifts between the sets that could compromise the generalization of your fitted model. Using the formula validation column type also enables you to create simulations and assess the robustness of your algorithm across several similar splits of your data (see the sketch after the links below).
See Solved: How can I automate and summarize many repeat validations into one output table? - JMP User C...
and Solved: Boosted Tree - Tuning TABLE DESIGN - JMP User Community
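Outside JMP, the effect of stratification can be illustrated with a small scikit-learn sketch; the feature name, quartile binning and split size below are assumptions made only for the example:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data with one key continuous feature x1
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=400),
                   "y":  rng.normal(size=400)})

# Bin the continuous feature into quartiles and stratify the split on those bins,
# so the training and validation sets see a similar distribution of x1
bins = pd.qcut(df["x1"], q=4, labels=False)
train_df, val_df = train_test_split(df, test_size=0.25,
                                    stratify=bins, random_state=42)

# Check: the quartile proportions are (almost) identical in both sets
print(bins.loc[train_df.index].value_counts(normalize=True).sort_index())
print(bins.loc[val_df.index].value_counts(normalize=True).sort_index())
```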
You can also use a K-Fold Cross-validation column with a fixed random seed (to ensure reproducibility of the model results) to assess the robustness and predictive performance of your algorithm across the different responses.
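As a rough illustration of that idea (again outside JMP, with made-up data and an arbitrary model), fixing the random seed of the folds lets you compare responses on exactly the same splits:

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical features and two responses of different difficulty
rng = np.random.default_rng(7)
X = rng.normal(size=(200, 5))
y_easy = X[:, 0] + 0.1 * rng.normal(size=200)   # strong signal
y_hard = 0.3 * X[:, 0] + rng.normal(size=200)   # mostly noise

# The same folds (fixed random seed) are reused for every response,
# so score differences reflect the responses, not the splits
cv = KFold(n_splits=5, shuffle=True, random_state=2025)
model = GradientBoostingRegressor(random_state=0)

for name, y in [("response 1", y_easy), ("response 2", y_hard)]:
    scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    print(name, round(scores.mean(), 2))
```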
Hope this answer helps,
Victor GUILLER
"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)