Conceptual question about validation

dlehman1 — Tue, 22 Apr 2025 15:43:26 GMT

I'm hoping that people with a better statistical background than I have can shed some light on this question. JMP has robust validation capabilities for all predictive modeling, including cross validation (though not in the Fit Model platform, except for the capability of using a validation set). Data scientists routinely partition their data into training and validation (and sometimes test) data sets. However, most published work using traditional statistical methods (multiple regression, logistic regression) does not use any validation. I've been wondering why that is the case. Among the possibilities I can think of: (1) statistical significance is viewed as making validation unnecessary; (2) historical practice that is slow to change; (3) the emphasis is on inference and not prediction (along the lines of Breiman's two cultures). In any case, I don't understand any justification for not using validation, so I'd like to hear what people think.

Re: Conceptual question about validation

Jed_Campbell — Tue, 22 Apr 2025 16:38:09 GMT

I suspect much of the lack of validation comes from your second option, especially in conjunction with other practices/traditions that are slow to change. If we accept the notion that academia tends to only publish significant findings, and if we accept the adage of "publish or perish," then we are naturally disincentivizing, at least to some small extent, extremely rigorous understanding before publication. This might explain some part of the recent discovery that many scientific studies can't be duplicated. In my opinion, if the scientific community adopted validation more readily and was more willing to publish "mundane" findings, I would expect to see results that are more easily duplicated, but perhaps less exciting.

I could easily be wrong or guilty of over-generalization, though. Like you, I'm interested in hearing others' thoughts.

Re: Conceptual question about validation

statman — Tue, 22 Apr 2025 19:42:40 GMT

First, please define what you mean by validation. One take might be to collect a data set from some inference space and partition it randomly(?) so that only part is used to develop a model and the rest is used to see how well that model predicts the held-out data. While this may be accepted practice, this is not my idea of validation (I may be alone in this thought).

One belief I have is to truly “validate” the model I must make physical “samples” and evaluate how well the model predicted those actual samples. To me, validation is how well the model predicts the future (changing conditions). If you have designed your study well enough, over a large enough inference space (which can be very challenging and resource intensive), then your model should perform better in future prediction.

Re: Conceptual question about validation

dlehman1 — Tue, 22 Apr 2025 20:52:24 GMT

I mean the first definition (the one you don't like), and I think that is what the literature refers to as validation. Of course, almost all models are eventually used for out-of-sample prediction and the partitioning will not address future changing conditions. Nothing will (at least not perfectly). Ultimately, your second idea is necessary - even if you have "validated" a model in the first sense, you should monitor its performance to see if it works well enough to continue to use. But most published papers (other than data science journals) don't even do the first kind of validation, and I am asking why.
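For readers less familiar with the first kind of validation (random partitioning), here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the synthetic data, the 70/30 split, the simple least-squares line, and the helper names `fit_line`, `rmse`, and `holdout_split`. It is not how JMP implements validation; it only shows the idea of fitting on one part of the data and scoring on the part that was held out.

```python
import random

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def rmse(a, b, xs, ys):
    """Root mean squared prediction error of the line a + b*x."""
    squared = [(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5

def holdout_split(rows, validation_fraction=0.3, seed=0):
    """Randomly partition rows into (training, validation) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_valid = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]

# Synthetic data (an assumption): y = 2 + 0.5*x plus Gaussian noise.
noise = random.Random(42)
rows = [(float(x), 2.0 + 0.5 * x + noise.gauss(0, 1.0)) for x in range(100)]

train, valid = holdout_split(rows)

# The model only ever sees the training rows.
a, b = fit_line([x for x, _ in train], [y for _, y in train])

train_rmse = rmse(a, b, [x for x, _ in train], [y for _, y in train])
valid_rmse = rmse(a, b, [x for x, _ in valid], [y for _, y in valid])

print(f"intercept={a:.3f} slope={b:.3f}")
print(f"training RMSE={train_rmse:.3f} validation RMSE={valid_rmse:.3f}")
```

Because the held-out rows come from the same data-collection process as the training rows, a similar training and validation error says nothing about changing future conditions, which is the limitation discussed above.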

Re: Conceptual question about validation

P_Bartell — Wed, 23 Apr 2025 11:23:38 GMT

I concur with everything that @statman and @Jed_Campbell have contributed. The only thing I'll add is maybe there is a fourth 'reason'? Until model validation is taught as a requirement of the extended problem solving process of which modeling is a component you just won't see much of it included in the literature. Just another piece or consequence of the 'historical practice that is slow to change.'

Re: Conceptual question about validation

SDF1 — Wed, 23 Apr 2025 12:37:50 GMT

Hi @dlehman1 ,

Thanks for posing an interesting conceptual question. In general, I agree with what everyone has commented on so far -- that it's important to define what "validation" means in this case. However, I would say that validation can take on both of those meanings -- a holdout set to improve model prediction as well as generating new samples that test the predictive capability of the model.

To me, both are important and need to be done -- one for improving the model, and the other to make sure the model can work, especially when generating new samples that are near the boundaries of the parameter space where models tend to be less accurate.

As to why published works tend not to use validation, it is to a large extent what you and @Jed_Campbell commented on: historical practice is slow to change -- but this is more of a result of influences within the scientific community that push toward finding a significant result rather than a robust model. My background is physics, and physicists are notorious for creating simple models for one situation and then extrapolating that out to other situations. Take the simple pendulum as an example. This is the basis for almost all basic physics problems, yet in practice it isn't the greatest model -- there have been so many tweaks and changes to it in order to have the model work in other non-ideal situations (think about having to add the electron spin into the orbital mechanics of the electronic states of an atom).

To me, this simple pendulum doesn't make a very robust model -- but it does satisfy the interest and desire to "find" something significant in the data. Sure, it's a great start, and we all need to start somewhere. However, if it comes to wanting to generate a broader, more robust model that can be utilized in more generalized areas, validation is required in both definitions that were discussed.

More "mundane" results might ultimately lead to more robust, better models and predictive capabilities in science, but the culture of the scientific community needs to change, and a de-emphasizing of the splashy, flashy new findings probably needs to happen.

In short, I would say that not only is historical practice slow to change, but there also needs to be a cultural change in the scientific community about what/how results should be recognized in science. A sort of reorienting of priorities.

Re: Conceptual question about validation

statman — Wed, 23 Apr 2025 13:00:37 GMT

I have this to offer:

“The literature as it has grown up seems to be unbalanced in its comparative neglect of the Scientific aspects of the problem, and of its Logical aspects. This perhaps might have been expected, since many of the authors, albeit talented mathematicians, have evidently never submitted their minds to the specifically educational discipline of any one of the Natural Sciences, have never taken personal responsibility for experimentation at ground level, and have no direct experience of the kind of material involved…”

Sir Ronald Fisher, Colloq. Int. Cent. Nat. Recherche Scientifique, Paris, No. 110: 13-19 (1962)

Re: Conceptual question about validation

dlehman1 — Wed, 23 Apr 2025 14:24:12 GMT

This is a common viewpoint I have seen and I certainly agree with it in spirit. But I think it is inadequate when dealing with the social sciences. Many of the issues (the effects of minimum wages, competition policy, psychological effects of social media use, etc.) are not amenable to scientific experimentation in the same way as the physical sciences. That doesn't excuse people from ignoring good scientific practice, but it does leave a gap (a large one, in my opinion) regarding how to proceed in such areas of inquiry. And I think this does relate to the issue of validation - in my mind it makes the first sort of validation (random partitioning) more essential in these areas (though not sufficient as you have indicated).

Re: Conceptual question about validation

SDF1 — Wed, 23 Apr 2025 15:12:41 GMT

Hi @dlehman1 ,

Good point about the social sciences. It's much harder to replicate a DOE or to generate a "new sample" without having the previous runs influence a participant's responses at a later time. One person's life experience is never the same as another's.

Perhaps turn this on its head and think of it as a questionnaire to the social sciences. Maybe ask the social science community questions like: when researching your topic, do you use the concept of validation during analysis and model building? Why do you use or not use validation? Are you aware of the practice of using validation? Having some open-ended prompts can provide an opportunity for the respondent to reveal more/additional information. I would suspect you will get a multitude of different responses, but there could be a few that stand out as more common -- at least from a conceptual/general idea and not specifics.

Re: Conceptual question about validation

statman — Wed, 23 Apr 2025 15:30:05 GMT

Hmmm, is it inadequate to involve subject matter experts who work in the social sciences in the development of the survey, sampling plan, or DOE, or in the interpretation of data derived from such tools?

Why are they not "amenable to scientific experimentation in the same way as the physical sciences"? Agreed the measurement systems are quite challenging, but how does validation (holding a subset of data out) improve this?

Re: Conceptual question about validation

dlehman1 — Wed, 23 Apr 2025 16:06:37 GMT

I find that a very strange comment. Suppose we want to estimate the effect of tariffs on GDP. Experiments are difficult and expensive to run, and take time - during which many other things are changing (of course, we are running such an experiment now!). This is quite different from experiments in the physical sciences. Yet, there are such experiments in the social sciences, and sometimes there are natural experiments that can be used (such as existing tariff regimes that exhibit much variation). So, I don't rule out experiments in the social sciences, but I have a hard time thinking that you believe they are just as feasible as in the physical sciences. And, I'd view medicine as somewhere between - there are many RCTs, but due to expense and ethical concerns, these are usually far smaller sample sizes than we would like (thereby omitting many subgroup analyses that we would want).

You ask how validation can improve things. I don't think it is sufficient to replace the ideal experiments you would want, but I think it is necessary in their absence. If the model does not work for data we have, why should we believe it would work in the future?
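To make the last point concrete, here is a small hedged sketch in Python of why in-sample fit alone can mislead. All of it is assumed for illustration: the data are pure noise around a constant, the "model" is a one-nearest-neighbor rule that simply memorizes the data, and the helper names `k_fold_indices`, `nearest_neighbor_predict`, and `rmse` are made up. The model reproduces its own training data exactly, yet 5-fold cross validation shows it predicts held-out rows poorly.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the row indices 0..n-1 and deal them into k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def nearest_neighbor_predict(train, x):
    """Predict y as the y of the training row whose x is closest."""
    return min(train, key=lambda row: abs(row[0] - x))[1]

def rmse(pairs):
    """Root mean squared error over (actual, predicted) pairs."""
    pairs = list(pairs)
    return (sum((y - p) ** 2 for y, p in pairs) / len(pairs)) ** 0.5

# Synthetic data (an assumption): y is noise around 5, unrelated to x.
noise = random.Random(1)
rows = [(float(i), 5.0 + noise.gauss(0, 1.0)) for i in range(60)]

# In-sample: each row's nearest neighbor is itself, so the fit is perfect.
in_sample_rmse = rmse((y, nearest_neighbor_predict(rows, x)) for x, y in rows)

# 5-fold cross validation: each row is predicted by a model that never saw it.
folds = k_fold_indices(len(rows), k=5)
held_out_pairs = []
for fold in folds:
    held = set(fold)
    train = [rows[i] for i in range(len(rows)) if i not in held]
    for i in fold:
        x, y = rows[i]
        held_out_pairs.append((y, nearest_neighbor_predict(train, x)))
cv_rmse = rmse(held_out_pairs)

print(f"in-sample RMSE={in_sample_rmse:.3f} cross-validated RMSE={cv_rmse:.3f}")
```

The gap between the two numbers is the kind of warning that only held-out data can give; it still says nothing about conditions that differ from those in the collected data.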

Re: Conceptual question about validation

statman — Wed, 23 Apr 2025 17:29:22 GMT

No doubt I am strange and likely think differently than you. If you understand the differences between enumerative and analytical situations, I will let you know I am completely biased to the analytical approach (I'm a devout determinist). I use both directed sampling (based on hypotheses) and experimentation (with emphasis on how to increase the inference space while simultaneously increasing the design precision). I am less "interested" in explanatory studies and more interested in predictive modeling (though both may be useful).

No experiments are "ideal" (we wouldn't know if they were anyway). This is why I always propose (and recommend) the investigator develop multiple different experiment/sampling plans. Each plan should be evaluated for potential knowledge gained (e.g., what can be assigned (model), what is confounded, what is restricted (inference)) and that potential knowledge compared to the resources required.

I don't want to discuss any politics. I believe there is a cause(s) for every effect. I don't care what the discipline is. Every discipline has its challenges with using the data acquisition tools. That is no excuse for not using them.