Discussions

MetaLizard62080 · Mar 27, 2024 08:54 AM

Hi,

When analyzing a DSD (definitive screening design), you are given the option to apply "obeys strong heredity" rules for quadratics and for interactions.

I believe this limits the model by allowing only interactions whose two factors are both in the model as main effects. I'm not sure how this works for quadratics, unless it allows only quadratics for factors that also have a main effect.

When evaluating your DSD, how do you determine what the "best" model is for your data? I get very different answers depending on which options I choose here so I would like to have some metric/rationale for my choice before proceeding with further designs.

txnelson · Mar 27, 2024 11:40 AM

As a start, check out the Statistical Details for the Fit Definitive Screening platform in the JMP Help.

Jim

Mark_Bailey · Mar 27, 2024 12:36 PM

Enforcing heredity is the most restrictive search, and relaxing heredity is the least restrictive search. These two forms of searching can lead to different solutions. Each solution depends only on the data in hand (no other information) and on the search method used.

Do you have prior information about the effects of your factors to guide you about which model is most reasonable?

You might use both solutions to predict the response under the same conditions (i.e., factor settings) and then perform confirmatory runs to verify (hopefully) one of the models.
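A minimal sketch of that comparison in Python, not JMP's implementation. Everything here is an assumption made for illustration: the two candidate models, the made-up process function `hypothetical_process` that stands in for real measurements, and the 3x3 grid that stands in for the original design (it is not a DSD). The idea is only to fit each candidate on the original runs and score it on runs that were not used for selection.

```python
import numpy as np

def terms_main_only(X):
    """Candidate A (hypothetical): intercept plus both main effects."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2])

def terms_with_interaction(X):
    """Candidate B (hypothetical): intercept, X1, and X1*X2 without X2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x1 * x2])

def confirmation_rmse(build_terms, X_design, y_design, X_confirm, y_confirm):
    """Fit by least squares on the design runs, score on the confirmation runs."""
    beta, *_ = np.linalg.lstsq(build_terms(X_design), y_design, rcond=None)
    errors = y_confirm - build_terms(X_confirm) @ beta
    return float(np.sqrt(np.mean(errors ** 2)))

def hypothetical_process(X):
    """Made-up, noise-free stand-in for the real process (illustration only)."""
    return 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 0] * X[:, 1]

# Original runs: a 3x3 grid in coded units, standing in for the real design.
levels = [-1.0, 0.0, 1.0]
X_design = np.array([[a, b] for a in levels for b in levels])
y_design = hypothetical_process(X_design)

# Confirmation runs at settings that were not part of the original runs.
X_confirm = np.array([[0.5, 0.5], [-0.5, 0.5], [0.5, -0.5]])
y_confirm = hypothetical_process(X_confirm)

rmse_main = confirmation_rmse(terms_main_only, X_design, y_design, X_confirm, y_confirm)
rmse_inter = confirmation_rmse(terms_with_interaction, X_design, y_design, X_confirm, y_confirm)
print("main effects only:", rmse_main)
print("with interaction:", rmse_inter)
```

With real data, the responses at the confirmation settings come from new runs rather than from a function, and the candidate with the smaller confirmation error has the stronger independent support.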

MetaLizard62080 · Mar 27, 2024 09:41 PM

Hi Mark,

I have some historical knowledge for this process as we are now going back and statistically characterizing a complex system that was not statistically built.

I would say that removing the heredity rules gives trends much closer to what I expect, but I'd like to provide a rationale for my choice rather than saying "Based on extensive historical knowledge". The DSD platform is something I'd like to use in the future, and I would like to get buy-in from colleagues.

Is there a way to determine when to use heredity rules (a statistical rule of thumb, some sort of design/data evaluation technique, etc.)?

If not, is the best process to perform confirmation runs under multiple scenarios and determine which model better fits as you mentioned previously?

statman · Mar 28, 2024 10:45 AM

Here are my thoughts that you can ignore if you find them argumentative:

1. It is less a reliance on historical knowledge and more a reliance on subject matter knowledge. You would want a significant amount of data to support a model that runs counter to scientific/engineering theory.

2. I struggle with your use of words. I don't understand what "statistically characterizing" or "statistically built" means. We use statistics as an effective and efficient way of understanding causal structure and of building empirical models.

3. There are principles we use as guidance to developing understanding of causal structure using fractional factorials:

Sparsity (only a few of the many factors are needed for a useful model)
Hierarchy (first order > 2nd order >> 3rd order, etc.) We typically build models in Taylor series order.
Heredity (for a higher-order effect to be active, at least one parent must be active; this is the weak form, while strong heredity requires every parent to be active). This is used when there are significant effects of the same order and hierarchy can't provide guidance as to which is the active effect.

These are for guidance and are not always true (they are not rules per se). If, after these guiding principles, you still can't select the significant effect from a string of aliased effects, you will need to run more treatments. Where, in relation to the design space, you run these additional runs is situation dependent.
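To make the heredity principle concrete, here is a small Python sketch of how weak and strong heredity narrow the list of second-order candidates. It is an illustration, not JMP's algorithm: the function name `allowed_second_order_terms`, the factor labels, and the choice to treat a quadratic as having a single parent are all assumptions.

```python
from itertools import combinations

def allowed_second_order_terms(active_main, factors, heredity="strong"):
    """List the second-order terms a heredity rule permits (hypothetical helper).

    heredity: "strong" (all parents active), "weak" (at least one parent
    active), or "none" (no restriction). Terms are returned as factor pairs;
    (f, f) denotes the quadratic term for factor f.
    """
    if heredity not in ("strong", "weak", "none"):
        raise ValueError("heredity must be 'strong', 'weak', or 'none'")
    active = set(active_main)
    terms = []
    # A quadratic has one parent, so weak and strong heredity coincide for it.
    for f in factors:
        if heredity == "none" or f in active:
            terms.append((f, f))
    # A two-factor interaction has two parents.
    for a, b in combinations(factors, 2):
        n_active = (a in active) + (b in active)
        needed = {"none": 0, "weak": 1, "strong": 2}[heredity]
        if n_active >= needed:
            terms.append((a, b))
    return terms

factors = ["A", "B", "C"]   # hypothetical factor labels
active_main = ["A", "B"]    # main effects assumed to be active
strong_terms = allowed_second_order_terms(active_main, factors, "strong")
weak_terms = allowed_second_order_terms(active_main, factors, "weak")
free_terms = allowed_second_order_terms(active_main, factors, "none")
print(len(strong_terms), len(weak_terms), len(free_terms))
```

The candidate list grows as the rule is relaxed, which is one reason the selected model can change so much between settings.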

4. When building models, there are a number of elements to take into account. First (and foremost) is the scientific or engineering justification (hypotheses). Then a number of useful statistics, for example:

RMSE (the model with the smaller RMSE is better)
p-Values (statistical significance assuming the MSE is a good estimate of the random errors)
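Delta between R-square and adjusted R-square (the larger the delta, the more likely the model is over-specified)
Residuals (should be normally and independently distributed with mean zero and constant variance, i.e., NID(0, variance))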

Again, none of these is definitive, and all require interpretation given how the data were obtained and the situation.
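For readers who want to see how the first and third statistics are computed, here is a rough Python sketch using ordinary least squares. The helper name `fit_summary` and the tiny data set are assumptions for illustration, and the numbers are not from any real experiment.

```python
import numpy as np

def fit_summary(X, y):
    """Least-squares fit plus RMSE, R-square, and adjusted R-square.

    X is the model matrix including the intercept column, so p counts
    the intercept. Requires more runs than model terms (n > p).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    sse = float(residuals @ residuals)            # error sum of squares
    sst = float(np.sum((y - y.mean()) ** 2))      # total sum of squares
    rmse = (sse / (n - p)) ** 0.5                 # root mean square error
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (sse / (n - p)) / (sst / (n - 1))
    return {"rmse": rmse, "r2": r2, "adj_r2": adj_r2,
            "delta": r2 - adj_r2, "residuals": residuals}

# Made-up responses at coded settings of one factor (illustration only).
x = np.array([-1.0, 0.0, 1.0, -1.0, 0.0, 1.0])
y = np.array([1.0, 2.0, 3.0, 1.2, 2.1, 2.9])
X_linear = np.column_stack([np.ones_like(x), x])
X_quadratic = np.column_stack([np.ones_like(x), x, x ** 2])

linear = fit_summary(X_linear, y)
quadratic = fit_summary(X_quadratic, y)
print("linear:", linear["rmse"], linear["delta"])
print("quadratic:", quadratic["rmse"], quadratic["delta"])
```

The returned residuals can then be plotted against predicted values and run order to check the NID assumption by eye.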

5. There is no one DOE strategy that will be the most effective or efficient. Selection of the appropriate experiment strategy is situation dependent. My suggestion is always to design multiple experiments and assess what can be learned from each and contrast that with the resources required. Then pick one and prepare to iterate.

"All models are wrong, some are useful" G.E.P. Box

Mark_Bailey · Apr 1, 2024 03:00 PM

Model heredity is often used to guide the selection of terms in the initial model. Studies have shown that significant interaction terms are often accompanied by significant main effects for the factors involved. Further experimentation supplements the original data and might help resolve which terms are significant. The best way to confirm a model selected by any analysis method is to predict the response to one or more conditions not included in the original experiment. This new and independent evidence was not used to select the model.

Model heredity is also a peculiar attribute of the particular model used by Fit Definitive Screening. The linear statistical model is a polynomial. JMP codes the factor levels used in the regression analysis. Maintaining model heredity in the regression assures us that a transformation of variables later will not alter the terms in the linear combination.
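A short worked example of the coding point, under assumed notation that is not taken from the JMP documentation: u_i is a factor in natural units, c_i its center, h_i its half-range, and x_i the coded level. Starting from a coded model that has an interaction but no main effects, converting back to natural units brings both main effects in:

```latex
\begin{align*}
x_i &= \frac{u_i - c_i}{h_i}, \qquad i = 1, 2 \\
y &= \beta_0 + \beta_{12}\, x_1 x_2 \\
  &= \beta_0 + \frac{\beta_{12}}{h_1 h_2}\,(u_1 - c_1)(u_2 - c_2) \\
  &= \left(\beta_0 + \frac{\beta_{12}\, c_1 c_2}{h_1 h_2}\right)
     - \frac{\beta_{12}\, c_2}{h_1 h_2}\, u_1
     - \frac{\beta_{12}\, c_1}{h_1 h_2}\, u_2
     + \frac{\beta_{12}}{h_1 h_2}\, u_1 u_2
\end{align*}
```

Unless c_1 = c_2 = 0, the natural-unit model contains u_1 and u_2 even though the coded model had no main effects. A model that already includes both parents keeps the same set of terms under this kind of recoding; only the coefficients change.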


How to choose heredity rules in definitive screen
