gfholland
Level II

DOE: Would a "Top-Down" approach converge on the more conventional "Bottoms-Up" factor analysis?

I am a relative newcomer to the use of DOE in R&D: I'd been a long-time skeptic but have recently come to see the HUGE power that it offers, and have embraced JMP as my tool of choice.

In the process of learning how to apply DOE methodologies, I've come to think of its preferred implementation as a sort of "bottoms-up" approach: we start by identifying ALL the factors that might contribute to a measured outcome, and then create a design that allows us to screen the impact of each factor, identifying the most critical ones. We can generate a model to represent the system of interest, dropping from that model factors that do not contribute significantly, then refining our model with those drivers. It seems to me that we might occasionally find that our model STILL does not represent our observations well - and perhaps we've overlooked a factor or two. We can remedy this by augmenting our model.

For a complex system with a bundle of factors, this conventional approach makes for a big test matrix. I know I can reduce the size of that matrix by making use of fractional factorials and custom / optimal designs. For a skeptical client AND a complex system, this still makes for a tough sell.

But: is there something necessarily BAD about starting with a "top-down" approach? Could we use our subject matter knowledge to identify some handful of factors that we believe to be the performance drivers, and with them a test matrix and then a statistical model of observed data? If this leads to a good fit of observed data, then GREAT. But if this model falls short in adequately addressing the observations, then we could use this evident shortcoming in the model to improve it by incorporation of additional factors (some of that big bundle that we'd chosen to set aside at the outset …) and more testing? Would not the mathematics of "top-down" and "bottoms-up" approaches tend to converge on the same model?
Or is my notion of a "top-down" approach really just a variation on a custom design?
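
To put rough numbers on the test-matrix size mentioned above, here is a minimal sketch in plain Python (not JSL, and not how JMP builds its designs). It assumes five two-level factors and a half fraction whose fifth factor is set to the product of the other four; the function names are made up for illustration.

```python
from itertools import product
from math import prod

def full_factorial(k):
    """All 2**k combinations of k two-level factors coded as -1/+1."""
    return [list(run) for run in product((-1, 1), repeat=k)]

def half_fraction(k):
    """2**(k-1) runs: the last factor is set to the product of the others."""
    return [run + [prod(run)] for run in full_factorial(k - 1)]

# Assumed example: 5 factors, each at two levels.
full = full_factorial(5)
half = half_fraction(5)
print(len(full), len(half))  # 32 16
```

The half fraction halves the run count, at the price of confounding some higher-order interactions with each other.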

7 REPLIES

Re: DOE: Would a "Top-Down" approach converge on the more conventional "Bottoms-Up" factor analysis?

There are many reasons for the lack of fit, including your suggestion. I will not address that portion of your discussion now.

 

DOE is usually considered part of a more extensive quality program or initiative. The design follows prior work to identify the relevant factors and responses. This result might come from using the Seven Quality tools, for example. It might come from a Six Sigma or Quality by Design project. Subject matter expertise is always essential, though I have often encountered stifling comments such as "But we know that..." based on anecdotal experience, not scientific data. (If we already know so much, why can't we explain what is happening or know what to do about it?) Many related tools involve collecting many ideas and then organizing and prioritizing them for study.

Re: DOE: Would a "Top-Down" approach converge on the more conventional "Bottoms-Up" factor analysis?

There is really nothing wrong with the "top-down" approach, as you called it. However, think about the worst-case scenario. You control some factors for the design, then determine that the model does not fit the data. So you need to add more factors. But where were those factors set during the first set of runs? Were they changing? Depending on the answers to those questions, you may need to start over. Even if their settings were recorded, you still have to worry that the earlier runs were not randomized with respect to them.

 

As Mark suggests, most other approaches have you identify the factors that you want to change, but also the factors that MAY influence the response and that you do not want to change. The factors in that MAY group need to be accounted for in some fashion (held constant, treated as blocking variables, etc.).
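
A toy simulation can illustrate the worry about unrecorded, unrandomized factors. This is only a sketch under made-up assumptions: one studied factor with a true coefficient of 2.0, and an uncontrolled factor that drifts steadily with run order (0.5 per run). None of these numbers come from a real study.

```python
import random
from statistics import mean

TRUE_COEF = 2.0   # assumed true coefficient of the studied factor (coded units)
DRIFT = 0.5       # assumed per-run drift caused by an uncontrolled factor

def estimate_coef(levels):
    """Estimate the factor coefficient when run t also carries a drift of DRIFT * t."""
    y = [TRUE_COEF * x + DRIFT * t for t, x in enumerate(levels)]
    high = mean(v for v, x in zip(y, levels) if x == 1)
    low = mean(v for v, x in zip(y, levels) if x == -1)
    return (high - low) / 2

# All low runs first, then all high runs: the drift is confounded with the factor.
standard_order = [-1, -1, -1, -1, 1, 1, 1, 1]
unrandomized = estimate_coef(standard_order)  # 3.0, not the true 2.0

# Averaged over many random run orders, the drift no longer biases the estimate.
rng = random.Random(1)
randomized = []
for _ in range(2000):
    order = standard_order[:]
    rng.shuffle(order)
    randomized.append(estimate_coef(order))
randomized_mean = mean(randomized)
print(unrandomized, round(randomized_mean, 2))
```

Any single randomized run order can still be off; randomization only guarantees that the drift does not systematically favor one factor level.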

Dan Obermiller
statman
Super User

Re: DOE: Would a "Top-Down" approach converge on the more conventional "Bottoms-Up" factor analysis?

Here are my thoughts:

There are two fundamental reasons why we are continuously "optimizing":

1. We never start with all. The factors we study are always a small subset of all of the factors. Decisions get made as to what to include (or not) in initial studies. This is most often a function of intuition, gut feel, experience, and collaboration. It is interesting to note that investigators will drop factors from the study with no data. What happens to these factors in subsequent studies?

2. New or alternative applications of materials/technologies are constantly being invented.

 

I think your proposal misses some important elements of investigation that are not necessarily DOE related, but greatly improve the use of experimentation. For example, directed sampling can be used to understand which components of variation (sets of x's) have the greatest leverage, to provide an assessment of stability, and to evaluate measurement uncertainty. Iteration is the key to developing a thorough understanding (cycles of induction-deduction). I also believe typical investigators don't include noise in their experimental strategies sufficiently to assess whether the model will have consistent "performance" over further conditions. There is a huge bias toward developing optimal strategies for the design factors and very little toward noise strategies.

"All models are wrong, some are useful" G.E.P. Box
gfholland
Level II

Re: DOE: Would a "Top-Down" approach converge on the more conventional "Bottoms-Up" factor analysis?

Thanks @STAT  – and @Dan_Obermiller and @Mark_Bailey – for the feedback. Your comments are very helpful.

@statman , "We never start with all" is a great point for my mental calibration. And your comments re: noise are spot on. My case of interest is coming late in a development project in which we've yet to conduct a clear systematic exploration of those factors exerting greatest leverage. At the same time we are trying to define / understand the nominal performance band - and the noise therein.

@Dan_Obermiller , you give me some ideas of how to account for those "other" factors - the "MAY" group. I was uncomfortable with "active ignorance", whereas I can see a means to identify and document those factor settings – even if we don't explicitly explore them in the test matrix.

And @Mark_Bailey , "If we already know so much, why can't we explain…" struck all too close to the mark. As I mention above, I am trying to insert some more systematic testing at a point that is late and arguably "too late" in the game. As I see it, my best opportunity of doing so requires a very skinny set of factors, intentionally setting aside many others in the interest of clarifying the impact of what our collective SME tells us are likely high-leverage drivers.

Thanks, all, again.

Re: DOE: Would a "Top-Down" approach converge on the more conventional "Bottoms-Up" factor analysis?

We could make many comments to help you, but I want to make one more now. I often see designs vary factors over a very narrow range. Why? Because we know or think that we will find 'the answer' in this neighborhood. Again, not much evidence behind such a decision. It is also an example of a 'pick the winner' mentality. If you get what you want, you are done. You don't know, though, if that is the best you could have gotten if you had explored a wider region for all your factors. Narrow ranges also hamper the power of the design to detect actual effects. You can improve the power by increasing the number of runs or widening the factor range.
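
The link between factor range, run count, and power can be sketched with a simple normal approximation. This is an illustration only, not the calculation JMP performs: it assumes a balanced two-level design, a known noise standard deviation, and a linear effect, and every number below is invented.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(slope, half_range, sigma, n_runs, alpha=0.05):
    """Normal-approximation power to detect one factor in a balanced two-level design."""
    std_normal = NormalDist()
    delta = slope * half_range   # coefficient in coded (-1/+1) units
    se = sigma / sqrt(n_runs)    # standard error of that coefficient
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta / se
    # Two-sided test: probability that the estimate falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical scenario: slope of 1.0 per natural unit, noise sigma of 2.0.
narrow = approx_power(slope=1.0, half_range=0.5, sigma=2.0, n_runs=8)
wide = approx_power(slope=1.0, half_range=2.0, sigma=2.0, n_runs=8)
more_runs = approx_power(slope=1.0, half_range=0.5, sigma=2.0, n_runs=32)
print(round(narrow, 2), round(wide, 2), round(more_runs, 2))
```

In this made-up case, quadrupling the half range with 8 runs raises the power more than quadrupling the number of runs at the narrow range, because the coded effect grows in proportion to the range while the standard error shrinks only with the square root of the run count.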

 

Also, DOE is not about picking the winner. It is about curating a model. The purpose of a design is essentially to estimate and test the parameters in the model. Period. Anything else is 'pick the winner' stuff. You want your models to have a long lifetime. They should be able to represent your reality well enough to be helpful, so take care of them.

gfholland
Level II

Re: DOE: Would a "Top-Down" approach converge on the more conventional "Bottoms-Up" factor analysis?

I certainly bear some guilt in the "pick the winner" thinking, and it's a criticism I'll bear in mind going forward. Your "curating the model" comment strikes closer to the ideal I seek, and appeals to the power I see in the ability of experimental design to iteratively refine a mathematical model describing observed results. Further, your argument to explore a wide range of factors falls on receptive ears: I think we don't truly know how to spec a system design until we've pushed that design to some breaking point. For my current system of interest, I think the most valuable next piece of data is a failure in initial performance: everything we throw at it seems to work fine (at first …).

I see flaws in my notion of developing a "top down" model, but I see a "bottom up" approach as too time consuming too late in the game. A "top down" approach just might help us generate a good enough model to foster discussion on how we might iteratively improve that model and better control our system, incorporating those other factors (including some from the "MAY" group) that we have intentionally set aside. There is SO much we don't know, but we at least have a system that offers encouragement that we're doing SOMETHING right.

Thanks.

statman
Super User

Re: DOE: Would a "Top-Down" approach converge on the more conventional "Bottoms-Up" factor analysis?

Just to add a further thought...one of my favorite quotes applies to many situations:

"All models are wrong, some are useful" G.E.P. Box

In reality, the approach you take is situation dependent. There are "clues" we look for to help us understand the situation and therefore make rational and logical decisions as to how to proceed. It is unlikely that anyone will ever design the perfect experiment, as most experimentation contains an element of the unknown. So my advice is to design multiple experiments (or sampling plans), predict the potential knowledge to be gained from each plan (e.g., what effects can be estimated, what is confounded, what is restricted), and weigh this knowledge against the constraints imposed (e.g., resources, time). Ultimately, all you can do is due diligence and then get the data. Be prepared to iterate (in fact, predict ALL possible outcomes and what you will do with that information).
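
As one way to "predict the knowledge to be gained", two candidate plans of the same size can be compared by listing which effects they confound. The sketch below is a hypothetical illustration in plain Python: it assumes four two-level factors in 8 runs, with the fourth factor D generated either as ABC or as AB, and it only checks main effects and two-factor interactions.

```python
from itertools import combinations, product
from math import prod

def fraction(generator):
    """8-run design in A, B, C with D set to the product of the generator columns."""
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        base = {"A": a, "B": b, "C": c}
        base["D"] = prod(base[name] for name in generator)
        runs.append(base)
    return runs

def aliased_pairs(runs):
    """Pairs of terms (main effects and two-factor interactions) with identical columns."""
    factors = "ABCD"
    terms = [(f,) for f in factors] + list(combinations(factors, 2))
    column = {t: tuple(prod(r[f] for f in t) for r in runs) for t in terms}
    pairs = []
    for s, t in combinations(terms, 2):
        same = column[s] == column[t]
        opposite = column[s] == tuple(-v for v in column[t])
        if same or opposite:
            pairs.append(("".join(s), "".join(t)))
    return pairs

plan_1 = aliased_pairs(fraction("ABC"))  # candidate plan with D = ABC
plan_2 = aliased_pairs(fraction("AB"))   # candidate plan with D = AB
print(plan_1)
print(plan_2)
```

Both plans cost 8 runs, but the first confounds two-factor interactions only with each other, while the second confounds three main effects with two-factor interactions. That difference can be weighed against the constraints before any data are collected.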
