JMP User Community : Discussions : Re: DSDs and sparsity of effect principle

DSDs and sparsity of effect principle

Apr 29, 2019 2:27 PM
(1267 views)

This is more of a conceptual question related to DSDs and their reliance on the sparsity-of-effects/Pareto principle. I am trying to create a DoE for characterization of a bioprocess for manufacturing of a drug product. The goal is to collect data for the parameters we've identified, build a model, and use the prediction profiler in JMP to set limits for each parameter so there are no "failures" in, say, 10,000 runs. When selecting our parameters, we did a risk analysis of all the parameters in our process. We began with about 40 parameters, ranked them by risk of impact to the process outputs, and eliminated the bottom ~80%, leaving us with about 10 factors to test. What I am curious about is whether it is still appropriate to assume the sparsity-of-effects principle that the DSD seems to rely on (if I'm understanding it correctly), since we already eliminated what we considered to be the bottom ~80% of parameters. Shouldn't a large portion of the remaining factors be significant then? Granted, out of the 40 parameters we began with, a lot of them were minor and we included them only to do our due diligence.
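The "no failures in 10,000 runs" criterion can be checked by simulation once a model has been fitted, which is essentially what the prediction profiler's simulator does. A rough sketch of the idea in pure Python — the model, parameter names, and spec limits here are all hypothetical, purely for illustration:

```python
import random

def failure_rate(predict, param_limits, spec_lo, spec_hi, n=10_000, seed=1):
    """Monte Carlo check of the kind the profiler's simulator performs:
    draw each parameter uniformly within its proposed limits, predict
    the response, and count out-of-spec results.  `predict` is a
    stand-in for whatever model was fitted to the DoE data."""
    rng = random.Random(seed)
    fails = 0
    for _ in range(n):
        x = {p: rng.uniform(lo, hi) for p, (lo, hi) in param_limits.items()}
        y = predict(x)
        if not (spec_lo <= y <= spec_hi):
            fails += 1
    return fails / n

# hypothetical two-parameter linear model, for illustration only
model = lambda x: 50 + 2.0 * x["temp"] - 1.5 * x["pH"]
rate = failure_rate(model, {"temp": (-1, 1), "pH": (-1, 1)}, 45, 55)
```

With these made-up limits every prediction stays inside 45–55, so the simulated failure rate is zero; tightening the spec (or widening the parameter windows) is how you would find limits that start producing failures.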

My follow-up question: if we have the budget to test ~48 conditions, would it be better to do a custom design and simply remove interaction effects I don't believe will be important until the N drops below 48? A DSD only requires 25 conditions for 10 continuous factors, and since I don't have much characterization experience, I'm a little worried it won't provide enough data to confidently model where we should put our process limits if any of the parameter ranges we have decided to test lead to results that fall outside our output specifications. I realize we could always do a DSD and then supplement if we need more runs, but this would be less efficient for us (since it would require more blocks) than just starting with a design that accounts for 48 runs.
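For reference, the DSD run counts quoted above follow simple arithmetic under the usual construction: a DSD for m continuous factors (m even) needs m fold-over pairs plus one center run, and extra runs are added in fold-over pairs. A small sketch of the bookkeeping, assuming that construction:

```python
def dsd_min_runs(m, extra_pairs=2):
    """Run count for a definitive screening design with m continuous
    factors (m even): m fold-over pairs plus one center run, plus any
    extra fold-over pairs.  extra_pairs=2 corresponds to the 4 extra
    runs that give the 25-run design mentioned for 10 factors."""
    return 2 * m + 1 + 2 * extra_pairs

bare_minimum = dsd_min_runs(10, extra_pairs=0)   # 21 runs
with_extras = dsd_min_runs(10)                   # 25 runs
```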

Thanks for your help!

18 REPLIES


In the first paragraph of your opening remarks, it sounds to me like you are concerned that, with respect to the 10 factors, a majority of them will be active. What does your process and domain expertise tell you? Do you have pre-existing process data that you can leverage, using data visualization, exploratory data analysis, and simple-to-advanced modeling techniques, to help inform your knowledge? If you truly have ZERO domain/process knowledge, then I think you should go the DSD route as a FIRST experiment to build that knowledge. From there, with remaining resources, work on the optimization part of your problem using a SECOND experiment. In practice, the new product/process development teams I worked on found the sequential DOE process the most efficient way to solve the practical problem. DSDs are first and foremost SCREENING designs.

Re: DSDs and sparsity of effect principle

The idea of screening is to separate the "*trivial many from the vital few*."

I would consider adding a few more runs above the minimum number of runs in your DSD. JMP now defaults to 4 extra runs. You might consider adding 1 or 2 more runs. This small increase will dramatically improve the power of your design.
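The effect of those extra runs on power can be illustrated with a crude Monte Carlo sketch — a single main effect, an alternating ±1 column, and a simple two-standard-error detection rule standing in for a real power analysis (this is not JMP's power calculation, just the underlying intuition):

```python
import random

def power_sim(n_runs, effect=0.5, sigma=1.0, n_sim=2000, seed=7):
    """Crude Monte Carlo power estimate for detecting one main effect
    in a two-level design: declare the effect 'found' when the
    estimated coefficient exceeds twice its standard error
    (roughly a 5% test).  Effect and sigma are hypothetical."""
    rng = random.Random(seed)
    x = [(-1) ** i for i in range(n_runs)]   # alternating +/-1 design column
    se = sigma / n_runs ** 0.5               # std. error of the coefficient
    hits = 0
    for _ in range(n_sim):
        y = [effect * xi + rng.gauss(0, sigma) for xi in x]
        beta = sum(xi * yi for xi, yi in zip(x, y)) / n_runs
        if abs(beta) > 2 * se:
            hits += 1
    return hits / n_sim

p_21, p_25 = power_sim(21), power_sim(25)    # a few extra runs raise power
```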

No, **do not** take your knowledge from the DSD or any other initial experiment and start a subsequent experiment with Custom Design. Use the **Augment Design** platform. This platform uses the custom design algorithm but it also uses your existing runs. So you can incrementally improve the power and predictive performance of a design. This approach is known as *sequential experimentation*, which has been advocated for almost five decades.

(By the way, the DSD is a special case of an alias-optimal custom design.)
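To make the "custom design algorithm plus your existing runs" idea concrete, here is a toy sketch of D-optimal augmentation: greedily add candidate runs that maximize det(X'X) of the combined design. This is a deliberately simplified stand-in for the coordinate-exchange algorithm a real custom-design engine uses; all helper names are hypothetical, and exhaustive candidate search only works for a handful of factors:

```python
from itertools import product

def det(m):
    """Determinant by Gaussian elimination (small matrices only)."""
    m = [row[:] for row in m]
    n, d = len(m), 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        if abs(m[p][i]) < 1e-12:
            return 0.0
        if p != i:
            m[i], m[p] = m[p], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def xtx(rows):
    """Information matrix X'X of a design whose rows include the model columns."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]

def augment(existing, n_new):
    """Greedily add runs from a -1/0/+1 candidate set so that each
    addition maximizes det(X'X) of existing + new runs combined --
    the D-optimality idea behind augmenting a design, in toy form."""
    k = len(existing[0]) - 1                  # factors (first column = intercept)
    candidates = [(1,) + c for c in product((-1, 0, 1), repeat=k)]
    design = [list(r) for r in existing]
    for _ in range(n_new):
        best = max(candidates, key=lambda c: det(xtx(design + [list(c)])))
        design.append(list(best))
    return design

# toy example: augment a 4-run, 2-factor design with 2 more runs
old = [[1, -1, -1], [1, 1, -1], [1, -1, 1], [1, 1, 1]]
new = augment(old, 2)
```

Because the existing runs stay in the information matrix, each added run improves the combined design rather than starting from scratch — which is the point being made about Augment Design versus a fresh custom design.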

Learn it once, use it forever!

Re: DSDs and sparsity of effect principle

Hi @markbailey, thanks for your help. I have played with the Augment Design tool a little before, but I would like some more clarity on how to differentiate the vital few from the trivial many when I use it. When you first open "Augment Design", a screen comes up that asks you to select your response and factor parameters. Let's say that only 4 out of the 10 variables come up as significant in the initial DSD experiment. Should I select only those 4 factors in this screen? Or should I select all 10, and then on the next screen, where it asks for model terms, remove the terms that aren't significant and include the interaction effects and polynomial terms I am interested in? Or do these amount to the same thing?

Thank you

Re: DSDs and sparsity of effect principle

You only carry forward the factors that you decide are active. That decision might not be clear-cut for all of the original factors.

Yes, you should modify the model terms to update them based on your current knowledge or questions.

I recommend **Help** > **Books** > **Design of Experiments**. There are chapters about screening, DSD, and augmenting designs.

Learn it once, use it forever!

Re: DSDs and sparsity of effect principle

It all depends on how you view production length. Is it a factor (input) used to influence the outcome, or is it part of a repeated-measures design? Would you select a level for it when optimizing, or determine a window in the design space for it? If so, then it is a factor.

Are you interested in the time course? If so, then you can use JMP Pro Functional Data Explorer, although three points is not much of a function. Or, if you have regular JMP, you could fit another model for Y(time) and use the model parameters as responses. Either way, it is part of the response.
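The "model Y(time), then use the model parameters as responses" suggestion can be sketched in a few lines: fit something simple, like a straight line, to each run's three time points and carry the fitted slope forward as a response in the DoE analysis. A minimal illustration with hypothetical numbers:

```python
def timecourse_params(times, ys):
    """Least-squares fit of y = a + b*t to one run's time course,
    returning (intercept, slope).  The slope can then be analyzed as
    a response in its own right.  With only three time points, a
    straight line is about as rich a model as the data supports."""
    n = len(times)
    tbar = sum(times) / n
    ybar = sum(ys) / n
    b = sum((t - tbar) * (y - ybar) for t, y in zip(times, ys)) / \
        sum((t - tbar) ** 2 for t in times)
    a = ybar - b * tbar
    return a, b

# hypothetical run: measurements on days 0, 7, and 14
a, b = timecourse_params([0, 7, 14], [1.0, 1.4, 1.9])
```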

Bias is a model thing, not a design thing. (Of course, the design must support any of the models in question.)

Efficiency is a model thing, economy is a design thing.

Learn it once, use it forever!

Re: DSDs and sparsity of effect principle

Please do not 'put words in my mouth.' I did not say that your idea of treating time as part of the response is "improper." It is simply not what I would do. Since there is no 'what,' there is no 'why.'

You think in terms of combinations; I would say that you are *design-oriented*. I think in terms of estimation; I say that I am *model-oriented*. Both orientations work, and neither has to explain or justify itself to the other. (I work in a group that approaches experiments from both directions.) Nothing you have proposed won't work, as long as the design of the experiment, the execution of the experimental runs, and the data analysis (model) are consistent. I start with the model. You start with the design.

The issue of independence is real. You are proposing to run a repeated measures experiment but analyze it as a completely randomized factorial design. A statistical model has some combination of fixed and random effects that should reflect how the data was generated as well as what you want to know about the response. You are creating a series of whole plots with the two fixed factors in which to observe the third factor. The 3 observations in each plot are correlated this way.
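The within-plot correlation described here is easy to demonstrate by simulation: give each whole plot a shared random effect, and observations inside the same plot become correlated through that shared component. A small sketch with made-up variance components:

```python
import random

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) *
           sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

def simulate_whole_plots(n_plots=200, obs_per_plot=3,
                         sd_plot=1.0, sd_resid=0.5, seed=3):
    """Each whole plot gets a shared random effect u; every observation
    in the plot is u plus its own residual, so two observations from
    the same plot are positively correlated through u."""
    rng = random.Random(seed)
    plots = []
    for _ in range(n_plots):
        u = rng.gauss(0, sd_plot)                 # whole-plot random effect
        plots.append([u + rng.gauss(0, sd_resid)  # shared u => correlation
                      for _ in range(obs_per_plot)])
    first = [p[0] for p in plots]
    second = [p[1] for p in plots]
    return corr(first, second)

# theoretical intraclass correlation for these made-up components:
# sd_plot**2 / (sd_plot**2 + sd_resid**2) = 1.0 / 1.25 = 0.8
r = simulate_whole_plots()
```

A completely randomized analysis would treat those three observations as independent, which is exactly the mismatch between design and model being pointed out above.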

Learn it once, use it forever!