Continuous Vs Discrete Numeric Factor

Gil · Feb 4, 2019 05:10 PM

When looking for the maximal response to a factor over a certain range (for example, total DNA from 100 µg to 400 µg), which approach is more accurate and efficient (higher power, fewer runs)?

A. Assign the factor as a continuous factor with a second-order term (X1*X1)
B. Assign the factor as discrete numeric at 3 levels (100, 250, 400)

In the past I always ran option A, but during design evaluation the power I get for X1*X1 is pretty low (~0.4) compared to the main effects and two-factor interactions (above 0.8), so I'm hesitant to choose this option. Am I wrong in this notion?

Thanks for your help,

Mark_Bailey · Feb 5, 2019 06:49 AM

It won't make much difference. You can specify the model and let JMP pick the levels, or you can specify the levels and let JMP pick the terms. The continuous factor gives the custom designer more freedom (no restriction on levels), so it will likely be better, but not by much. The discrete numeric factor fixes the levels, and the additional terms necessary for the optimization of the design have their estimability set to 'If Possible,' so they count differently in the power calculations. I suggest that you try it for yourself. Design the experiment both ways and examine the evaluation information. Just be sure that it is a fair comparison (number of runs, optimality criterion, model terms and estimability, and so on).

The power for the terms that are powers (such as X1*X1) is always much lower than the power for the first-order terms and two-factor interactions. Achieving high power for such terms might take so many runs that the experiment is no longer economical.

Gil · Feb 5, 2019 09:04 AM

Thanks Mark,

Just to better understand: if the power of the term X1*X1 is low, does that mean that JMP most likely would not fit the curve accurately when I use the profiler? For example, one continuous factor at 15 runs gives a power of 0.827 for the main effect and 0.39 for the X1*X1 term. Say I analyze the results with Fit Model and the profiler, ask JMP to maximize the response, and get an answer of 300. Can I trust it (or any other result between the tested points 100, 250, and 400)?

Discrete numeric at 3 or 4 levels and 15 runs gives a power of 0.946. Is that a much better option?

Mark_Bailey · Feb 5, 2019 12:29 PM

First of all, the design of the experiment (the set of treatments) does not affect the accuracy of the estimates. That is, the bias of the estimates (parameter or response) does not depend on the design. The bias of the estimates depends on the form of the model.

Secondly, the design does affect the variance of the estimates. This effect, in turn, affects the power. As the variance of the estimates increases, the power decreases.
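As a rough illustration of how the design drives the variance of the estimates, the sketch below computes the diagonal of (X'X)^-1, which gives each coefficient's variance in units of the response variance. The 15-run layout (five runs at each of three coded levels) and the helper names `model_row`, `invert`, and `relative_variances` are assumptions made for this example, not output from JMP.

```python
# Relative variance of each coefficient estimate = diagonal of (X'X)^-1,
# expressed in units of the response variance sigma^2.

def model_row(x):
    """Model terms for one run: intercept, X1, X1*X1 (hypothetical helper)."""
    return [1.0, x, x * x]

def invert(matrix):
    """Invert a small, full-rank square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def relative_variances(levels):
    """Return diag((X'X)^-1) for the quadratic model at the given coded levels."""
    rows = [model_row(x) for x in levels]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    inverse = invert(xtx)
    return [inverse[i][i] for i in range(k)]

# Assumed design: 15 runs, five at each coded level (-1, 0, +1),
# standing in for 100, 250, and 400 ug of total DNA.
design = [-1.0] * 5 + [0.0] * 5 + [1.0] * 5
intercept_var, linear_var, quadratic_var = relative_variances(design)
print(linear_var, quadratic_var)  # approximately 0.1 and 0.3
```

For this assumed layout the X1*X1 coefficient has three times the variance of the X1 coefficient, which is one way to see why its power comes out lower for the same number of runs.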

Thirdly, power is not related to the bias of the model. Power is defined as the probability of not making a type II error. A type II error is only possible if the alternative hypothesis (the parameter is not zero) is true. Power depends on the effect size, the response variance, the sample size (DF really), and the significance level. Low power means that there is a small chance of finding a real effect (non-zero) to be statistically significant.
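The dependence of power on those four inputs can be sketched with a normal approximation. This is not the calculation JMP performs: it ignores the error degrees of freedom, so it overstates power for small experiments. The function name `approx_power`, the anticipated coefficient of 1, the RMSE of 1, and the relative variances 0.1 and 0.3 (the linear and quadratic terms of a 15-run design with five runs at each of three coded levels) are all assumptions for illustration.

```python
from statistics import NormalDist

def approx_power(effect, relative_variance, sigma=1.0, alpha=0.05):
    """Normal-approximation power of a two-sided test that one coefficient is zero.

    effect            -- anticipated size of the coefficient (assumed)
    relative_variance -- variance of the estimate in units of sigma^2
    sigma             -- response standard deviation (RMSE, assumed)
    alpha             -- significance level
    """
    std_normal = NormalDist()
    standard_error = sigma * relative_variance ** 0.5
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect) / standard_error
    # Probability that the test statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Same anticipated coefficient and noise; only the variance of the estimate differs.
linear_power = approx_power(effect=1.0, relative_variance=0.1)
quadratic_power = approx_power(effect=1.0, relative_variance=0.3)
print(round(linear_power, 2), round(quadratic_power, 2))
```

Under these assumptions the approximation gives about 0.89 for the linear term and about 0.45 for the quadratic term. The exact values differ from the 0.827 and 0.39 reported in the thread, but the ordering and the size of the gap are similar. Raising `sigma`, shrinking `effect`, or lowering `alpha` reduces both numbers.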

I would not trust a model until, after it has been fit, it has been independently validated with new observations. I recommend testing two conditions: the condition that is predicted to deliver the desired outcome and another condition that is predicted to deliver an awful outcome. If both predictions hold up, you can trust the model to predict outcomes well in general.

As a side note, be careful when comparing designs. I did not see a big difference in the power using continuous factors versus discrete numeric factors when I was careful to override automatic actions by JMP that changed the number of runs, the model terms, and so forth. I am not saying that the power you observed is wrong. I am just saying that it is easy to get power values that are not comparable.
