QW
Level III

Are fixed blocking variables the same thing as a categorical variable, just that they don't appear in the prediction profiler?

Hello,

 

I have been analyzing data for an experiment measuring growth of human cells with 3 factors: 2 are controllable independent variables, and the 3rd is the donor that the human cells came from (of which there are two). I have n = 8 runs, and the design was a 2^3 factorial. However, I'm currently finding that there are 3 ways to approach analysis of the data with regard to the variable "Donor":

 

1. I can simply include it as a categorical 2 level factor, in which case it enters the model and also shows up in the prediction profiler. As it turns out, there are interactions between "Donor" and the other two factors, which are conclusions that can be easily visualized as well. However, having this in the prediction profiler isn't particularly helpful because donor variability is something I will always have to deal with - I can't just keep 'maximizing' by going back to that same donor over and over.

 

2. I can change the 'Design Role' to a blocking variable. However I notice that when I fit a model, it basically results in the same model, with the exception that "Donor" no longer shows up in the Prediction Profiler. This is nice because I can elucidate what the other 2 factors (which were actual, controllable independent variables that I'd like to manipulate) were contributing. I read in an earlier post that blocking variables also can only show up in main effects, but this is not true since I am able to still model interactions. At this moment, I consider this the best approach.

 

3. Given that the 2 donors I chose are a subset of the infinitely large population of people present and future, I would probably consider this blocking variable to actually be "Random" and not "Fixed". However, based on some reading it seems like you do need around 5+ levels of the blocking variable for the estimation of variance to be accurate. As such, given that my dataset is small (n = 8, only 2 levels of the blocking variable), this may not be ideal.
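
To make approaches 1 and 2 concrete, here is a minimal sketch in Python (not JSL) under some loud assumptions: the response values are made up, the factor names A and B are placeholders for the two controllable factors, and everything is coded -1/+1. It fits Donor once as a main-effect-only block and once with Donor interactions, then averages the prediction over both donors so that only the controllable factors remain. It is only meant to illustrate the arithmetic, not to reproduce what the software does internally.

```python
import itertools
import numpy as np

# Hypothetical 2^3 layout: controllable factors A and B plus Donor, coded -1/+1.
runs = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))
A, B, donor = runs.T

# Made-up responses, for illustration only (not data from the experiment).
y = np.array([10.0, 14.0, 12.0, 15.0, 18.0, 19.0, 25.0, 23.0])

def fit(*columns):
    """Ordinary least squares with an intercept; returns (coefficients, fitted values)."""
    X = np.column_stack([np.ones_like(y), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta

# Donor as a main-effect-only block versus Donor with interactions.
beta_block, fitted_block = fit(A, B, A * B, donor)
beta_inter, fitted_inter = fit(A, B, A * B, donor, donor * A, donor * B)

# Residual degrees of freedom: 8 runs minus the number of estimated terms.
df_block = len(y) - len(beta_block)
df_inter = len(y) - len(beta_inter)

def predict_avg_over_donors(beta, a, b):
    """Average of the predictions at donor = -1 and donor = +1.

    With -1/+1 coding every term containing donor cancels in the average,
    leaving the intercept, A, B and A*B terms.
    """
    return beta[0] + beta[1] * a + beta[2] * b + beta[3] * a * b

print("A, B, A*B estimates (block only):       ", beta_block[1:4])
print("A, B, A*B estimates (with interactions):", beta_inter[1:4])
print("residual df:", df_block, "vs", df_inter)
print("donor-averaged prediction at A=+1, B=+1:",
      predict_avg_over_donors(beta_inter, 1.0, 1.0))
```

Because the full factorial is balanced and the coded columns are orthogonal, the A, B and A*B estimates come out the same in both fits. What changes is the residual degrees of freedom: with only 8 runs, adding the two Donor interactions leaves a single degree of freedom for error.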

 

Is my thinking on the right path?

 

16 REPLIES

Re: Are fixed blocking variables the same thing as a categorical variable, just that they don't appear in the prediction profiler?

I agree with you. I am now confused by your example of noise factors. I do not understand what noise factors have to do with blocking.

 

No, I am suggesting that there is another factor with a fixed effect that is not a simple block, like Lot or Day. It might not have been identified, but it is an uncontrolled lurking variable that varies with the block changes. Blocking accounts for simple mean shifts. An interaction indicates that another factor is active. That is all that I meant.

statman
Super User

Re: Are fixed blocking variables the same thing as a categorical variable, just that they don't appear in the prediction profiler?

Mark,

Hmmm I am really confused by your statement "I do not understand what noise factors have to do with blocking".  Blocking is done so that within each block, the noise is constant, thus increasing precision.  The noise, that was held constant within the block, is purposefully changed between blocks to increase the inference space.

Perhaps it is the definition of noise.  I define noise as those factors/variables that you are not willing to manage (future tense).  What I mean by manage is to control, or set levels for. The reason you might not be willing to manage factors is:

1. You can't, or don't have the technology

2. It is too costly

3. It is inconvenient

I spend most of my time in product development (R&D).  Much of that time is spent understanding how the design factors will affect product performance, and we are really trying to understand causal structure. There are decisions that must be made as to which factors can be managed and which ones cannot. Subsequently we get data to support or invalidate those decisions.  The engineer must consider how effective the design factors are at impacting product performance in the hands of the customer.  The engineer must consider sources of variation: materials, manufacturing/assembly, test, distribution, storage, use, etc.

Historically, product development is done with an extremely small number of samples from an extremely small inference space. Depending on product complexity and cost, they might get only a few "beta" units to determine how well their design will perform IN THE FUTURE.  Perhaps because of this, they choose to test product performance on this small scale while holding the noise as constant as possible (thereby increasing the precision of their experiments).  The "beta" units come from 1 lot of raw material, 1 manufacturing line (which is likely not a production line), and 1 person doing assembly; they are tested over a short time period where ambient conditions don't change much, with 1 measurement system, and the variations in customer use are all held constant (using the ink example earlier...hold pressure, angle, substrate and environmental conditions CONSTANT).  Then they extrapolate these results into the future and claim performance and reliability statistics as well.  You probably know this!

The question is, how do we test a small sample of beta units with increased precision and increased inference space?  Thank goodness for DOE! The engineers/scientists need strategies to handle noise in these situations.  The design structure is only part of the equation.  What percentage of all of the factors in a process are typically experimented on? How are the "other" factors handled? Blocking is one of those strategies (though there are of course others).  While blocking confounds many noise factors, it does provide the opportunity to answer the question: If the noise changes in the future, which invariably it will since it is by definition not being managed, will the product still perform as intended?  If the effect of the design factors an engineer is choosing to improve product performance DEPENDS on noise (this is the definition of an interaction), then how will the design engineer specify the setting for the design factor?  The earlier this is discovered, the more options you have and the more cost effectively it can be fixed or mitigated.  When design factors have the same effect on product performance over changing noise, you have a robust product design.

"All models are wrong, some are useful" G.E.P. Box

Re: Are fixed blocking variables the same thing as a categorical variable, just that they don't appear in the prediction profiler?

Again, I agree entirely with you. I didn't understand how the example of developing a ballpoint pen illustrated blocking. All the factors in that example, such as the angle, were controlled and none of them defined blocks. They were all part of the treatment in the laboratory experiment.

 

I am sorry for the confusion. We really do agree!

statman
Super User

Re: Are fixed blocking variables the same thing as a categorical variable, just that they don't appear in the prediction profiler?

Mark,

I'm obviously not communicating well.  We almost always agree, so let me try again.

Let me clarify the example.  We used blocks to manage the noise for the experiment.  In the first block we used one angle, one pressure, one substrate type and one set of ambient conditions (essentially the low levels for these).  In the second block we changed all of those to another "level" (the high level).  So those noise variables were held constant within the block, increasing precision, and then were confounded with the block to increase inference space and test product robustness.  In this case, treating the block as a fixed effect allows us to estimate whether there are any interactions between the noise variables (angle, pressure, substrate and ambient) and design factors (ink formulations, design of the pen tip, etc.).  And yes there are!

With this approach, you can study the design factor effects over changing noise to either:

  • gain confidence the results you have initially obtained are repeatable over changing noise,
  • estimate the robustness of the factors to noise (quantified by block-by-factor interactions), or
  • determine if the direction of your study needs to change (large block effects should be disaggregated to identify potential factors for further study).
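
As a rough illustration of the second bullet, the sketch below computes a block-by-factor interaction for a single two-level design factor run in two noise blocks. The numbers are invented for illustration and the names `factor`, `block` and `effect` are hypothetical; the only technique used is the standard two-level contrast (mean at the high level minus mean at the low level).

```python
import numpy as np

# Hypothetical example: one design factor (coded -1/+1) run in two noise blocks.
# Responses are made up for illustration only.
factor = np.array([-1, 1, -1, 1, -1, 1, -1, 1], dtype=float)
block = np.array([-1, -1, -1, -1, 1, 1, 1, 1], dtype=float)
y = np.array([20.0, 30.0, 22.0, 32.0, 25.0, 27.0, 23.0, 29.0])

def effect(x, resp):
    """Effect of a two-level factor: mean response at +1 minus mean at -1."""
    return resp[x == 1].mean() - resp[x == -1].mean()

# Effect of the design factor within each noise block.
effect_low = effect(factor[block == -1], y[block == -1])
effect_high = effect(factor[block == 1], y[block == 1])

# Block-by-factor interaction: half the change in the factor effect across blocks.
# This equals the contrast on the product column factor * block.
interaction = (effect_high - effect_low) / 2

print("factor effect in low-noise block: ", effect_low)
print("factor effect in high-noise block:", effect_high)
print("block-by-factor interaction:      ", interaction)
```

An interaction near zero would suggest the design factor behaves the same way across the noise conditions bundled into the blocks; a large one suggests its effect depends on something that changed between blocks.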

I have done this with many products with great success.

 

"All models are wrong, some are useful" G.E.P. Box

Re: Are fixed blocking variables the same thing as a categorical variable, just that they don't appear in the prediction profiler?

Thanks for the further explanation!

 

So the noise factors were treated like hard-to-change factors (not randomized), but they defined a block for analysis. It reminds me of Taguchi's designs with inner arrays replicated over runs in the outer array.

statman
Super User

Re: Are fixed blocking variables the same thing as a categorical variable, just that they don't appear in the prediction profiler?

Yes, but split-plots are not created from noise factors and are not analyzed the same way (that is, you do not analyze or interpret the results for split-plot designs the same way as for blocked designs).

Actually, it was Cox in the 1950s who first had the idea of experimenting on noise variables for each treatment of the design factor experiment.  He called these cross-product arrays (similar to the Taguchi inner and outer arrays, but with analysis more akin to split-plot designs vs. aggregating the data to get one signal-to-noise ratio).

 

"All models are wrong, some are useful" G.E.P. Box

Re: Are fixed blocking variables the same thing as a categorical variable, just that they don't appear in the prediction profiler?

Yup!