I'm trying to work out the best method for studying how multiple factors affect final performance.
Say an object goes through several consecutive process steps A, B, C, D and is then measured for its final performance using metrics X, Y, Z. So A, B, C, D are independent variables and X, Y, Z are responses. Each step has a few different process conditions: A1, A2, A3, B1, B2, and so on. The final experimental results table will contain objects produced under a combination of process conditions; for example, one object goes through A1-B2-C1-D3, another sample through A2-B3-C4-D1, etc.
My goal is to find out which process steps cause significant changes in the responses.
Should I use MANOVA? I saw one example of using MANOVA,
but in that example there is only one variable (treatment), and the analysis looks at how the responses change across treatments.
In my case, the responses are affected by multiple variables stacked on top of each other. How would I use MANOVA here, or is another method better suited? And how many data points per combination are required to have enough degrees of freedom for the calculation?
Thanks so much for any help! I use JMP8.
This is a difficult question to answer without more information. Is the design completely factorial, such that all levels of A, B, C, and D are observed in the study? Are the responses continuous variables that are at least something like normally distributed? Do you have a sufficient number of observations? In particular, suppose you have 4 levels of each of the design variables, thus giving 256 possible combinations. Do you have roughly four observations for each combination? If any of these assumptions are not quite met, MANOVA might be in trouble.
If you do not have all 256 combinations (under the assumption of four levels for each of the design variables), can you recode and possibly use what is in the example? That is: Treatment1 is A1-B1-C1-D1, and so on until you exhaust the existing combinations. This might work better for MANOVA. The difficulty will then be in constructing appropriate contrasts to look at the effects of the design variables.
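To make the recoding idea concrete, here is a small Python sketch: each observed combination of process conditions is collapsed into a single "treatment" label, which is the layout a one-way MANOVA example expects. The level names and observed rows below are hypothetical, not from this thread.

```python
# Sketch: recode each combination of process conditions into a single
# "treatment" label so a one-way MANOVA layout can be used.
# Levels and observed rows are hypothetical examples.
from itertools import product

# Hypothetical levels for each process step
levels = {"A": ["A1", "A2"], "B": ["B1", "B2", "B3"],
          "C": ["C1", "C2"], "D": ["D1", "D2"]}

# All possible combinations (a full factorial would need every one observed)
all_combos = ["-".join(combo) for combo in product(*levels.values())]
print(len(all_combos))  # 2*3*2*2 = 24 combinations

# Recode observed sample rows into single treatment labels
observed = [("A1", "B2", "C1", "D2"), ("A2", "B3", "C2", "D1")]
treatments = ["-".join(row) for row in observed]
print(treatments)  # ['A1-B2-C1-D2', 'A2-B3-C2-D1']
```

If many combinations are missing, the recoded factor simply has fewer levels, but any contrasts among the design variables then have to be built by hand from those labels.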
If you were using SAS, I would consider looking at PROC PLS. I don't know what the equivalent method in JMP would be.
JMP can do PLS as well - it's under Analyze/Multivariate Methods.
Thanks a lot for the replies. That's very helpful in clearing my thoughts.
The variables A, B, C, ... are attributes of the history that the samples go through. Here is one detailed example.
A group of 108 samples in total has the following process/history/conditions:
A) Two thicknesses, A1 and A2: 78 samples at thickness A1 and 30 samples at thickness A2. So thickness is one variable.
B) All samples go through one process in 18 batches, B1 through B18, with 6 samples per batch. B1-B18 are all independent runs of this process, so batch-to-batch variation is one variable.
C) After step B), all samples go through another process with 3x higher throughput, i.e., one batch can accommodate 18 samples, so there are 6 batches in total: C1, C2, ... C6. Batch-to-batch variation here is another variable.
After that, the final performance X is measured for each sample. The goal is to find the major contributor to the variation seen in X: is it mostly the thickness difference A, the batch-to-batch variation in process B, or that in process C?
I only have JMP8. I'd really appreciate any suggestions on how to use JMP to understand the above issue. Thanks so much!
Still somewhat confused by the B and C situation, and concerned about the imbalance in passing the two thicknesses to the B process. I want to address the latter first. How are the samples assigned to batches? 18 does not divide evenly into either 78 or 30, so the analysis will confound thickness effects with process B batch effects. If a quasi-balanced design is used, 6 of the batches will have only a single A2 sample and 12 will have two. I would be afraid that having different numbers of A1 and A2 would greatly increase the batch-to-batch variability.
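A quick Python sketch of that quasi-balanced allocation, using the counts from the thread (78 A1 and 30 A2 samples, 18 batches of 6), shows how the A1/A2 mix is forced to vary from batch to batch:

```python
# Spread the 30 A2 samples as evenly as possible across 18 batches of 6
# and tally the resulting batch compositions (counts from the thread).
n_batches, batch_size = 18, 6
n_a2 = 30

base, extra = divmod(n_a2, n_batches)  # base=1 per batch, 12 batches get one extra
a2_per_batch = [base + 1] * extra + [base] * (n_batches - extra)
a1_per_batch = [batch_size - a2 for a2 in a2_per_batch]

print(sum(a1_per_batch), sum(a2_per_batch))  # 78 30
# 12 batches carry two A2 samples, 6 carry only one
print(a2_per_batch.count(2), a2_per_batch.count(1))  # 12 6
```

So no matter how the assignment is done, batches cannot all have the same thickness mix, which is exactly the confounding worry above.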
Then the 6 samples in B1 are processed through the C process in 6 separate batches. Is there a constant ratio of 13 A1 to 5 A2 samples in each batch? How are samples aggregated into C batches?
I read this as a mixed model, with a single fixed effect (A) and four random effects of interest (B, C, A*B, and A*C; the three-way interaction is essentially noise). I would model each response variable separately, rather than as a MANOVA, although you could treat the responses X, Y, and Z as repeated measures on the batch. I do not have enough JMP experience to know whether this is possible there, but in SAS it could be done with appropriate RANDOM statements in PROC GLIMMIX.
I hope the first part of this post is of use to you in thinking about this design.
Thanks so much for your time and help!
Please let me clarify:
A) Total 108 samples of two thicknesses: 78 of A1 and 30 of A2.
B) These samples went into process B (a thermal treatment in an oven). I’d love to have run all samples through it together, but due to the oven space limitation only six samples can be accommodated at a time, so this step ended up with 13 batches for thickness A1 (B1, B2, B3, ... B13) and 5 batches for thickness A2 (B14, B15, ... B18); total 13*6 + 5*6 = 108. This step introduced batch-to-batch variation on top of the thickness difference.
C) Then all these samples went into process C (a pressure step in a chamber). This chamber is bigger than the oven used for process B, so 18 samples can be processed per run, for a total of 6 runs: C1, C2, ... C6 (18*6 = 108). Each C batch contained samples from three B batches. Samples of the same thickness were grouped together as much as possible, except for the final batch C6, which contained the leftover A1 batch from process B (B13) and the leftover A2 batches (B17, B18).
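The layout described above can be written down as a small Python table and checked: B1-B13 are thickness A1, B14-B18 are A2, and each C run takes three whole B batches, with C6 collecting the leftovers.

```python
# Build the sample table described in the thread and verify the counts:
# 18 B batches of 6 samples, grouped three-at-a-time into 6 C batches of 18.
b_batches = {f"B{i}": ("A1" if i <= 13 else "A2") for i in range(1, 19)}

c_batches = {
    "C1": ["B1", "B2", "B3"],    "C2": ["B4", "B5", "B6"],
    "C3": ["B7", "B8", "B9"],    "C4": ["B10", "B11", "B12"],
    "C5": ["B14", "B15", "B16"], "C6": ["B13", "B17", "B18"],
}

# Each sample becomes a (thickness, B batch, C batch) triple
samples = [(b_batches[b], b, c)
           for c, bs in c_batches.items() for b in bs for _ in range(6)]
print(len(samples))  # 108 samples in total

# C1-C4 are pure A1, C5 is pure A2, and only C6 mixes the thicknesses
for c in ["C1", "C5", "C6"]:
    print(c, sorted({t for t, _, cc in samples if cc == c}))
```

Note that in this layout each B batch falls entirely inside one C batch, so B is nested within C rather than crossed with it, and thickness is confounded with both batch factors except through C6.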
After that, all samples were measured for parameter X. I’d like to find out how thickness, batch-to-batch variation in process B, and batch-to-batch variation in process C affect X.
Or, to further simplify and narrow things down: since C1-C4 contain only thickness A1, the question could be restricted to how much of the variation seen in X comes from batch-to-batch variation in process B versus process C at thickness A1. Maybe a two-way ANOVA? But I’d also like to learn how to use multi-way (>3 factor) ANOVA, because I have other process problems with more factors.
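One thing worth noting about the restricted C1-C4 sub-design: each B batch sits entirely inside one C batch, so B is nested within C, and a nested (hierarchical) ANOVA, not a crossed two-way ANOVA, is the natural way to split the variation in X. A minimal pure-Python sketch with simulated data (the true variance values 0.5, 0.2, 1.0 are invented for illustration) shows the method-of-moments variance component estimates:

```python
# Nested ANOVA sketch for the C1-C4 sub-design (thickness A1 only):
# 4 C batches, 3 B batches nested in each, 6 samples per B batch.
# Data are simulated; the component sizes below are made up.
import random

random.seed(0)
n_c, n_b, n = 4, 3, 6
x = {}
for c in range(n_c):
    c_eff = random.gauss(0, 0.5)      # C batch effect
    for b in range(n_b):
        b_eff = random.gauss(0, 0.2)  # B-within-C batch effect
        x[c, b] = [10 + c_eff + b_eff + random.gauss(0, 1.0) for _ in range(n)]

grand = sum(v for obs in x.values() for v in obs) / (n_c * n_b * n)
c_mean = {c: sum(v for b in range(n_b) for v in x[c, b]) / (n_b * n)
          for c in range(n_c)}
b_mean = {(c, b): sum(x[c, b]) / n for c in range(n_c) for b in range(n_b)}

# Sums of squares for the nested decomposition
ss_c = n_b * n * sum((c_mean[c] - grand) ** 2 for c in range(n_c))
ss_b = n * sum((b_mean[c, b] - c_mean[c]) ** 2
               for c in range(n_c) for b in range(n_b))
ss_w = sum((v - b_mean[cb]) ** 2 for cb, obs in x.items() for v in obs)

ms_c = ss_c / (n_c - 1)               # df = 3
ms_b = ss_b / (n_c * (n_b - 1))       # df = 8
ms_w = ss_w / (n_c * n_b * (n - 1))   # df = 60

# Method-of-moments estimates from expected mean squares
# (these can come out negative by chance in small designs)
var_e = ms_w
var_b = (ms_b - ms_w) / n
var_c = (ms_c - ms_b) / (n_b * n)
print(f"sigma2_C={var_c:.3f}  sigma2_B(C)={var_b:.3f}  sigma2_e={var_e:.3f}")
```

JMP's variance component / REML fitting should give the same kind of answer more robustly; this sketch is only to show what "how much variation comes from B versus C" means in terms of the nested layout.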
Thanks so much and look forward to hearing from you or any experts!
Because of the confounding issue, the only thing I can think of is a mixed model approach; I really can't see any other way to pull out what you want. In any case, here are some more thoughts. I really do think Process B and Process C are random effects--for instance, for Process C runs 1 through 4, is there any reason to believe that C1 has a different mean from C2? I think the different instances of the process would only add variability, rather than some fixed amount. The same applies to process B, and here the thicknesses are completely separated across batches, so the variability added may differ between the two thicknesses.
I do not know how to do this in JMP, but I do know that there is a mixed model estimation procedure available. From picklists, I would include A as a fixed effect, and B, C, and the interaction of A and B as random effects. It may also be possible to include the interaction of A and C, but C6 complicates matters somewhat. With some simulated data, I was able to do the analysis in SAS, but the interaction random effects are just a problem. You can't delete C6 from the analysis, as then you only have one batch (C5) with A2. The best I was able to do was
proc glimmix data=one;
   class a b c;
   model x = a;                 /* assumes the response column is named x */
   random b / solution group=a;
run;
I hope there is an equivalent in JMP--perhaps others can "jump" in and help.