Repeat anything enough times and it becomes meaningless. The repeated use of the word “tomorrow” in the first line foreshadows the conclusion of the famous soliloquy in which Macbeth, upon learning of the death of his wife, searches the entirety of the temporal universe (from “all our yesterdays” right through to “the last syllable of recorded time”) for life’s significance and finds nothing. Years later, Andy Warhol echoed this theme by demonstrating that any image, however familiar or iconic, can likewise be made meaningless through endless repetition.
Many people struggle with the same phenomenon in their data. The effect of a treatment can be tested easily by taking a single set of measurements and analyzing the results using a one-way ANOVA to evaluate the significance of the effect between groups. The problem occurs when these measurements are repeated, complicating the analysis (think drug trials, where subjects are tested daily throughout the length of the trial, or the performance over time of products produced on different machines).
There is a great post on this topic in the JMP support notes titled "Analyzing Repeated Measures Data in JMP Software," in which three approaches are discussed. The purpose of this post is to demystify repeated measures using the univariate split-plot approach.
The example below (Figure 1) consists of three parts per machine, each measured using one of two machines by four operators who record the result in Y. In this case, there is no time variable at all; the repeated measures component comes from each part being measured a total of four times, once by each operator. Here, we’re trying to answer the question of whether Operator or Machine, or both, have a significant effect on the value of Y.
Figure 1: Table of repeated measures data; three parts made on two machines (6 parts total), each measured repeatedly by four operators.
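To make the shape of that table concrete, here is a minimal Python sketch of the layout. The machine, part, and operator labels are placeholders chosen for illustration, not the values in Figure 1, and Y is left empty because the actual measurements are not reproduced here.

```python
from collections import Counter
from itertools import product

# Hypothetical labels; the real table may name these levels differently.
machines = ["A", "B"]
part_labels = [1, 2, 3]
operators = ["Op1", "Op2", "Op3", "Op4"]

# One row per measurement: every part on every machine is measured by every operator.
rows = [
    {"Machine": m, "Part": p, "Operator": o, "Y": None}  # Y would hold the measurement
    for m, p, o in product(machines, part_labels, operators)
]

# Count how many times each physical part (machine + label) was measured.
measurements_per_part = Counter((r["Machine"], r["Part"]) for r in rows)

print(len(rows))                          # 24 rows in total
print(len(measurements_per_part))         # 6 physical parts
print(measurements_per_part[("A", 1)])    # 4 measurements of each part
```

The four rows that share a machine and part label are the repeated measures: they come from the same physical part, so they should not be treated as independent observations.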
We can use Standard Least Squares to answer these questions. The way to do that is to construct a model that includes Machine, Operator, and the interaction of Machine and Operator. We can also include Part in the model to ensure any variation due to differences between the parts is accounted for and not absorbed into the other effects. Setting it as a random effect ensures the variation due to Part is not treated as specific to these parts in particular but as representative of all parts in general (see Figure 2). In this way, Part defines the batches (whole plots) of a split-plot design, with Machine taking the role of the hard-to-change variable.
Figure 2: Fit Model dialog window.
However, this misses a crucial element of the story: The three parts measured using Machine A are not the same parts measured using Machine B. The part numbers simply give the order in which the parts were measured on each machine (the first three parts on A, and the first three parts on B), so part 1 on Machine A has nothing in common with part 1 on Machine B. In order to make this clear, it is necessary to nest Part in Machine, as in Figure 3. Setting up the effect in this way ensures the software knows that “parts are different when machine is different.”
Figure 3: Fit Model dialog window, with the random variable Part nested in Machine.
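The difference between the crossed and nested readings of Part can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, with hypothetical labels; it is not what JMP does internally.

```python
# Hypothetical labels for illustration.
machines = ["A", "B"]
part_labels = [1, 2, 3]

# Crossed reading: the label alone identifies a part, so part 1 on A and
# part 1 on B are (wrongly) pooled into one level.
crossed_levels = set(part_labels)

# Nested reading: a part is identified by its machine AND its label.
nested_levels = {(m, p) for m in machines for p in part_labels}

# Degrees of freedom for Part nested in Machine: (parts per machine - 1) per machine.
df_part_within_machine = len(machines) * (len(part_labels) - 1)

print(len(crossed_levels))       # 3
print(len(nested_levels))        # 6
print(df_part_within_machine)    # 4
```

With nesting, the model sees six distinct parts, and the part-to-part variation within each machine becomes the yardstick against which the Machine effect is judged.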
The results from the Effect Summary (Figure 4) indicate that the measurements are significantly different for the different operators, as well as for the different machines. However, it doesn’t appear to matter which operator uses which machine, as the p-value for the interaction of the two is above the threshold for significance (although we might have them test a few more parts to be sure).
Figure 4: The effects of Operator and Machine on Y are significant (PValue below 0.05), but the interaction between the two is not.
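For readers who want to see where such tests come from, the sketch below works through the classical split-plot sums of squares in plain Python. It rests on several assumptions: the data are simulated (the effect sizes and noise levels are made up and are not the values behind Figure 4), the design is balanced, and the F ratios shown are the textbook split-plot ones, which a mixed model with Part nested in Machine as a random effect should generally reproduce for balanced data when the estimated part variance is positive. JMP's own output may differ in detail.

```python
import random
from statistics import mean

A, B, C = 2, 3, 4  # machines, parts per machine, operators

# Simulated data with made-up effects (NOT the data from Figure 1).
rng = random.Random(1)
machine_eff = [0.0, 2.0]
operator_eff = [0.0, 0.5, 1.0, 1.5]
y = []  # y[i][j][k]: machine i, part j within machine i, operator k
for i in range(A):
    parts = []
    for j in range(B):
        part_eff = rng.gauss(0, 1.0)  # random part-to-part variation
        parts.append([10 + machine_eff[i] + operator_eff[k] + part_eff
                      + rng.gauss(0, 0.3) for k in range(C)])
    y.append(parts)

# Cell and marginal means.
grand = mean(v for m in y for p in m for v in p)
m_mean = [mean(v for p in y[i] for v in p) for i in range(A)]
p_mean = [[mean(y[i][j]) for j in range(B)] for i in range(A)]
o_mean = [mean(y[i][j][k] for i in range(A) for j in range(B)) for k in range(C)]
mo_mean = [[mean(y[i][j][k] for j in range(B)) for k in range(C)] for i in range(A)]

idx = [(i, j, k) for i in range(A) for j in range(B) for k in range(C)]

# Sums of squares for each term of the split-plot decomposition.
ss_machine = B * C * sum((m_mean[i] - grand) ** 2 for i in range(A))
ss_part = C * sum((p_mean[i][j] - m_mean[i]) ** 2
                  for i in range(A) for j in range(B))
ss_oper = A * B * sum((o_mean[k] - grand) ** 2 for k in range(C))
ss_mo = B * sum((mo_mean[i][k] - m_mean[i] - o_mean[k] + grand) ** 2
                for i in range(A) for k in range(C))
ss_resid = sum((y[i][j][k] - p_mean[i][j] - mo_mean[i][k] + m_mean[i]) ** 2
               for i, j, k in idx)
ss_total = sum((y[i][j][k] - grand) ** 2 for i, j, k in idx)

# Degrees of freedom.
df_machine, df_part = A - 1, A * (B - 1)
df_oper, df_mo = C - 1, (A - 1) * (C - 1)
df_resid = A * (B - 1) * (C - 1)

# Machine is tested against Part-within-Machine; the rest against the residual.
f_machine = (ss_machine / df_machine) / (ss_part / df_part)
f_operator = (ss_oper / df_oper) / (ss_resid / df_resid)
f_interaction = (ss_mo / df_mo) / (ss_resid / df_resid)

print(f"Machine:          F({df_machine}, {df_part}) = {f_machine:.2f}")
print(f"Operator:         F({df_oper}, {df_resid}) = {f_operator:.2f}")
print(f"Machine*Operator: F({df_mo}, {df_resid}) = {f_interaction:.2f}")
```

The key design choice is the denominator for Machine: because each part is measured on only one machine, the Machine effect is compared with the variation between parts, not with the measurement-to-measurement residual. Converting these F ratios to p-values requires the F distribution, which is left out here to keep the sketch dependency-free.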
By analyzing your repeated measures data in JMP, you avoid the phenomenon described above and ensure your data is not meaningless.