Jul 7, 2016

When two models collide and a third one jibes

In my previous post on Lord’s Paradox (1967), I showed how two models, even when fitted to the same data, can lead to opposite statistical inferences. These models are the analysis of covariance (ANCOVA) and the t-test on a difference score computed from two repeated measures of data. In this post, I elaborate on that previous example by discussing a third model: the Linear Mixed Effects (LME) Model.

Let’s start by recalling what your data in our hypothetical example look like and the key results from the ANCOVA and t-test in JMP (for the full results see my previous post):

Figure 1. Spaghetti plot of two-occasion data and summary of results from previous post.

The ANCOVA model suggests there’s a significant effect of having used a competitor’s product on the second occasion’s customer satisfaction, while controlling for customer satisfaction at time 1 (some might call this an effect of using the competitor’s product on the “residualized” customer satisfaction change). The t-test, on the other hand, suggests there are no differences in how customer satisfaction changed over time between those who did and didn’t use your competitor’s product.

Using Linear Mixed Effects Models

Knowing that different models make different assumptions about the data, you might wonder what would happen if you had fit an LME model to your data. Indeed, many fields recognize LME models as the cutting-edge approach for modeling data with a hierarchical or nested structure (e.g., repeated measures nested within individuals). Here, the LME model can be used to estimate an intercept and slope (i.e., change) for customer satisfaction over time and the effect of your key predictor on them. With luck (remember you only have two occasions of data!), you might even be able to estimate random effects for the intercept and slope too, that is, estimates of how customers deviate from the average trajectory.
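In equation form, a model of this kind is commonly written as below. This is a sketch in assumed notation, not taken from the JMP output: y_ti is customer i's satisfaction at occasion t, Time is coded 0/1, Group is a 0/1 indicator for the key predictor, the u terms are the customer-specific random effects, and e is the residual.

```latex
\begin{aligned}
y_{ti} &= (\beta_0 + u_{0i}) + (\beta_1 + u_{1i})\,\mathrm{Time}_{t}
          + \beta_2\,\mathrm{Group}_i
          + \beta_3\,\mathrm{Group}_i \times \mathrm{Time}_{t} + e_{ti},\\
\begin{pmatrix} u_{0i}\\ u_{1i} \end{pmatrix}
       &\sim N\!\left(\mathbf{0},\
          \begin{pmatrix} \tau_{00} & \tau_{01}\\ \tau_{01} & \tau_{11} \end{pmatrix}\right),
          \qquad e_{ti} \sim N(0, \sigma^2).
\end{aligned}
```

Dropping the two Group terms leaves a model of the satisfaction trajectories alone.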

The original data require some manipulation so they’re ready for an LME model (scroll down to download the data file from this post, which includes scripts for all LME models below). Let’s take a quick look:
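For readers working outside JMP, the manipulation in question is a wide-to-long reshape. The following is a minimal sketch in plain Python; the column names and the two example rows are hypothetical and are not taken from the post's data file.

```python
# Hypothetical wide-format rows: one row per customer, one column per occasion.
wide_rows = [
    {"customer_id": 1, "used_competitor": 1,
     "satisfaction_t1": 3.0, "satisfaction_t2": 4.0},
    {"customer_id": 2, "used_competitor": 0,
     "satisfaction_t1": 5.0, "satisfaction_t2": 4.5},
]


def wide_to_long(rows):
    """Return one row per customer per occasion, with Time coded 0/1."""
    long_rows = []
    for row in rows:
        # enumerate gives time = 0 for the first occasion, 1 for the second
        for time, column in enumerate(("satisfaction_t1", "satisfaction_t2")):
            long_rows.append({
                "customer_id": row["customer_id"],
                "used_competitor": row["used_competitor"],
                "time": time,
                "satisfaction": row[column],
            })
    return long_rows


long_rows = wide_to_long(wide_rows)
for r in long_rows:
    print(r)
```

Each customer now contributes two rows, which is the layout a mixed model expects for repeated measures.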

Figure 2. Simulated data after manipulation into long format.

The data now have one Customer Satisfaction variable and a Time variable that indicates the time of the assessment (0 = time 1 and 1 = time 2). You can use the Fit Model platform in JMP with the Mixed Model personality to fit an “unconditional” model for customer satisfaction (Hint: turn off the option to center polynomials because 0 already represents a meaningful value of our predictors, and get indicator parameterization by selecting this option from the red triangle menu). This is a model focused solely on characterizing the customer satisfaction trajectories over time:


Figure 3. Results from unconditional LME model with covariance of random effects of intercept and time.

Notice there’s no standard error associated with the Residual. This isn’t surprising because we’re asking too much from the data! Our data don’t have enough information to estimate this many random effects. One solution to properly identify the model is to eliminate the covariance between Intercept and Time, which in this case is zero:


Figure 4. Results from unconditional LME model excluding covariance of random effects of intercept and time.
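One way to see why the first model asks too much of two occasions is to count what the data can supply. Assuming the standard random-intercept, random-slope form with Time coded 0/1 and a single residual variance (an assumption about the model form, not something read off the JMP report), the two occasions provide only three variance and covariance values, while the model has four variance parameters. The sketch below uses hypothetical parameter values to show two different parameter sets that imply exactly the same three values. It then shows that fixing the covariance at zero makes the remaining three parameters recoverable.

```python
import math


def implied_moments(tau00, tau01, tau11, sigma2):
    """Variances and covariance of the two occasions implied by the model.

    tau00: random-intercept variance, tau11: random-slope variance,
    tau01: intercept-slope covariance, sigma2: residual variance.
    """
    var_t1 = tau00 + sigma2                        # Time = 0
    var_t2 = tau00 + 2 * tau01 + tau11 + sigma2    # Time = 1
    cov_t1_t2 = tau00 + tau01
    return (var_t1, var_t2, cov_t1_t2)


# Two different (hypothetical) parameter sets ...
set_a = dict(tau00=1.0, tau01=0.0, tau11=0.5, sigma2=0.5)
set_b = dict(tau00=0.8, tau01=0.2, tau11=0.1, sigma2=0.7)

moments_a = implied_moments(**set_a)
moments_b = implied_moments(**set_b)

# ... that the data cannot tell apart: all three implied moments coincide.
same = all(math.isclose(a, b) for a, b in zip(moments_a, moments_b))
print("indistinguishable:", same)


def solve_without_covariance(var_t1, var_t2, cov_t1_t2):
    """With tau01 fixed at 0, three moments determine three parameters."""
    return {
        "tau00": cov_t1_t2,
        "tau11": var_t2 - var_t1,
        "sigma2": var_t1 - cov_t1_t2,
    }


recovered = solve_without_covariance(*moments_a)
print(recovered)
```

Four unknowns cannot be determined from three quantities, so some constraint is needed, and setting the intercept-slope covariance to zero is the one used here.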

This unconditional model gives us useful information about the process under consideration: because Time is coded 0/1, the significant fixed effect of the Intercept suggests customers have an average satisfaction score of 4.04 on the first occasion of measurement that is significantly different from zero (see Figure 5). Importantly, customers deviate significantly from this average value given the significant random effect of Customer ID (this is the variance or random effect for the intercept). Moreover, the non-significant fixed and random effects of Time suggest the average customer satisfaction score doesn’t change on the second occasion of assessment and this is true for all customers. Because this model is just about customer satisfaction, we’re essentially modeling what’s depicted here:


Figure 5. Spaghetti plot of individual customer satisfaction trajectories with grand mean trajectory superimposed.

With a better understanding of the customer satisfaction trajectory, we can go back to the original question posed by your manager: Is there an effect of having used a competitor’s product on how your customers’ satisfaction changed over time? Well, we already know customer satisfaction DIDN’T change over time, so you might opt to stop here, but we might discover something new if we explore the results of a “conditional” LME model, that is, one in which our key predictor of interest is entered as a predictor in the model:


Figure 6. Conditional LME model of customer satisfaction over time and the grouping predictor.

As before, we excluded the covariance between Intercept and Time because two occasions of measurement don’t provide enough information to estimate such an effect. This can have important implications that I’ll elaborate upon in the Key Points below.


Our conditional LME model results shed light on Lord’s Paradox and help us understand the big picture.

Interpreting Results of Conditional LME Model

First, none of the random effects are significant, suggesting that the fixed effects’ estimates characterize our whole population of customers. Second, the Intercept and key grouping predictor are significant. Because Time is coded 0/1, the Intercept, 3.54, represents the average customer satisfaction score at time 1 for those who used the competitor’s product (this is in line with the red line in the spaghetti plot of Figure 1). There’s no effect of Time, so the average 3.54 customer satisfaction score didn’t change from time 1 to time 2 among those who used the competitor’s product. The significant effect of the key predictor (the grouping variable) points to a significant difference of 1 unit in customer satisfaction at time 1 between those who did and didn’t use the competitor’s product. That is, those who didn’t use your competitor’s product had an average customer satisfaction score of about 4.54. Finally, the lack of a significant interaction of the grouping variable with Time indicates that change over time among those who didn’t use the competitor’s product was not detectably different from the (absent) change among those who did.
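The reading above amounts to combining the fixed effects into one predicted mean per group and occasion. Here is a minimal sketch, assuming indicator coding with competitor users as the reference group. The intercept (3.54) and the group difference (1 unit) come from the text; the Time and interaction estimates are not quoted in the post, so they are set to 0 purely as a simplifying assumption that reflects their non-significance.

```python
INTERCEPT = 3.54     # from the post: time 1 mean for competitor users
GROUP_EFFECT = 1.0   # from the post: 1-unit difference at time 1
TIME_EFFECT = 0.0    # ASSUMPTION: non-significant, estimate not quoted
INTERACTION = 0.0    # ASSUMPTION: non-significant, estimate not quoted


def predicted_mean(time, no_competitor):
    """Fixed-effects prediction; both arguments are 0/1 indicators."""
    return (INTERCEPT
            + TIME_EFFECT * time
            + GROUP_EFFECT * no_competitor
            + INTERACTION * time * no_competitor)


cell_means = {}
for no_competitor in (0, 1):
    for time in (0, 1):
        cell_means[(no_competitor, time)] = predicted_mean(time, no_competitor)
        label = "did not use" if no_competitor else "used"
        print(f"{label:>11} competitor, time {time + 1}: "
              f"{cell_means[(no_competitor, time)]:.2f}")
```

Under these assumptions the two groups follow flat, parallel trajectories one unit apart.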


The LME models provide us with a bigger picture of the patterns in our data, but this comes at the cost of increased complexity in modeling and interpretation of the models – certainly when compared to the ANCOVA and t-test! So I’ll leave you with some key points on Lord’s Paradox and the three models we fit to these data. 

Key Points

  • The ANCOVA and t-test models resulted in contradictory inferences. The conflicting conclusions came about because the data are observational (i.e., come from a non-randomized study) and have baseline differences on the predictor of interest (whether customers had used a competitor’s product).
  • Some think of the ANCOVA model as a model of “residualized change.” ANCOVA is appropriate when you’re analyzing data that come from one population or when you don’t have baseline differences on a predictor of interest (this is likely the case in randomized studies).
  • The t-test on a difference score captures within-person change, but fails to include information about the initial levels of your construct (e.g., customer satisfaction).
  • A linear mixed effects model is useful for characterizing a process unfolding over time. Here, it clarified the seemingly contradictory results between the ANCOVA and t-test. But with only two time points, some aspects of the process can be lost, such as the association between the starting point and the ensuing change – this was the covariance between Intercept and Time in the LME models above, which we set to zero. To the degree such covariance is non-zero, estimates in the model can be biased.
  • Relatedly, when interested in how a variable changes over time and the effects of other variables on that change, strive to get at least three occasions of measurement. This will allow you to fit a fuller LME model (one that includes the covariance between Intercept and Time), have a better characterization of change, and estimate the effects of interest.
  • Lord’s Paradox is a complex phenomenon and researchers have differing opinions on how best to handle the potentially contradictory inferences. A good resource to learn more about this topic is van Breukelen (2013).
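The first three key points can be made concrete with a tiny example. The data below are hypothetical toy values, not the data from this post, chosen so that the groups differ at baseline and neither group changes on average. The sketch computes point estimates only, with no significance tests. It relies on the fact that, in an ANCOVA with a common slope, the group coefficient equals the difference in time 2 means minus the pooled within-group slope times the difference in time 1 means.

```python
from statistics import mean

# Hypothetical toy data: satisfaction at time 1 (t1) and time 2 (t2).
used = {"t1": [3.0, 4.0, 5.0], "t2": [3.5, 4.0, 4.5]}
not_used = {"t1": [4.0, 5.0, 6.0], "t2": [4.5, 5.0, 5.5]}


def mean_change(group):
    """Average difference score (t2 - t1) within one group."""
    return mean(b - a for a, b in zip(group["t1"], group["t2"]))


def pooled_within_slope(groups):
    """Common slope of t2 on t1, pooling deviations from each group's means."""
    sxy = sxx = 0.0
    for g in groups:
        mx, my = mean(g["t1"]), mean(g["t2"])
        sxy += sum((x - mx) * (y - my) for x, y in zip(g["t1"], g["t2"]))
        sxx += sum((x - mx) ** 2 for x in g["t1"])
    return sxy / sxx


# Difference-score view: do the groups differ in average change?
change_gap = mean_change(not_used) - mean_change(used)

# ANCOVA view: group difference at time 2, adjusted for time 1.
slope = pooled_within_slope([used, not_used])
ancova_gap = ((mean(not_used["t2"]) - mean(used["t2"]))
              - slope * (mean(not_used["t1"]) - mean(used["t1"])))

print("difference-score group gap:", change_gap)
print("pooled within-group slope: ", slope)
print("ANCOVA-adjusted group gap: ", ancova_gap)
```

With these toy values the difference-score gap is zero while the adjusted gap is not. The within-group slope is below 1, so adjusting for time 1 removes only part of the baseline difference.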


Lord, F. M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304-305.

van Breukelen, G. J. P. (2013). ANCOVA versus CHANGE from baseline in nonrandomized studies: The differences. Multivariate Behavioral Research, 48, 895-922.