
JMPer Cable

A technical blog for JMP users of all levels, full of how-to's, tips and tricks, and detailed information on JMP features
LauraCS
Staff
Phantom Variables in the Structural Equation Models Platform

I know, it’s not Halloween season, but phantom variables are important no matter the time of year! Let’s review how to specify and use phantom variables in the Structural Equation Models (SEM) platform of JMP Pro.

Phantom variables in SEM are most commonly used in longitudinal analysis. Imagine having three repeated measures collected at different time intervals. Perhaps we administered a multiple-choice test in a given year (let’s call it Year 1), then we administered it again in Year 2, and then again in Year 4. Notice that the spacing between assessments is unequal.

An analyst might be interested in fitting a lag-1 autoregressive (AR) model to these data. This model would tell us the effect that a score at a given year has on the next year’s score. Often, we are interested in testing whether those yearly effects are equal across timepoints. (Of course, there are other models that would be interesting, such as a linear latent growth curve model, an AR model with a random intercept, and others. However, this simple AR model makes it easy to demonstrate how to specify and use phantom variables.)

The SEM platform has a useful model shortcut to specify autoregressive models under the Longitudinal Analysis submenu (look for Cross-Lagged Panel Model).

LauraCS_0-1770911726314.png

 

The model shortcut prompts users to input the time value at which each repeated measure was obtained. These values are critical for the SEM platform to know how to specify the model (otherwise, which variable comes before which?). A great convention is to include time values at the end of each variable name, which enables use of the Guess From Names button in the initial pop-up menu.

LauraCS_1-1770911726329.png
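To illustrate why the naming convention helps, here is a minimal Python sketch of pulling a time value from the end of each variable name. It is not JMP's actual Guess From Names logic, which the post does not describe; the column names and the `guess_time_from_name` helper are hypothetical.

```python
import re


def guess_time_from_name(name):
    """Return the trailing integer in a variable name, or None if there is none."""
    match = re.search(r"(\d+)$", name)
    return int(match.group(1)) if match else None


# Hypothetical column names, deliberately listed out of order.
columns = ["Score_Year4", "Score_Year1", "Score_Year2"]

# Map each column to its guessed time value, then order columns by time.
times = {name: guess_time_from_name(name) for name in columns}
ordered = sorted(columns, key=times.get)

print(times)
print(ordered)
```

A name with no trailing number returns `None`, which is the case where the time value would have to be typed in by hand.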

 

After defining time values, the next window shows the SEM platform’s interpretation of the repeated measures. We must make sure the times and order are correct in the table. We also have options to set equality constraints, add latent variables to account for measurement error, add random intercepts, etc. Here, we’re going to uncheck all the options and click OK.

LauraCS_2-1770911726344.png

 

The resulting model is technically correct, but remember that in many applications, we want to test the equality of the autoregressive effects. The current model specification doesn’t allow us to do that because Year 1 --> Year 2 is not comparable to Year 2 --> Year 4 (the first lag is one year but the second is two years).

LauraCS_3-1770911726346.png

 

Phantom variables to the rescue

Phantom variables are simply latent variables that don’t really exist – that is, they have no indicators (observed variables) linked to them. Phantom variables provide a nice solution for testing the equality of autoregressive effects when assessments are not equally spaced.

To specify a phantom variable, we need to select any one manifest variable in the To List, name the phantom variable intuitively, and click the + button. The latent variable will show up in the path diagram, but we need to remove the factor loading that links it to the manifest variable (select the loading in the diagram and click Remove in the action buttons). We also have to do something about its variance; for now, we’ll fix it to zero (select the variance in the diagram and click Fix To or Remove in the action buttons). Finally, we have a true phantom variable in our model:

LauraCS_0-1771359984881.png

 

Now we need to treat the phantom variable as if it were a real variable in our model. It will act as a placeholder for the missing Year 3 data and will allow us to get accurate year-to-year autoregressive effects. To do this, select the Year 2 --> Year 4 regression effect in the diagram and click Remove. Then, select the Year 2 variable in the From List, the Year 3 phantom variable in the To List, and click the one-headed arrow. Repeat these steps to connect the phantom variable to the Year 4 variable.

LauraCS_5-1770911726356.png

 

 

 

Now, each regression effect represents the same time lag. However, the model is not quite ready to be fitted yet. We still need to add equality constraints to identify the model. For our simplest model, we’ll just set equality constraints on the two paths linked to the phantom variable (this is the minimum needed to identify the model). Select the paths and click Set Equal in the action buttons:

AR_Model_Phantom_wEQ_var0.png
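A small Python sketch can show why the equality constraint is needed for identification. With the phantom's variance fixed to zero, the data only inform the product of the two paths through Year 3, so many pairs of values reproduce the same two-year effect. The target value and the function name below are hypothetical and chosen only for illustration.

```python
import math


def two_year_effect(path_2_to_3, path_3_to_4):
    """Effect of Year 2 on Year 4 routed through the phantom Year 3 variable."""
    return path_2_to_3 * path_3_to_4


# Hypothetical two-year effect that the data would support.
target = 1.7

# Without a constraint, very different path pairs fit equally well.
candidates = [(1.0, 1.7), (1.7, 1.0), (2.0, 0.85)]
for pair in candidates:
    print(pair, two_year_effect(*pair))

# With the two paths set equal, the yearly effect is pinned down
# (up to sign; the positive root is taken here).
equal_path = math.sqrt(target)
print(equal_path, two_year_effect(equal_path, equal_path))
```

Every candidate pair gives the same product, which is why the two paths cannot be estimated separately until they are set equal.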

 

Now click Run to fit the model. We can examine the results, but recall that our goal is to test whether autoregressive effects are equal across time. Thus, we need to fit another model where we impose equality constraints on all autoregressive effects. Select all three autoregressive effects and click Set Equal followed by Run to fit the second model:

LauraCS_7-1770911726372.png

 

Before examining the results of the second model, let’s first determine which of the two models fits better. Select both models in the Model Comparison table and click the Compare Selected Models button below the table. The chi-square difference test answers our key question:

LauraCS_8-1770911726378.png

 

In this example, the significant difference in chi-square suggests there’s a significant increase in model misfit when we place equality constraints across all autoregressive effects. In other words, the Year 1 --> Year 2 effect is not equal to Year 2 --> Year 3 or Year 3 --> Year 4. Results from the first model are the ones we ought to interpret.
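For readers who want to see the arithmetic behind a chi-square difference test, here is a minimal Python sketch for the one-degree-of-freedom case, which is what results when a single extra equality constraint separates two nested models. The chi-square values and degrees of freedom below are made up for illustration; they are not the values from the report shown above.

```python
import math


def chi_square_difference_test(chisq_restricted, df_restricted, chisq_free, df_free):
    """Compare two nested models that differ by exactly one degree of freedom."""
    delta_chisq = chisq_restricted - chisq_free
    delta_df = df_restricted - df_free
    if delta_df != 1 or delta_chisq < 0:
        raise ValueError("This sketch only handles a valid 1-df difference.")
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(delta_chisq / 2))
    return delta_chisq, delta_df, p_value


# Hypothetical fit statistics: the restricted model has all effects set equal.
delta_chisq, delta_df, p_value = chi_square_difference_test(
    chisq_restricted=9.84, df_restricted=2, chisq_free=1.20, df_free=1
)
print(delta_chisq, delta_df, p_value)
```

A small p-value indicates that the added equality constraint worsens fit by more than chance would explain, which is the pattern described for this example.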

LauraCS_9-1770911726388.png

 

For every unit increase in Year 1 scores, we expect a 1.527 unit increase in Year 2 scores. This increase is significantly greater than the increases for Year 2 --> Year 3 and Year 3 --> Year 4, which are 1.304 units each. Importantly, we don’t have data to empirically test the equality of effects from Year 2 --> Year 3 and Year 3 --> Year 4. However, the phantom variable helped us estimate the autoregressive effects in the proper units by acting as a placeholder for Year 3.
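As a quick check on what these yearly estimates imply over longer spans, the paths can be multiplied along the chain. This is a sketch of standard path tracing under the model's assumption that no direct paths skip a year; the variable names are ours.

```python
# Estimates reported in the post.
year1_to_year2 = 1.527
yearly_after_year2 = 1.304  # shared by Year 2 --> Year 3 and Year 3 --> Year 4

# Two-year effect implied by chaining the two equal yearly paths.
year2_to_year4 = yearly_after_year2 ** 2

# Indirect effect of Year 1 on Year 4 through Year 2 and the phantom Year 3.
year1_to_year4 = year1_to_year2 * year2_to_year4

print(f"{year2_to_year4:.3f}")  # 1.700
print(f"{year1_to_year4:.3f}")  # 2.597
```

The implied two-year effect of about 1.700 is the quantity the original model without a phantom variable would estimate directly for Year 2 --> Year 4, which is why it could not be compared with the one-year effect of 1.527.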

Residual variances for phantom variables

In some cases, one might also want to test the equality of residual variances over time. This is possible by specifying residual variances for each phantom variable (other variables already have residual variances), and setting appropriate equality constraints across time. In our example, the minimum number of equality constraints required so the phantom variable can have a residual variance is displayed below.

LauraCS_1-1771361199663.png

A model with equality constraints across all residual variances also requires equality constraints on all autoregressive effects.

LauraCS_2-1771361386866.png
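The need for constraints on the phantom's residual variance can be sketched in Python. Because Year 3 is unobserved, only the combined unexplained variance in Year 4 is informed by the data, not its split between Year 3 and Year 4. The residual variance values below are hypothetical, and the sketch assumes the equality constraint ties the phantom's residual variance to that of Year 4, which may differ from the exact constraint in the diagrams.

```python
import math


def implied_year4_residual(b, psi3, psi4):
    """Variance in Year 4 not explained by Year 2 when Year 3 is a phantom.

    Year3 = b*Year2 + e3 and Year4 = b*Year3 + e4, so
    Year4 = b**2 * Year2 + b*e3 + e4.
    """
    return b ** 2 * psi3 + psi4


def solve_equal_residual(b, total):
    """Common residual variance when psi3 and psi4 are constrained equal."""
    return total / (b ** 2 + 1)


b = 1.304  # yearly effect reported in the post

# Two different (hypothetical) splits imply the same observable variance.
v1 = implied_year4_residual(b, psi3=1.0, psi4=2.0)
v2 = implied_year4_residual(b, psi3=0.0, psi4=2.0 + b ** 2)
print(v1, v2)

# Setting the two residual variances equal yields a single solution.
psi = solve_equal_residual(b, v1)
print(psi, implied_year4_residual(b, psi, psi))
```

Fixing the phantom's variance to zero, as in the earlier model, is the second split shown here; freeing it requires an equality constraint to keep the model identified.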

 

Note that in different applications, phantom variables may have a variance fixed to one or even negative one (now, that's spooky)! You can learn about these alternative specifications, and more technical details, in this scientific article by David Rindskopf:

Rindskopf, D. (1984). Using phantom and imaginary latent variables to parameterize constraints in linear structural models. Psychometrika, 49(1), 37-47.

Key points

  • Phantom variables are latent variables without indicators and without a freely estimated variance.
  • Phantom variables act like placeholders that allow us to reparameterize a model.
  • For autoregressive models with unequal time lags, phantom variables allow us to specify a model that provides estimates for equal time lags, and thus enable us to test for equal effects across time.
  • In JMP Pro, we “trick” the SEM platform into creating a latent variable with one indicator. Then, we remove everything but its soul so what remains is just a lingering phantom with the unfinished task of helping us achieve our modeling goals.

 

Last Modified: Feb 17, 2026 4:54 PM