More on the many advantages of structural equation models

We recently had the pleasure of hosting Dr. Ken Bollen, world-renowned expert on structural equation models (SEMs), on Statistically Speaking. As is often the case when we feature such a well-known subject-matter expert on a powerful topic, we had more questions than time to answer during the livestreamed event. Bollen has kindly taken the time to answer the questions we didn’t get to ask. To see the questions he did answer, you can watch the on-demand webinar and access his slides.

 

Any concern with possible collinearity between Z variables?

Collinearity between Zs that are measures of the same latent variable is a good thing: if the Zs are affected by the same latent variable and are highly correlated with each other, that suggests the measures are also highly correlated with the latent variable they measure. The SEM results can estimate that correlation between the Zs and their latent variable. 

If the Zs are explanatory variables and are highly correlated, then you have the usual problems associated with high collinearity – large standard errors and more sampling variability in the coefficient estimates.
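A minimal numpy sketch can make both points concrete; the loadings and correlations below are invented for illustration and are not from the webinar. Indicators driven by the same latent variable are highly correlated by construction, whereas highly correlated explanatory variables inflate the sampling variability of their coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# (1) Two indicators (Zs) of the same latent variable are highly correlated
#     *because* both are driven by that latent variable -- a desirable kind
#     of "collinearity" in a measurement model.
latent = rng.normal(size=n)
z1 = 0.9 * latent + rng.normal(scale=0.4, size=n)
z2 = 0.8 * latent + rng.normal(scale=0.4, size=n)
print("corr(z1, z2):    ", np.corrcoef(z1, z2)[0, 1])      # high, by design
print("corr(z1, latent):", np.corrcoef(z1, latent)[0, 1])  # also high

# (2) Two highly correlated *explanatory* variables inflate the sampling
#     variability of their regression coefficients (the usual collinearity
#     problem), visible here as a large variance inflation factor.
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + rng.normal(scale=np.sqrt(1 - 0.95**2), size=n)
r2_x1_on_x2 = np.corrcoef(x1, x2)[0, 1] ** 2
vif = 1.0 / (1.0 - r2_x1_on_x2)
print("VIF for x1 (and x2):", vif)  # >> 1, so standard errors are inflated
```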

 

In the home value example SEM diagram, could you explain why there would be an arrow for measurement error pointing to the latent variable? Would measurement errors not be assumed to influence the observed variables? 

Remember that there are two types of errors. One is measurement error, but the other is equation error where the latter contains all the other influences on the latent variable that are not explicitly included in the equation. In the home value example, we would not expect the lot size, square footage, attached garage, etc., to completely explain home value. The error pointing toward home value captures the other influences on home value that are not explicitly included in the model. 
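To make the distinction concrete, here is a hedged simulation sketch; the variable names and coefficients are invented for illustration and are not the values from the webinar's home value model. Equation error (often written zeta) enters the equation for the latent variable itself, while measurement errors (epsilon) attach to its observed indicators.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000

# Explanatory variables from the home value example (simulated here).
lot_size = rng.normal(size=n)
sqft = rng.normal(size=n)
garage = rng.binomial(1, 0.5, size=n)

# Equation error ("zeta"): everything that affects the *latent* home value
# but is not explicitly in the equation -- this is the arrow pointing at
# the latent variable in the diagram.
zeta = rng.normal(scale=1.0, size=n)
home_value = 0.5 * lot_size + 0.8 * sqft + 0.3 * garage + zeta

# Measurement errors ("epsilon"): they point at the *observed indicators*
# of home value (e.g., two independent appraisals), not at the latent variable.
appraisal_1 = home_value + rng.normal(scale=0.5, size=n)
appraisal_2 = home_value + rng.normal(scale=0.5, size=n)

# R^2 of the structural equation is < 1 because of the equation error,
# even though the latent variable itself contains no measurement error.
explained = 0.5 * lot_size + 0.8 * sqft + 0.3 * garage
print("R^2 of latent equation:", np.var(explained) / np.var(home_value))
```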

 

In a good-fitting model, should the p-value be less than or greater than 0.05? 

By convention, a good-fitting model should have high p-values on the model chi square test and on the equation-by-equation tests. The conventional 0.05 Type I error rate is a common standard, but there is nothing magical about it. The other consideration is that large samples enable the detection of even minor specification errors, so there is a need to consider the statistical power of these tests. 
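The sample-size point can be illustrated with a short scipy sketch. The degrees of freedom and the per-observation misfit below are made-up values, and the test statistic shown is only a rough expected value under that fixed misspecification (noncentrality of roughly (n - 1) times the minimized fit function).

```python
from scipy.stats import chi2

df = 10        # model degrees of freedom (made up)
f_min = 0.005  # a small, fixed amount of misfit per observation (made up)

for n in (100, 500, 5_000, 50_000):
    # Rough expected chi square under this misspecification:
    # central part (df) plus noncentrality that grows with sample size.
    stat = df + (n - 1) * f_min
    p = chi2.sf(stat, df)
    print(f"n = {n:6d}  chi2 ~ {stat:8.1f}  p ~ {p:.4f}")

# The same tiny misfit is essentially invisible at n = 100 but decisively
# rejected at n = 50,000 -- which is why statistical power needs to be
# considered alongside the p-value.
```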

 

An assumption of factor analysis is usually that errors of items measuring the same factors are uncorrelated. In practice, this is often not the case when all items are assessed using the same assessment method. (This is especially relevant for questionnaire data, where cognitive biases may affect people's responses to multiple items, but it could also apply to smart watches that may have persistent errors in each measure.) Latent variable models in such cases only remove some measurement errors but retain measurement errors that affect all items. I think "latent variables are free of measurement error" is oftentimes an overstatement. While it is standard to ask about advanced psychometric properties that can be expressed numerically (e.g., measurement invariance across subgroups and time points), it is uncommon to expect items for a specific latent variable to be assessed with different assessment methods to make sure that latent variables are free of method biases. What is your opinion on this? 

I would encourage researchers to formulate their hypotheses about cognitive bias or other systematic influences that crosscut measures so that they can build models that incorporate and test for these effects. If we have at least a couple of measures from a different source, it might be possible to build a model that tests for such artifacts. Or if we are measuring different latent variables with different measures and assume that all measures across all latent variables have a common systematic influence on them, you might be able to build a model to test this. Basically, it requires that the model you build is identified so that it’s possible to estimate unique values of the model parameters. In those cases where your model is underidentified, it might be that some key relationships are in equations that are identified, and then you could estimate and test those equations with MIIV-2SLS. If all equations are underidentified, then at least you know a limitation of your model and the possible assumptions you introduced to identify the model.  
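As one illustration of the artifact being discussed, here is a minimal simulation sketch; the loadings and the shared method factor are invented for this example and are not from any study mentioned above. It shows how a common method influence inflates inter-item correlations beyond what the substantive latent variable alone would produce, which is the systematic effect one would want to model and test.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

trait = rng.normal(size=n)   # the substantive latent variable
method = rng.normal(size=n)  # a shared method/response-bias factor

def item(trait_loading, method_loading, noise_sd):
    """One observed item driven by the trait, the method factor, and unique noise."""
    return trait_loading * trait + method_loading * method + rng.normal(scale=noise_sd, size=n)

# Four items, all collected with the same method (method_loading > 0).
items = np.column_stack([item(0.7, 0.5, 0.5) for _ in range(4)])
obs_corr = np.corrcoef(items, rowvar=False)[0, 1]

# The same items without the shared method influence (method_loading = 0).
items_clean = np.column_stack([item(0.7, 0.0, 0.5) for _ in range(4)])
clean_corr = np.corrcoef(items_clean, rowvar=False)[0, 1]

print("inter-item correlation with shared method factor:", round(obs_corr, 3))
print("inter-item correlation from the trait alone:     ", round(clean_corr, 3))
# The gap is systematic method variance; a one-factor model fit to the first
# set of items would absorb it into the "latent variable" unless the method
# effect is modeled explicitly (e.g., with measures from a different source).
```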

 

Why are latent variables correlated in measurement models? 

Just like observed variables, many latent variables correlate with each other. The major exception is if you have a randomized treatment or some other reason to plausibly specify no association among the latent variables. In the blood pressure example, it seems quite reasonable to assume that systolic and diastolic blood pressures are positively correlated, and it would be hard to defend treating them as uncorrelated. 

 

Is SEM applicable for both cross-sectional and longitudinal data? 

Yes, it is. A fascinating application with longitudinal data is to use a general longitudinal model like the latent variable-ALT (LV-ALT) model as a means of choosing the best longitudinal model to use. The LV-ALT includes autoregressive cross-lagged models, growth curve models, random effects models, fixed effects models, etc., as special cases, so you can determine which fits best. See Bianconcini and Bollen, 2018, Structural Equation Modeling (journal).

 

What would be an issue with SEM path models with only observed variables and no latent variable included? 

The main issue is that you are assuming no measurement error in any of your observed variables, so you need to ask yourself if that is plausible.
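A brief numpy sketch of the classic consequence, with invented numbers and a single predictor measured with error: ignoring measurement error in an observed predictor attenuates its regression coefficient toward zero.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
beta_true = 1.0

x_true = rng.normal(size=n)                          # error-free predictor
y = beta_true * x_true + rng.normal(scale=0.5, size=n)

x_observed = x_true + rng.normal(scale=0.8, size=n)  # predictor measured with error

# Simple OLS slopes: cov(x, y) / var(x)
slope_true = np.cov(x_true, y)[0, 1] / np.var(x_true)
slope_obs = np.cov(x_observed, y)[0, 1] / np.var(x_observed)

print("slope using error-free x:", round(slope_true, 3))  # ~1.0
print("slope using noisy x:     ", round(slope_obs, 3))   # ~1 / (1 + 0.8**2), about 0.61

# Treating the observed variable as if it had no measurement error biases the
# coefficient toward zero -- the assumption a pure path model makes implicitly.
```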

 

You mentioned Bayesian SEM briefly. How do we decide when Bayesian estimation is preferable over maximum likelihood, especially in small samples or complex models? 

The decision process would be similar to other decisions on whether to take a frequentist vs. a Bayesian approach. Bayesian estimation can be helpful in small samples, but there is a tradeoff in heavier reliance on your prior distribution assumptions. In my forthcoming book, “Elements of Structural Equation Models” with Cambridge University Press, I have a section on Bayesian SEM that discusses the Bayesian approach. It might be useful to you.

 

The fundamental principle in hypothesis testing is that to reject H0, the test results must be both statistically and practically significant. Thus, in SEM, rejecting H0 (the data fits the model “well”) requires BOTH a chi square p-value < .05 AND poor goodness-of-fit indices. Do you agree?  

I would agree that a statistically significant chi square and poor fit indices would make me question my model. The chi square test, whether for the model as a whole with ML or equation-by-equation with MIIV-2SLS, has the usual advantages and limitations of any significance test. So, you need to be aware of them when judging the test statistics. Fit indices are also far from perfect. The standards of fit are not rigorously and comprehensively established, and we are still learning more about their properties. Though I would be happy with both a nonsignificant chi square and strong fit indices, I still need to remember that there might be other models with as good or better fit.
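For readers who want to connect the two kinds of evidence, one widely used fit index, the RMSEA, can be computed directly from the chi square, its degrees of freedom, and the sample size. The sketch below uses the commonly cited formula with made-up numbers; it is not output from any model discussed above, and cutoffs such as 0.05 or 0.08 are conventions, not strict rules.

```python
import math

def rmsea(chi2_stat, df, n):
    """Commonly cited RMSEA formula: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2_stat - df, 0.0) / (df * (n - 1)))

# Made-up example values, not results from any model discussed here.
print(rmsea(chi2_stat=35.0, df=20, n=400))  # ~0.043, often read as "close fit"
print(rmsea(chi2_stat=90.0, df=20, n=400))  # ~0.094, usually read as poor fit
```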

We thank Ken again for taking the time to share his experience and expertise on this powerful approach to modeling and to explain how other approaches might fall short. For the many academics who have expressed interest in SEM, remember that JMP Student Edition, based on JMP Pro, is available for free to students and faculty who want to try JMP’s easy-to-use yet rigorous SEM capabilities. If you’d like to see more, be sure to join us for the Technically Speaking webinar that features SEM developer Dr. Laura Castro-Schilo and JMP Systems Engineer Christian Stopp applying SEMs to real-world problems.
