From correlation to causation: The Causal Treatment platform

Randomized controlled trials (RCTs) and other carefully designed statistical experiments are the gold standard for answering questions about cause and effect. Randomly assigning experimental units (for example, participants in a clinical trial) to treatment and control groups balances the two groups, minimizing the chance of substantive differences between them other than those caused by the intervention itself. In that case, the average causal effect can be estimated simply by taking the difference in mean outcomes between the groups.
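As a quick sketch in generic notation (the symbols below are introduced here for illustration and are not part of JMP's output), if $A_i$ is a binary treatment indicator and $Y_i$ is the outcome for unit $i$, the randomized estimate is just a difference of sample means:

```latex
\widehat{\text{ATE}}_{\text{RCT}}
  = \bar{Y}_{\text{treated}} - \bar{Y}_{\text{control}}
  = \frac{1}{n_1} \sum_{i:\,A_i = 1} Y_i \;-\; \frac{1}{n_0} \sum_{i:\,A_i = 0} Y_i
```

where $n_1$ and $n_0$ are the sizes of the treatment and control groups.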

However, oftentimes researchers want to investigate the effect one variable has on another but do not (or cannot) randomly assign participants to specific levels of the intervention. How can you measure the effect of an exposure on an outcome with observational or happenstance data rather than RCT data?

This is where causal inference comes in. Causal inference is a collection of concepts and methods to help determine when and how to make causal conclusions from observational data. Let’s say your research question is: “To what extent is a new treatment effective at lowering cholesterol, compared to the control?” What methods would you use to answer this question if you only had observational data, say measured from an insurance claim database?

You may decide to use a simple linear regression or t-test to determine whether the treatment and control groups are significantly different in cholesterol levels. However, perhaps those subjects with low-fat diets are both less likely to take the new treatment and less likely to have high cholesterol, while those with high-fat diets take the new treatment more often and have higher cholesterol. Without accounting for diet, then, you might conclude that the new treatment is not very effective. Confounders such as diet often induce spurious correlations, so these simple analyses can easily lead to very biased results that, in effect, do not answer the causal question.
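To make the diet example concrete, here is a minimal simulation sketch in plain Python/NumPy (the variable names, effect sizes, and probabilities are invented for illustration; this is not JMP code). Ignoring the confounder makes the treatment look harmful, while a simple diet-stratified comparison recovers the true benefit:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Confounder: 1 = high-fat diet, 0 = low-fat diet
high_fat = rng.binomial(1, 0.5, n)

# Subjects with high-fat diets are more likely to take the new treatment
treated = rng.binomial(1, np.where(high_fat == 1, 0.8, 0.2))

# True data-generating process: the treatment lowers cholesterol by 10 mg/dL,
# while a high-fat diet raises it by 30 mg/dL
cholesterol = 180 + 30 * high_fat - 10 * treated + rng.normal(0, 15, n)

# Naive comparison ignores diet entirely
naive = cholesterol[treated == 1].mean() - cholesterol[treated == 0].mean()

# Stratify by diet and average the two within-stratum differences
# (equal weights are fine here because the simulated diet split is 50/50)
adjusted = np.mean([
    cholesterol[(treated == 1) & (high_fat == d)].mean()
    - cholesterol[(treated == 0) & (high_fat == d)].mean()
    for d in (0, 1)
])

print(f"naive estimate:    {naive:+.1f} mg/dL")     # roughly +8: looks harmful
print(f"adjusted estimate: {adjusted:+.1f} mg/dL")  # roughly -10: the true effect
```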

Instead, you need to perform causal inference. In other words, you need extra assumptions that allow you to turn functions of your data into causal estimates. The following three assumptions are typically sufficient for making causal inferences:

  1. Positivity: Every observation has a positive probability of receiving each level of treatment.
  2. Consistency: Everyone who received the same level of treatment experienced the same version of it.
  3. Conditional exchangeability: All variables that confound the relationship between the treatment and the outcome are measured, meaning the groups are “exchangeable,” given the confounders.

When these assumptions are met, causal effects can be estimated from observational data. Estimation methods use information about the confounders to “balance” the groups after the data have been collected, much as randomization would have done in an RCT. For more details on causal assumptions and estimators, What If (Hernán & Robins, 2020) is an excellent, free resource.
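For readers who like notation, these ideas can be written compactly in the potential-outcomes notation used in What If, where $Y^a$ is the outcome an individual would have under treatment level $a$, $A$ is the treatment actually received, and $L$ is the set of measured confounders. The last line is the standardization (g-formula) identity that turns observed data into a causal quantity once the assumptions hold:

```latex
\text{Positivity:}\qquad \Pr(A = a \mid L = l) > 0 \ \text{ for every } l \text{ with } \Pr(L = l) > 0
\text{Consistency:}\qquad Y = Y^{a} \ \text{ whenever } A = a
\text{Conditional exchangeability:}\qquad Y^{a} \perp\!\!\!\perp A \mid L
\text{Standardization:}\qquad E\!\left[Y^{a}\right] = \sum_{l} E\!\left[Y \mid A = a, L = l\right] \Pr(L = l)
```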

To estimate causal effects with observational data in JMP, use the new Causal Treatment personality. Consider the following workflow for performing a thoughtful causal analysis:

Figure 1. An example DAG from Dong et al. (2025).

1. Draw a directed acyclic graph (DAG) that represents the data-generating process.

A DAG has directed arrows, showing which variables are causes and which are effects. A DAG is acyclic, meaning that if you follow the arrows directionally down a path, you will never return to the variable at which you started. Drawing a DAG will help you categorize your variables and identify confounders.
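As an illustration only (plain Python with the networkx package, using placeholder variable names rather than the DAG in Figure 1), the diet example can be encoded and checked for acyclicity in a few lines. The parent-intersection check at the end is a shortcut that works for this tiny graph, not a general back-door adjustment algorithm:

```python
import networkx as nx

# Edges point from cause to effect: diet influences both treatment choice and cholesterol
dag = nx.DiGraph([
    ("diet", "treatment"),
    ("diet", "cholesterol"),
    ("treatment", "cholesterol"),
])

# A valid DAG must contain no directed cycles
assert nx.is_directed_acyclic_graph(dag)

# Variables that are parents of both the treatment and the outcome are confounders here
confounders = set(dag.predecessors("treatment")) & set(dag.predecessors("cholesterol"))
print(confounders)  # {'diet'}
```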

2. Consider whether the three assumptions are met.

Is there a covariate profile that never appears in the treated group? Have you defined your treatment well? Are you aware of any confounders that haven’t been measured?
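Positivity, in particular, lends itself to a simple empirical check: cross-tabulate treatment against the confounder strata and look for strata that never appear in one of the arms. A minimal pandas sketch (with made-up column names and toy data, not JMP output) might look like this:

```python
import pandas as pd

# Toy data: a 0/1 treatment column plus two categorical confounders
df = pd.DataFrame({
    "treatment": [0, 0, 1, 1, 0, 1],
    "diet":      ["low", "high", "high", "high", "low", "low"],
    "smoker":    [0, 1, 1, 0, 0, 1],
})

# Count observations in each confounder stratum, split by treatment level
counts = pd.crosstab([df["diet"], df["smoker"]], df["treatment"])

# Strata with a zero count in either arm are in-sample positivity violations
violations = counts[(counts == 0).any(axis=1)]
print(violations)
```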

3. Choose an estimator.

The Causal Treatment personality provides different ways to estimate causal effects, depending on the type of treatment variable and the models you specify.
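Methods of this kind typically come from the propensity-score and outcome-modeling families. As a rough, generic sketch of one such estimator (inverse probability weighting), here is plain Python using scikit-learn, with invented function and argument names; it is not JMP's implementation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_ate(X, treatment, outcome):
    """Inverse-probability-weighted estimate of the average treatment effect.

    X         : array of measured confounders, shape (n, p)
    treatment : 0/1 array giving the treatment each unit received
    outcome   : continuous outcome (e.g., cholesterol level)
    """
    # Propensity score: estimated P(treatment = 1 | confounders)
    ps = LogisticRegression(max_iter=1000).fit(X, treatment).predict_proba(X)[:, 1]

    # Weight each unit by the inverse probability of the treatment it actually received
    w1 = treatment / ps
    w0 = (1 - treatment) / (1 - ps)

    # Weighted mean outcome under treatment minus under control
    return np.sum(w1 * outcome) / np.sum(w1) - np.sum(w0 * outcome) / np.sum(w0)
```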

4. Interpret the results.

If the three causal assumptions are met, the average treatment effect can be interpreted as the mean difference in outcomes had the entire population received one intervention versus another.
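In the potential-outcomes notation introduced earlier, for a binary treatment this is the contrast between two counterfactual population means:

```latex
\text{ATE} = E\!\left[Y^{1}\right] - E\!\left[Y^{0}\right]
```

where $Y^{1}$ and $Y^{0}$ are the outcomes each member of the population would have under the treatment and the control, respectively.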
 

In addition to the Fit Model Causal Treatment personality, JMP has platforms such as Structural Equation Models and XGBoost that can also be used to estimate causal effects (Dong et al., 2025). The Causal Treatment personality, however, is a straightforward starting point regardless of your level of experience: it directly outputs several of the most common and robust causal estimates of interest and provides visualizations for assessing the causal assumptions.

 

This blog was co-authored with Safiya Sirota (@safiya_jmp), who contributed to this work during her internship with JMP.

 

References

Dong M, Castro-Schilo L, Wolfinger R (2025). Causal Inference for Observational Data in JMP Pro using Structural Equation Models, Propensity Scoring, and Machine Learning. Manuscript in preparation.

Hernán MA, Robins JM (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC.

 
