
JMP Pro for linear mixed models — Part 2

In an earlier blog post, I introduced the new Mixed Model capability in JMP Pro 11 and showed an example of random coefficient models. In this post, I continue my discussion of using mixed models for repeated measures and panel data. I’ll leave modeling geospatial data as well as tips and tricks for a future post.

 

Example 2: Analysis of Repeated Measures — accounting for correlated errors

 

In the analysis of repeated measures, multiple measurements of a response are collected from the same subjects over time. This example is taken from JMP documentation. In the study, subjects were randomly assigned to different treatment groups. Each subject’s total cholesterol level was measured several times during the clinical trial. The objective of the study is to test whether new drugs are effective at lowering cholesterol. What makes the analysis distinct is the correlation of the measurements within a subject. Failure to account for it often leads to incorrect conclusions about the treatment effect. (The data, Cholesterol Stacked, is available in the JMP software’s Sample Data Directory. See my earlier post for more information.)

 

JMP Pro offers three commonly used covariance structures:

 

 

 

  • Unstructured provides a flexible structure that estimates a separate covariance for every pair of measurement times. With six repeated measures in this example, that means 15 covariance parameters in addition to six variance estimates. It is the most flexible structure, but it is not without risk of overfitting. (All three structures are sketched after this list.)

 

 

  • AR(1) (first-order autoregressive) estimates the correlation between two measurements that are one unit of time apart; the correlation declines as the time difference increases. This is a parsimonious structure with only two parameters to estimate: a variance and an autocorrelation.

 

 

  • CS (compound symmetry) postulates that the covariance between any two measurements on the same subject is constant, regardless of how far apart in time they are. It also has only two parameters to estimate.
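
To make the parameter counting concrete, here is a quick sketch (in my own notation, not JMP's) of what each structure assumes about the within-subject errors e_i1, ..., e_i6 at the six measurement times:

\text{Unstructured:}\quad \operatorname{Cov}(e_{it}, e_{is}) = \sigma_{ts} \text{ for every pair } (t,s): \quad 6 \text{ variances} + \tfrac{6 \cdot 5}{2} = 15 \text{ covariances} = 21 \text{ parameters}

\text{AR(1):}\quad \operatorname{Cov}(e_{it}, e_{is}) = \sigma^{2}\rho^{\,|t-s|}: \quad 2 \text{ parameters } (\sigma^{2}, \rho)

\text{CS:}\quad \operatorname{Var}(e_{it}) = \sigma_{s}^{2} + \sigma_{e}^{2}, \quad \operatorname{Cov}(e_{it}, e_{is}) = \sigma_{s}^{2} \text{ for } t \neq s: \quad 2 \text{ parameters } (\sigma_{s}^{2}, \sigma_{e}^{2})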

 

 

I follow the steps outlined in my previous post to specify the mixed model for the analysis of repeated measures. The Fixed Effects part of the model includes Treatment, Month, AM/PM and their interactions. 

 

Fixed effects part of the model

 

 

I will consider different covariance structures for the within-subject errors. First, let's consider Unstructured. Assign the Time column as Repeated and the Patient column as Subject; this defines the repeated measurements within a subject. Note that for the Unstructured option, JMP requires the Subject column to be uniquely valued and the Repeated column to be categorical.

 

Unstructured Covariance Structure

 

 

I now focus my discussion on the Repeated Effects Covariance Parameter Estimates. 

 

Results using Unstructured

One way of testing the statistical significance of the covariance estimates is to calculate their z-scores and find the corresponding p-values, as I did in my example of a random coefficient model. Alternatively, we can check the confidence limits: if the 95% confidence interval for a covariance estimate includes zero, the estimate is not significantly different from zero at α = 0.05. After sorting the report, we can see that all six variance estimates are significantly different from zero, but most of the covariance estimates are not. This suggests that a parsimonious structure, such as AR(1), should be considered.

The Fixed Effects report shows a highly significant treatment effect. Cholesterol level is also found to vary significantly from month to month and from morning to afternoon.
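
(A quick aside on testing the covariance estimates: the z-score and confidence-limit checks agree under the usual normal approximation, because for an estimate \hat{\theta} with standard error \mathrm{SE}(\hat{\theta}),

z = \hat{\theta} / \mathrm{SE}(\hat{\theta}), \qquad p = 2\,(1 - \Phi(|z|)), \qquad 95\%\ \text{CI} = \hat{\theta} \pm 1.96\,\mathrm{SE}(\hat{\theta}),

so a 95% interval that excludes zero corresponds to |z| > 1.96, that is, p < 0.05.)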

 

Next, we consider AR(1). The Repeated column used with AR(1) must be a continuous variable, so Days, the number of days from the start of the trial at each measurement, is used.

 

AR(1) Covariance Structure

 

 

The Repeated Effects Covariance Parameter Estimates report shows that the within-subject correlation is 0.95 and statistically significant. Fixed effects results are similar (not shown): the treatment effect and the time effects are statistically significant.
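
As a rough way to read that number: with a continuous time variable, AR(1) implies that two errors d units apart are correlated at \rho^{d}. Assuming the 0.95 estimate is expressed per unit of the Days column (my reading of the report, so check the scaling in your own output), measurements taken 30 days apart would still be correlated at roughly 0.95^{30} \approx 0.21.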

 

Results using AR(1)

 

 

To complete our example, let's fit the model with a CS structure. To do so, select Residual as the Repeated Covariance Structure; there is no need to specify Repeated and Subject columns with this option. Instead, we add Patient as a random effect on the Random Effects tab. That is, the within-subject covariance is modeled through the random subject effect. For more details on the implementation of the compound symmetry structure in JMP, refer to the JMP documentation.
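
Outside JMP, the same compound-symmetry idea, a random intercept per patient plus independent residuals, can be sketched in Python with statsmodels. This is only an illustration under my own assumptions: I assume the stacked table has been exported to cholesterol.csv with columns Y, Treatment, Month, AM_PM, and Patient (those column names are mine, not necessarily the ones in the JMP sample data).

import pandas as pd
import statsmodels.formula.api as smf

# Assumed export of the stacked cholesterol table: one row per patient per measurement time
df = pd.read_csv("cholesterol.csv")

# A random intercept for Patient induces compound symmetry:
# every pair of measurements on the same patient shares the between-patient variance.
model = smf.mixedlm("Y ~ Treatment * C(Month) * AM_PM", data=df, groups=df["Patient"])
result = model.fit(reml=True)
print(result.summary())

In this parameterization, the between-patient variance plays the role of the constant within-subject covariance, and the residual variance plays the role of σ_e².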

 

CS Structure with random subject effect and residual error

 

 

 

 

CS Structure with random subject effect and residual error

 

 

Based on the 95% confidence limits, the covariance between any two measurements on the same subject is not statistically significant at α = 0.05 (the p-value is 0.0621). Fixed effects test results are similar to the previous models and are not shown here.

 

Results using CS

 

 

So, which repeated structure should be adopted? We can compare AICc values from the Fit Statistics reports (not shown): Unstructured 703.84, AR(1) 652.63, and CS 832.55. AR(1) has the smallest AICc, so it is the winner.
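
For readers who have not used it before, AICc is the small-sample corrected Akaike information criterion: with maximized log-likelihood \ell, k estimated parameters, and n observations,

\mathrm{AICc} = -2\ell + 2k + \frac{2k(k+1)}{n - k - 1},

and smaller values indicate a better balance of fit and complexity, which is why the many extra covariance parameters of Unstructured do not pay off here.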

 

Example 3: Panel Data Models — controlling for unobserved heterogeneity

 

This example is taken from Vella and Verbeek (1998), which is discussed as Example 14.4 in Introductory Econometrics by Jeffrey Wooldridge. See the references below for more information and for where to get the data.

 

The original data came from the National Longitudinal Survey of Youth 1979 Cohort (NLSY79). In this analysis, each of the 545 male workers worked every year from 1980 through 1987. We are interested in estimating the effect of union membership on wage earnings, controlling for education, work experience, ethnicity, and so on. Although NLSY79 collects detailed background information on the workers that can be used as control variables, there are still individual differences that cannot be observed or measured. Panel data provide a way of accounting for this individual heterogeneity: if it can be assumed to be uncorrelated with all of the explanatory variables, we can treat the heterogeneity as a random effect.
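
In equation form (my transcription of the standard one-way random effects setup, not a formula copied from the book):

\log(\mathit{wage}_{it}) = \mathbf{x}_{it}\boldsymbol{\beta} + c_i + u_{it}, \qquad c_i \sim N(0, \sigma_c^2), \quad u_{it} \sim N(0, \sigma_u^2),

where c_i is the unobserved, time-invariant individual effect and u_{it} is the idiosyncratic error. Treating c_i as a random effect is what requires it to be uncorrelated with the regressors \mathbf{x}_{it}.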

 

I follow Wooldridge's discussion in his book to specify a wage equation using panel data.

 

Fixed effects part of the Log(Wage) Equation

 

 

 

 

Random effects part of the Log(Wage) Equation

 

 

I apply the Residual structure to the model error term. In econometrics, this is called a one-way random effects model, also known as a variance components model. The results are shown below.

 

One-way Random Effect Model Results

 

 

From the Random Effects Covariance Parameter Estimates report, we find that individual heterogeneity accounts for 47.8% (0.11/(0.11 + 0.12)) of the total variation. This indicates a large unobserved effect and suggests that an OLS analysis ignoring it would likely yield misleading results.

 

The Fixed Effects Parameter Estimates report shows an estimated rate of return to education at 9.2% and a union premium of 10.5%, both of which are highly statistically significant.
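
For readers who want to try this outside JMP, a roughly equivalent one-way random effects fit can be run in Python with the linearmodels package, which (as far as I know) bundles the Vella and Verbeek extract as wage_panel. Treat the sketch below as illustrative rather than a verified replication of the JMP output.

import pandas as pd
import statsmodels.api as sm
from linearmodels.datasets import wage_panel
from linearmodels.panel import RandomEffects

# Vella-Verbeek NLSY79 extract: 545 men observed each year, 1980-1987
data = wage_panel.load()
year = pd.Categorical(data.year)           # keep year around as a categorical regressor
data = data.set_index(["nr", "year"])      # (worker id, year) panel index
data["year"] = year

exog_vars = ["educ", "black", "hisp", "exper", "expersq", "married", "union", "year"]
exog = sm.add_constant(data[exog_vars])

# One-way random effects (variance components) estimator of the log-wage equation
mod = RandomEffects(data["lwage"], exog)
res = mod.fit()
print(res)

If everything lines up with the JMP fit, the educ and union coefficients should come out close to 0.092 and 0.105.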

 

References

 

Francis Vella and Marno Verbeek (1998), "Whose Wages Do Unions Raise? A Dynamic Model of Unionism and Wage Rate Determination for Young Men," Journal of Applied Econometrics, Vol. 13, No. 2, pp. 163-183. (Data can be downloaded from the Journal’s website.)

 

Jeffrey M. Wooldridge (2012), Introductory Econometrics: A Modern Approach, Cengage Learning.

 

(Acknowledgement: I would like to thank Christopher Gotwalt and Laura Lancaster for their help.)

10 Comments
Community Member

LI Zhen wrote:

I'm trying to fit some panel data models, but I cannot find where to launch them (JMP Pro 10.0.0). Can anyone tell me why?

Staff

Jian Cao wrote:

In JMP 10, you can fit panel data models in Fit Model. To specify a random effect, add it as a model effect, then click Attributes and select Random Effect. Fixed effects are added just as you would add other explanatory variables to the model.

Staff

Jian Cao wrote:

JMP Pro 13 adds several new covariance structures such as Antedependent, Toeplitz, and Unequal Variances, extending its applicability to a variety of new contexts.

http://www.jmp.com/support/help/13/Fit_Model_Launch_Window_2.shtml#1013652

Community Trekker

I noticed that JMP Pro 13 allows assigning up to two Subject terms for the Exchangeable Structure of the Repeated Structure tab in the Mixed Model dialog.  Can you give me an example where two subject terms would be appropriate or preferred for repeated measures longitudinal regression?  In particular, I'm wondering whether I could use ID and Eye as Subject terms (where Eye is nested within ID) and Year_Cat as the repeated term.  Or should I use only ID as the Subject term and concatenate "Eye" and "Year_Cat" as the Repeated term?  Thanks in advance.

Staff

When you assign two subject terms for the Exchangeable structure, JMP will concatenate them into a single subject id to be used to identify the “within-subject” measurements. If you think the variability between eyes within a person is random, then use both patient ID and eye as subjects.

Community Trekker

Hi Jian,

 

What would be the difference in the model structure for the AR(1) cholesterol data example if you also added "Patient" to the random effects tab? Is this different from having "Patient" placed as a subject in the repeated structure tab?

 

Thank you!

Staff

Yes, there is a difference. The Subject column entered in the Repeated Structure tab (in this case, Patient) identifies the subjects that the repeated measures come from.

 

For the AR(1) repeated structure, you can add Patient to the Random Effects tab. This allows estimation of a between-subject variance component in addition to the within-subject component represented by AR(1). In the example as shown, the between-subject random effect is not statistically significant, so it is dropped.

 

Community Trekker

Thank you

The links provided above to the first post are dead. The landing page says that the blog has been retired.

Community Manager

I've updated the links. Thanks!