LauraCS
Staff

Modeling Trajectories over Time with Structural Equation Models (2021-US-45MP-914)

Level: Intermediate

 

Laura Castro-Schilo, JMP Sr. Research Statistician Developer, SAS

 

The structural equation modeling (SEM) framework enables analysts to study associations between observed and unobserved (i.e., latent) variables. Many applications of SEM use cross-sectional data. However, this framework provides great flexibility for modeling longitudinal data too. In this presentation, we describe latent growth curve modeling (LGCM) as a flexible tool for characterizing trajectories of growth over time. After a brief review of basic SEM concepts, we show how means are incorporated into the analysis to test theories of growth. We illustrate LGCM by fitting models to data on individuals' reports of Anxiety and Health Complaints during the beginning of the COVID-19 pandemic. The analyses show that Resilience predicts unique patterns of change in the trajectories of Anxiety and Health Complaints.

 

 

Auto-generated transcript...

 



Lauren Vaughan See.
  Okay, we are recording. All right, if you could please confirm your name, company, and abstract title.
Laura Castro-Schilo, JMP Laura Castro-Schilo, JMP, SAS, and my abstract is Modeling Trajectories over Time with Structural Equation Models. And that's it, right?
Lauren Vaughan Yep, and, um, let's see. You do understand this is being recorded for use in the JMP Discovery Summit conference, and it will be available publicly in the JMP User Community. Do we have your permission to use this recording?
Laura Castro-Schilo, JMP Yes.
Lauren Vaughan Excellent OK, I will turn it over to you Laura.
Laura Castro-Schilo, JMP Right, and so I just.
  have to share my screen somewhere here right.
  Now.
  But.
Lauren Vaughan perfect.
Laura Castro-Schilo, JMP two seconds.
  Hi, everyone. I'm Laura Castro-Schilo and today we're talking about modeling trajectories with structural equation models.
  And we're going to start this presentation by first answering the question of why we would use SEM for longitudinal data analysis.
  And then we'll jump into a very brief elevator version of an introduction to SEM. If this is your first exposure to SEM, I strongly encourage you to look for some of our previous presentations in Discovery Summits
  that are recorded and available for you to watch, so that you can get a better understanding of the foundations of SEM.
  But even without that intro, hopefully this brief version will set you up to understand the material that we're going to talk about here today.
  In that introduction we're going to focus on how we model means in structural equation models. We're going to see that means allow us to extend traditional SEM into a longitudinal framework.
  And modeling those means will have implications for what our path diagrams look like, and also we'll see how those diagrams map onto the equations
  in our models. We're going to focus specifically on latent growth curve models, even though we can fit a number of different longitudinal models in SEM.
  And then we'll use a real data example to show how we model trajectories on anxiety and health complaints during the pandemic.
  And at the end we're going to wrap it up with a brief summary, and I'll give you some references in case you're interested in pursuing some longitudinal modeling and you want to learn more about this topic.
  So Singer and Willett are two professors from the Harvard Graduate School of Education, and I think they said it best
  when they claimed in a popular textbook of theirs that SEM's flexibility can dramatically extend your analytic reach. Indeed, this is probably the most important reason why you might want to use SEM for longitudinal data analysis.
  Now, specifically when we're talking about flexibility, we're referring to the fact that you can fit a number of different models
  in SEM that are longitudinal models and can be quantified in terms of fit and can be compared empirically, so that you can be sure that you're characterizing your longitudinal trajectories in the best possible way.
  There's a number of different models that we can fit; you can see them listed there.
  Things like repeated measures ANOVA can make some pretty strong assumptions about the data. SEM allows us to relax some of those assumptions and actually test empirically whether those assumptions are tenable.
  SEM is also really flexible when it comes to extending the univariate models into a multivariate context. So if you're interested in looking at how changes in one process influence or are associated with changes in another process, SEM is going to make that very easy and intuitive.
  Now we know SEM has a number of nice features, and all of those apply in the longitudinal context as well. Things like the ability to account for measurement error explicitly,
  to be able to model unobserved trajectories by using latent variables, and also using cutting-edge estimation algorithms for when we have missing data, which actually happens pretty often when we have longitudinal designs.
  Another interesting feature is that it allows us to incorporate our knowledge of the process that we're studying. So we'll see that that prior knowledge about what we expect the functional form in our data to be can be mapped onto our models in a very straightforward way.
  But there's also reasons why we should not use SEM for longitudinal analysis.
  I think, most importantly, the structure of the data is what might limit us the most. So in SEM we're going to be
  required to have measurements that are taken at the same time points across all of our sample. So if, for example, we're looking at anxiety and we have repeated measures...
  three repeated measures over time, the structure of the data has to be like what I'm showing you here right now, where, you know, we might have anxiety at one occasion and that's
  represented as one column, one variable in our data tables, and then we have anxiety at a second time point and at a third time point.
  So what this means is that everybody's assessment of the first time point has to have taken place at the same time, and that's not always the case. And so there's going to be other techniques that are more appropriate if, in fact, your data are not time structured.
  We also have to acknowledge the assumption of multivariate normality. Sometimes we can, you know...
  SEM might be a little robust to this assumption, but we still need to be very careful with it.
  And it's also a large-sample technique. So that data table I just showed you, you know, we really want to have substantially more rows than we have columns in the data, and this might not always be the case.
  So just as a reminder, or, if you haven't been exposed to SEM, as a nice brief intro:
  in SEM, one of the most useful tools is the path diagram, and these are simply graphical representations of our statistical models.
  And so, if we know how diagrams are drawn, then it'll be much easier for us to use them to specify our models and also to interpret
  other structural equation models. So these are the elements that form a path diagram, and you can see here that squares or rectangles are used exclusively to denote manifest or observed variables in our diagrams.
  And that's in contrast to unobserved variables, which are always represented with circles or ovals. Now arrows in path diagrams are...
  they represent parameters in the model. So double-headed arrows are always going to be used for variances or covariances, and one-headed arrows represent regressions or loadings.
  In the context of longitudinal data, there's another symbol that is really important, and that is a triangle. The triangle represents a constant, and it's used in the same way that you use a constant in regression analysis,
  meaning that if you regress a variable on a constant, you're going to obtain its mean. So we model means and we put some
  constraints in the mean structure of our data by having a constant in our models. So let's take a look at a simple regression example.
  If you wanted to, you know, fit a simple regression in SEM, this would be the path diagram that we would draw. So you can see X and Y are observed variables, we have X predicting Y, with that one-headed arrow, and both X and Y have
  variances. In the case of Y, because it's an outcome, that's a residual variance.
  And we also have to add the regression of Y on the constant if we want to make sure that we get an estimate for the intercept of that regression. So here, this arrow would represent the intercept of Y, and notice that we also have to regress X on that
  constant in order to acknowledge the fact that X has a mean.
  And now we can use some labels, so that we can be very explicit about which parameters these arrows represent. And then we can see how those
  arrows...so we can trace the arrows in the path diagram in order to understand the equations that are implied by that diagram.
  So let's focus first on Y. You can see that we can trace all of the arrows that are pointing to Y in order to obtain that simple, you know, regression equation. We have Y is equal to tau one times one (which is just that constant, so we don't have to write the one down here)
  plus beta one times X (which we have right down here) plus the residual of Y.
  Now we also can do the same for X, because in SEM all of the variables that are in our models need to have some sort of equation associated with them. And here we want to make sure that we acknowledge the fact that X has a mean, so we regress that on the constant and it also has a variance.
  So again, those path diagrams are a way to depict the system of equations in our models.
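  Written out, the two traced equations are (using the labels from the diagram; the tau for the mean of X and the psi symbols for the double-headed variance arrows are just my shorthand):

\[ Y_i = \tau_1 \cdot 1 + \beta_1 X_i + e_{Y,i}, \qquad e_{Y,i} \sim N(0, \psi_Y) \]
\[ X_i = \tau_2 \cdot 1 + e_{X,i}, \qquad e_{X,i} \sim N(0, \psi_X) \]

  So the one-headed arrows from the constant give the intercept of Y and the mean of X, and the double-headed arrows give the residual variance of Y and the variance of X.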
  And those diagrams, it's very important to understand that they also have important implications for the structure that they impose on the variance-covariance matrix of the data and on the mean vector.
  And I think it's easiest to explain that concept by actually changing the model that we're specifying here. Rather than having a regression model,
  what I'm going to do is, I'm going to fix all of those edges to zero. So all of these effects, I'm just going to say I'm going to fix them to zero,
  which is the same as just erasing the arrows from the diagram altogether, and now you can see how the equations for X and Y have changed there. This is a very...
  well, this is a very interesting model. It's simple, but it actually has a lot of constraints, right, because it implies that X and Y have a variance but that their covariance is exactly zero; there's nothing linking these two nodes.
  And it also implies that the means for both X and Y are exactly zero, because we're not regressing either of them on the constant in order to acknowledge that they have a non zero mean.
  So now, if we really want to fit this model to some sample data, that means we have some sample statistics from our data.
  And the way that estimation works in SEM is we're going to try to get estimates for our parameters in a way that matches the sample statistics as closely as possible, but still
  retaining the constraints that the model imposes on the data. And so, in this particular example, if we actually estimate this model, we would see that we are able to capture the variances of X and Y
  perfectly, but the constraints that say that the covariance is zero and that the means are zero, those will still remain. And so the way in which we
  figure out whether our models fit well to the data is in fact by comparing this model-implied covariance and mean structure to the actual sample statistics, and so we can look at the difference between those and obtain our residuals.
  And these residuals can be further quantified in order to obtain metrics that allow us to figure out whether our models fit well or not.
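  As a concrete illustration, the "no arrows" model above implies this covariance matrix and mean vector (my notation, with the two psi's being the only estimated parameters):

\[ \Sigma(\theta) = \begin{pmatrix} \psi_X & 0 \\ 0 & \psi_Y \end{pmatrix}, \qquad \mu(\theta) = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]

  The residuals are the element-by-element differences \( S - \Sigma(\hat\theta) \) and \( \bar{y} - \mu(\hat\theta) \), where \( S \) and \( \bar{y} \) are the sample covariance matrix and sample means; a large leftover covariance or mean is exactly the kind of misfit those fit metrics summarize.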
  Okay, so that's our intro for SEM, and these are going to be the concepts that we're going to be using throughout the presentation, in order to understand how we model trajectories with SEM.
  Now what better way to start talking about trajectories than to imagine some data that actually have some trajectories. And so I want you to think for a second
  how anxious you would have been about the pandemic if this question had been asked of you early in 2020, when the pandemic first started.
  And perhaps a group of researchers approached you, asked this question, and then came back a month later and asked you the same question again.
  And maybe they came back a couple months later and also asked about your anxiety. So we might obtain this data from a sample of individuals, and the data would be structured in the way that is presented here, where each of those time points would be a different variable in the data.
  And now let's imagine that we have the interest of looking at some of the trajectories from that sample, and we want to plot them so that we can start thinking about how we would describe these trajectories.
  So let's take three individuals. This is going to be a fabricated example just to illustrate some concepts, but imagine that the first individual gives us the exact same score of three at
  each of the time points at which we asked this question. And maybe in this example anxiety ranges from zero to five, where five means you're more anxious about the pandemic. So for this individual, the trajectory is perfectly flat, right. It's a very
  simple trajectory, and maybe for individuals two and three, you know, maybe we get the exact same pattern of responses.
  And so, if this were to be real, and we had to describe these trajectories to an audience, it would actually be really easy
  to do that, right, because we could just say there's zero variability in the trajectories of individuals, and really, just describing a flat line would do the rest, right. So we can use the equation of the line to say, you know, anxiety at each time point takes on
  these values. And we would have to clarify, right, that the mean, or rather the intercept for this line is equal to three and the slope is zero, so that we really just described that flat line.
  Well, that'd be really easy to do, but of course this is a very unrealistic pattern of data, so we're not expecting that we would observe this in the real world.
  So let's imagine a different set of trajectories where there's actually some variability in how people are changing.
  And in this case, we could still find an average trajectory, right, a line of best fit through these data. And if we only use the equation of that line to describe the data,
  we would really be missing the full picture, right. That would not do a very good job of showing that some individuals, you know, number one is increasing, whereas
  individual three is decreasing. So instead we have to add a little more complexity to that equation we saw earlier, in order to account for the variability in the intercept and the slope.
  So again, if we had to describe this to an audience, one thing we can do is in this equation, I'm adding a sub index I
  to represent the fact that anxiety for each individual at each time point can take on a different value.
  Now notice that the intercept and the slope for the equation also have that I,
  indicating that we can have differences...you can have variability on the intercept and the slope, and we can still use the average trajectory to describe the average line, right, such that that intercept
  can still be three and the slope is zero. But notice that we add these additional factors here that capture the variability
  of the intercept and the slope, and, specifically, these are the values for each individual that are expressed as deviations from the average trajectory.
  And then we'll see that we're going to have to make some assumptions about those factors in terms of their distribution, which should be normal with a
  mean of zero and an unrestricted covariance matrix.
  But even these trajectories are also quite unrealistic, right, because I'm showing you these perfectly straight lines. And when we get real data, it's never ever going to look that perfect. Indeed,
  these three trajectories are much more likely to look like this, right, where even if we are assuming that there is an underlying sort of an unobserved linear trajectory,
  those are not the trajectories we observed. In other words, we have to acknowledge that any data that you observed at any given time point is going to have some error, right. And so
  we're still able to capture that error into our equation and we'll make some assumptions about that error being normally distributed.
  But again, the idea is that we have these unobserved, error-free trajectories, and that's not what we really get when we are observing the individual assessments, right, in our data.
  So our equation is going to describe that average trajectory, and it's also going to describe the individual trajectories as departures from the average line.
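  Putting all of those pieces together, the equations we've been describing are (with i indexing people and t indexing occasions; the Psi and theta symbols are my shorthand for the growth-factor covariance matrix and the error variance):

\[ \text{Anxiety}_{ti} = I_i + \lambda_t S_i + e_{ti}, \qquad e_{ti} \sim N(0, \theta) \]
\[ I_i = \mu_I + u_{Ii}, \qquad S_i = \mu_S + u_{Si}, \qquad (u_{Ii}, u_{Si})' \sim N(0, \Psi) \]

  Here \( \lambda_t \) is the time code for occasion t (0, 1, 2 for three equally spaced occasions), \( \mu_I \) and \( \mu_S \) describe the average trajectory, and \( \Psi \) holds the intercept variance, the slope variance, and their covariance.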
  Alright, so everything that we have described so far is actually what's known as a linear latent growth curve model in SEM.
  And if this looks like a mixed effects or random coefficients model, if you're familiar with those, it's because it is actually very, very similar.
  Now we only have three time points here, so this is a very simple linear growth curve, but we can still have,
  you know, more complex models that incorporate some nonlinearities if, in fact, we have more time points so that we're able to capture those nonlinearities, and we can do that for polynomials and there's other ways actually to capture nonlinearities in growth curve models.
  Today we're going to keep it very simple and we're going to stick to the linear models, though.
  All right now, I want to bring it all together by really showing you how those equations of that linear latent growth curve model, how they can be mapped to a path diagram that can be used to fit our structural equation models. And so we're first going to start by
  using the simplest equations here, the equations for the intercept and the slope. And remember that that intercept and slope represent unobserved values, right, they represent
  unobserved growth factors, and so we're going to use latent variables, these ovals, to represent them in our path diagram. And notice that the intercept is equal to a mean, plus that
  variance factor, right. And so that is why we regress the intercept on the constant in order to obtain its mean, and we also have this double-headed arrow in order to represent that variability in the intercept.
  And we do the same for the slope. Now notice, we also have a double-headed arrow linking the intercept to the slope, and that is to represent the covariance, right, that we make that assumption over here.
  And it just means that we're going to acknowledge that individuals that perhaps start higher on a given process might have an association with how they change over time, okay, and that is what this covariance allows us to estimate.
  Now, ultimately, what we're modeling is our observed data, right, our observed measurements for anxiety, and so here is the full path diagram
  that would characterize the linear growth curve. And notice, I'm going to focus on one anxiety time point first, that first time point,
  and again using the idea of tracing the path diagram, we can see how anxiety at time one is equal to one times the intercept, which is right here in this equation,
  plus zero times the slope, so this part just falls out, plus that error term. So in other words, what we're saying is that anxiety at time one is simply going to be the intercept of that individual plus some error.
  And then we can do the same, tracing the path diagram, to see what's the equation for anxiety at the second time point. You can see that it's, once again, that intercept plus one times the slope, so here is basically the
  initial value of that person, the intercept, plus some amount of change.
  And then at the third occasion, just by tracing this, we see that the equation implies that we have a starting point, which is the intercept, plus now two times the slope.
  Right, so notice how in these latent variables, the factor loadings are fixed to known values, and we are fixing those values to something that forces these trajectories to take a linear shape.
  So here the factor loadings of that slope are basically the way in which time is coded in the data, and this is the reason why everybody in SEM actually needs to have the same
  time point for a measurement, right, because everyone that has the value of anxiety at time one is going to have that
  same time code, which is embedded into the way in which we fix these factor loadings.
  Alright, so now, this particular specification can actually work, you know, perfectly fine if we have, for example, yearly assessments of anxiety.
  But notice here what I'm emphasizing is that there's equal spacing between the time points, right, and that's important because, in order for this to really be a linear growth curve, there needs to be equal spacing here.
  But obviously this could be weekly assessments or they could be assessments that are taken every month and that's fine. This is going to work out great.
  Now, it could be that you don't have equal spacing and that can also be handled fine in SEM as long as everybody has the assessment at the same time point.
  So here's an example where there's one month spacing between the first measure of anxiety and the second one, but then from the second to the third, there were
  two months, and so what we have to do is fix that last slope loading; instead of two, it now has to be fixed to three, right. And notice we jump from one to three, and that's what assures us that we still have a linear trajectory here.
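  In matrix form (my notation, consistent with the tracing we just did), the three repeated measures load on the two growth factors as

\[ \begin{pmatrix} \text{Anxiety}_{1i} \\ \text{Anxiety}_{2i} \\ \text{Anxiety}_{3i} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} I_i \\ S_i \end{pmatrix} + \begin{pmatrix} e_{1i} \\ e_{2i} \\ e_{3i} \end{pmatrix} \]

  for equal monthly spacing, with the slope column changed to \( (0, 1, 3)' \) when the third assessment comes two months after the second. The intercept column is always a column of ones; only the slope column carries the time metric.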
  Alright.
  So it's time for the demo, and what I want to share with you is some data that come from the COVID-19 Psychological Research Consortium. It's a group of universities that got together and wanted to really start collecting longitudinal data to understand the extent of
  the damage really that the pandemic is having on people's mental health and even their physical health. And so we have three waves of data.
  And these are from a subsample of the UK, and just like I showed you in that previous slide, the repeated measures are in fact from
  March 2020, and then a month later in April, and then two months later in June. And we're going to be looking at repeated measures for anxiety.
  Scores on the anxiety survey could range from zero to 100, where 100 means higher anxiety.
  And then we're also going to look at health complaints over time. Those could range from zero to 28, where, you know, a higher score represents more health complaints.
  And we're going to look at one time-invariant variable, which is resilience, and this one was assessed at the beginning, in March 2020.
  Okay, so let's take a look at the data.
  So I have the data right here. And notice, we have a unique identifier for each of our individuals, so each row represents a person. Actually,
  there's some missing data there that we're not going to worry about right now. But
  notice we have some demographic variables and then further to the right here, we have our data on anxiety and those are the repeated measures that we're going to focus on first.
  Now I do want to say that initially, you would want to, you know, plot your data with some nice longitudinal graphs,
  but we're going to skip straight into the modeling because I want to make sure we have time to show you how to use the SEM platform for these models.
  So I'm going to go to analyze, multivariate methods, structural equation models. And I'm going to use those three anxiety variables and I'm going to click on model variables and okay, in order to launch the platform.
  So notice that, as a default, we already see a path diagram that is drawn here on the canvas and we can make changes to that diagram in a number of ways.
  I usually use the left lists, the From and To lists, where we can select the nodes in the diagram and we can link them with one-headed arrows or two-headed arrows, right. I can just show you here, so by selecting them, we can make some changes here.
  And I can click reset here on the action buttons in order to get us back to that initial model, and we can also add latent variables by selecting our observed variables in the To list and then adding a latent variable here with that plus button.
  So the nice thing for us today...and I'm sorry my dog is barking in the background, but we probably have some mail being delivered.
  But the nice thing today for us is that we have this really useful model shortcut menu. And if we click on here, we're going to see that there's a longitudinal analysis menu with a lot of different options for growth curves.
  So let's start with the intercept only latent growth curve. And here the model that's being specified for us is one where each of our anxiety measures is only specified to load onto an intercept factor.
  And so this is one of those models where there's only a flat line, but we have a variance on the intercept acknowledging that individuals have flat lines, but they could have different intercepts for them.
  Now we don't know if this model is going to fit the data well. In many instances, it won't because it's a no growth model, and
  nevertheless, it's actually quite useful to fit this model as a baseline so that we can compare our other models against this one, right. And we do
  label the model no growth as a default here when you use that shortcut. So I'm going to click run, and very quickly we can see the output here.
  There are two fit indices that are really important for SEM. These are over here. The CFI is something that we want to be as close as possible to one, and you can see here, this is...
  this is pretty low. Usually you want to have .9 or higher,
  at the least. And the RMSEA we want to be at most .1. We really want it to be as close to zero as possible.
  And so, this is very high, and so, not surprisingly, it's a poor fitting model, so we're not even going to look at the estimates from it, because we know it doesn't fit very well.
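  For reference, these two indices are typically defined as follows (standard textbook formulas, not taken from JMP's documentation), where M is the fitted model, B is the independence baseline model, and N is the sample size:

\[ \mathrm{CFI} = 1 - \frac{\max(\chi^2_M - df_M,\, 0)}{\max(\chi^2_B - df_B,\, \chi^2_M - df_M,\, 0)}, \qquad \mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2_M - df_M,\, 0)}{df_M\,(N-1)}} \]

  (Some implementations use N rather than N − 1 in the RMSEA denominator.) Both boil down to how large the misfit chi-square is relative to the model's degrees of freedom.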
  But we're going to leave it there because it's a good baseline to have in order to compare against. So going back to the model shortcuts, we could look at the linear growth curve model.
  And when I click that, I automatically get that slope factor added and notice that
  the factor loadings are there, and as a default, we just fix them to zero, one and two. Now the way in which this
  shortcut works is that it assumes your repeated measures are in the platform in ascending order. That's really important, because if they're not, then these factor loadings are not going to be
  fixed to the proper values.
  In fact, here you can see that June is fixed to two, but I know that there's two months in between April and June and so I'm actually going to have to come in here and make the change by selecting this
  loading and clicking on fix to, and I'm going to fix it to three, because I know that that's what I need to have to really have that linear growth curve.
  And so that's it. We're ready to fit the model and so I'm going to click run.
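  As an aside, it can help to see exactly what kind of data this model describes. Here is a short Python sketch (NumPy only, nothing JMP-specific) that simulates trajectories from a linear latent growth curve with the 0/1/3 time coding; all of the parameter values are made-up illustrative numbers, not the study's estimates.

import numpy as np

rng = np.random.default_rng(1)
n = 500                                  # number of people (illustrative)
time = np.array([0.0, 1.0, 3.0])         # months since March: the 0/1/3 slope loadings

# Made-up population values for the growth factors (not the study's estimates)
mean_int, mean_slope = 65.0, -5.0        # average starting level and average monthly change
cov_growth = np.array([[120.0, -5.0],    # intercept variance and intercept-slope covariance
                       [ -5.0,  9.0]])   # (symmetric), slope variance
err_sd = 8.0                             # occasion-specific error standard deviation

# Each person gets their own intercept and slope, drawn around the average trajectory
growth = rng.multivariate_normal([mean_int, mean_slope], cov_growth, size=n)

# Observed scores: the person-specific line evaluated at each occasion, plus error
anxiety = growth[:, [0]] + growth[:, [1]] * time + rng.normal(0.0, err_sd, (n, 3))

print(anxiety.mean(axis=0))   # close to 65, 60, 50 -- the model-implied means
print(np.cov(anxiety.T))      # the implied covariance structure, plus sampling noise

  Fitting the latent growth curve in the SEM platform is essentially running this logic in reverse: recovering the means, the 2-by-2 covariance matrix of the growth factors, and the error variances from the three observed columns.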
  And notice what a great improvement in the fit indices we have, right. The CFI is nearly perfect and the RMSEA is definitely less than .1, so this is a very good fitting model and we can now
  look at the parameter estimates to try and understand what are the trajectories of anxiety.
  The first thing we can see is the means of the intercept and the slope.
  They are statistically significant, and they tell us the overall trajectory in the data. So on average, individuals in March started with an intercept of about 67 units,
  and over time on average, they're decreasing by about five and a half units every month. Because of the way that the slope factor loadings are coded, we know that this estimate represents the amount of change from one month to the next.
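  Just to make the time coding concrete: using the rounded estimates quoted above (an intercept mean of about 67 and a slope mean of about −5.5) and the 0/1/3 loadings, the model-implied average anxiety scores are roughly

\[ \hat{\mu}_{\text{March}} \approx 67, \qquad \hat{\mu}_{\text{April}} \approx 67 - 5.5(1) = 61.5, \qquad \hat{\mu}_{\text{June}} \approx 67 - 5.5(3) = 50.5 \]

  so on the 0-to-100 scale the average respondent drops about 16.5 points between March and June. (These are back-of-the-envelope numbers from the rounded estimates, not the exact platform output.)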
  Some of the very interesting estimates in this model are the variability of the intercept and the slope.
  And notice they're also substantial in this model, which basically means that, yeah, we have that average trajectory, but not everybody follows that trajectory.
  That means that some individuals can be increasing, while others are decreasing and others might be staying flat. And so a natural question at this point can be,
  you know, what are the factors that help us distinguish between those different patterns of change? And that is a question that can be really
  easy to tackle in this framework and we're going to do that by bringing in factors that predict intercept and slope.
  So on the red triangle menu, I can click on add manifest variables, and let's take a look at resilience as a predictor.
  So I'm going to click OK, and by default, resilience has a variance and a mean, and that's okay, because I want to acknowledge it has a nonzero mean and variance,
  but I want it to be a predictor, so I'm going to select it in the From list, and I'm going to select intercept and slope in the To list.
  And we're going to add a one-headed arrow to link them together and have the regression estimates, so we can understand whether resilience explains differences in how people are changing.
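  Adding that predictor turns the growth-factor equations into regressions (my notation, with \( R_i \) standing for the resilience score):

\[ I_i = \alpha_I + \gamma_I R_i + u_{Ii}, \qquad S_i = \alpha_S + \gamma_S R_i + u_{Si} \]

  so \( \gamma_I \) asks whether resilience shifts where people start, \( \gamma_S \) asks whether it shifts how fast they change, and the u terms are now the residual (unexplained) intercept and slope variability.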
  And so I'm just going to click run here, and we see that this is, in fact, a very good fitting model.
  And it has some really interesting results, because it shows that the estimate of
  resilience predicting the intercept, that initial value of anxiety is, in fact, statistically significant and negative. And it can be interpreted as any
  standard regression coefficient, meaning that, for every unit increase in resilience, this is how much we should expect the intercept in anxiety to change, right. So the more resilient you were in March,
  the more likely you are to have a lower score for your intercept in anxiety in March. So that's really interesting, but then again, resilience in this model does not seem to have an effect on how you're changing over time.
  Okay, well, that's really interesting, but I really want to get to the idea of fitting multivariate models in SEM, so let's go back to the data.
  And I've already specified ahead of time...I saved a script that fits, again, just a linear univariate model of health complaints over time.
  So we have an intercept and we have a slope and I fit this model, you can see it fits very well as well, and so we can look individually at both
  anxiety and health complaints over time. And that is oftentimes a good way to start, to look at the univariate models first.
  And so here health complaints, as a reminder, could range from zero to 28, and we can see that the trajectory, according to the means here, the average trajectory,
  is described by an overall intercept of about four, and it increases over time by about .3 units.
  And in this case, there seems to be significant variability in the intercept but not for the slope, so people are generally changing in the same way. Overall, individuals seem to be increasing by .3 units every month in their health complaints.
  Okay, so now let's use this red triangle menu, and once again we're going to click add manifest variables, but what we're going to add are all three repeated measures for anxiety.
  So I'm going to click OK, and as a default, we're going to get the means and variances of anxiety, but I don't want the means of anxiety to be freely estimated.
  What I really want is for the means to be structured through the intercepts and slope factors. So I have to select those edges, and I'm going to remove them so that
  instead, what I'm going to start building interactively here is a linear growth curve that looks just like this one, but for anxiety.
  So I'm going to start by selecting all the three measures here, and I'm going to name this latent variable intercept of anxiety. I'm going to click plus.
  And now there's the intercept factor but notice as a default, we will fix the first loading to one for any latent variable.
  But because we want this to take on the meaning of an intercept, we actually want to fix these two loadings to one. I'm going to click here,
  fix those to one, and now we have to add the slope. So I select all three of them, and I'm going to say slope of anxiety. I'm going to click plus.
  Now that slope is over here. Again as a default, we fix this first loading to one, but I know that I want to code this in a way that that first
  factor loading is zero, so I'm simply going to select that factor loading and I'm going to click delete to get rid of it, because that's the same as fixing it to zero, and then I'm going to fix this loading to one.
  And that last loading needs to be three, in order to have that linear growth.
  Now we're almost done. Remember that the most interesting question that we'll be able to answer in this bivariate model is
  to look at the association of growth factors across processes. So we're going to select all of these nodes in the From and To lists, and we're going to link them with double-headed arrows. Those are going to represent the
  covariances across all of these factors, and the last thing we need is to add
  the means of intercept and slope for anxiety. So we're going to click over here, and that's it. We're ready to fit our bivariate model. I'm going to click run.
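  Before looking at the output, it helps to see what this bivariate model adds: beyond the two average trajectories, it estimates the full covariance matrix among the four growth factors (my notation, with HC for health complaints and Anx for anxiety):

\[ \Psi = \begin{pmatrix}
\sigma^2_{I_{HC}} & & & \\
\sigma_{S_{HC}, I_{HC}} & \sigma^2_{S_{HC}} & & \\
\sigma_{I_{Anx}, I_{HC}} & \sigma_{I_{Anx}, S_{HC}} & \sigma^2_{I_{Anx}} & \\
\sigma_{S_{Anx}, I_{HC}} & \sigma_{S_{Anx}, S_{HC}} & \sigma_{S_{Anx}, I_{Anx}} & \sigma^2_{S_{Anx}}
\end{pmatrix} \]

  The off-diagonal elements are exactly the cross-process associations interpreted next, for example the covariance between the health-complaints intercept and the anxiety slope.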
  And notice it runs very quickly. The model fits really, really well, and these mean estimates, once again, describe the trajectories for each of the two processes. I'm going to hide them, for now, so that we can interpret some of the other estimates with a little more ease.
  I think there's some really interesting findings here. You can see these values are in a covariance matrix, so
  we could actually change this to show the standardized estimates, just so that we can interpret these covariances in a correlation metric.
  But what's really interesting is to see that there are positive significant associations between
  the intercept, that is, the baseline starting values of individuals in their health complaints, and how they're changing in their anxiety over time.
  In other words, the higher your intercept is, your initial value of health complaints, the more likely you are to have higher rates of change in anxiety. And we also see that positive association between the baseline values in health complaints and anxiety.
  And there's another positive association here that's really interesting, because this is a positive association between rates of change.
  So the more you're changing in health complaints, the more likely you are to be changing in your anxiety. So if you're increasing in one, you're increasing in the other, so that's really insightful. Now again...
  we can still come back and add a little more complexity by trying to understand the different patterns of change in this model, so we can go to add manifest variables and look at how resilience
  impacts all of those growth factors. So I simply add it as a predictor here very quickly. The models do start to get a little cluttered, so we're going to have to move things around to make them look a little better, but this is ready to run.
  It runs very quickly. It fits really well and we could, you know, we could hide some of these edges, like we can hide the means and
  even the covariances for now, just so that it's easier to interpret these regression
  effects. And so you can see that resilience has a negative association with both
  health complaints and anxiety at the first occasion. In other words, the more resilient you are in March, the more likely you are to have lower values in the health complaints and in anxiety, so that's really cool.
  And we also see here that for the rates of change, in the case of anxiety, the rate of change is not significant, the prediction isn't,
  but it is significant -- this line really should be solid -- because you can see that there is a significant negative association between resilience and the rate of change in health complaints, such that the more resilient you are, the
  more likely you are to be decreasing in health complaints over time. That's really interesting, especially when you tie a
  well-being or mental health aspect, like resilience, into something more physical, right, like those health complaints.
  Alright, so we're running out of time, but the very last thing I want to show you here, just because I really want to show you the extent to which SEM is so flexible and can answer all sorts of interesting questions.
  I actually fit a model that is a bit more complex, where I'm looking at three different predictors of all of those growth factors.
  And I also brought in measures of loneliness and depression in June at the last occasion. And what I did here, again I left this with all the edges, just so that you could really see the full specification of the model.
  But I can hide some of the edges, just to make it easier to understand what's happening here. What I did is I added
  loneliness and depression, and I'm trying to understand how the patterns of growth are predicting those outcomes, alright. So here you see those regressions.
  And we're also adding some interesting predictors like the individual's age, the number of children in the household, in addition to resilience, as we saw before.
  And I could spend a long time just really unpacking all of the interesting results that are here.
  Without a doubt, you see, solid lines represent significant effects, so you can see that your patterns of growth in health complaints significantly predict depression
  at that last month in June. So that's, to me... I find that fascinating, and you can also see how resilience in this case has a
  number of different significant effects on how people are changing over time. Here is an interesting effect,
  where for every unit increase in resilience, we expect the rate of change in health complaints to decrease by .02 units, so it's a small effect but it's still a significant effect, so it's really interesting.
  And there's a number of things that you could explore just by looking at the output options.
  At the very bottom here, I included the R squares for all of our outcomes, and you can see we're not explaining that much variance in the intercept and slope factors here, so that means that there's still a lot more that we can learn by bringing additional predictors to this model.
  Okay, so let's go back to our slides, and
  I want to make sure that we summarize all the great things that we can achieve with these models.
  You can see that growth curve models allow us to understand the overall trajectory and individual trajectories of change over time.
  They allow us to identify key predictors that distinguish between different patterns of change in the data
  and allow us to examine effects that those growth factors have on outcomes. And when it comes to multivariate models, it's really nice to see how
  changes in one process can be associated with changes in a different process.
  Now it's important that we remember in our illustration that the data were observational, so we cannot make causal inferences. And also, we were using manifest variables for anxiety, but anxiety is an unobservable
  construct. So just be aware that if we had experimental data, we could make causal inferences, and we could have also specified latent variables for anxiety,
  such that we had more precision in our anxiety scores.
  Alright, so I think, even though we cannot make causal inferences, it's pretty fair to say that resilience appears to be a key ingredient for well-being, and so I want to make sure that this is the take home message
  today, because I think as the months continue to pass during this pandemic, we all need to find ways in which we can foster our resilience, so that we can, you know, deal with whatever comes as
  well as we can. And so with that, I want to make sure that you have some references in case you want to learn more about longitudinal modeling and I thank you for your time.
Comments

Hi @LauraCS, I'm a technical intern at JMP and am looking to explore SEM's applications in science and industry. I was wondering if you could make the .JMP data files for this discovery summit available so that I can explore the SEM functionality within JMP myself.

 

Thanks,

Jordan

LauraCS

Hi @jordanwalters,

 

Sounds like a fun project you're working on! :) Indeed, the capabilities of SEM with longitudinal data should prove valuable for lots of industry applications, and this methodology is widely used in science too. I'm attaching the data from the example here, although it's worth stating that the data and analyses I conducted in my presentation were for demonstration of the SEM functionality only. The data are more complex than they appear (for example, there's selection bias among those who continue participating in the study over time, there are non-Gaussian distributions, etc.) and you can learn more about them in these links:

https://www.sheffield.ac.uk/psychology-consortium-covid19

https://osf.io/v2zur/

 

HTH,

~Laura

Thanks for providing these @LauraCS