
Latent Class Analysis for identifying subclasses of Depression using JMP Pro 16 (2021-US-30MP-893)

Karishma Yadav, Student, Singapore Management University
SEET Fei Fei Sue-ann, Student, Singapore Management University
TAN Yi Ying, Student, Singapore Management University
Tin Seong, Associate Professor, School of Computing & Information Systems, Singapore Management University

 

According to WHO, “Depression is a leading cause of disability worldwide and is a major contributor to the overall global burden of disease”. A major stumbling block in the care of depressed patients remains the accurate diagnosis of the severity of depression. The Patient Health Questionnaire (PHQ-9), a 9-question instrument, is widely used for diagnosing and determining the severity of depression. However, the popularly used 5-Category classification of depression severity, based on the sum of responses to the 9 questions, is overly subjective. In view of this limitation, our paper aims to demonstrate how Latent Class Analysis in JMP Pro can be used to provide a data-driven and objective approach to determining depression severity classes. The study was conducted using the Mental Health-Depression Screener from the National Health and Nutrition Examination Survey (NHANES) 2017-2018, conducted by the Centers for Disease Control and Prevention, USA. The analysis results reveal that Latent Class Analysis improves our understanding of the characteristics of depression classes better than the conventional 5-Category method.

 

 

 

Auto-generated transcript...

 

Speaker

Transcript

Karishma YADAV Hello.
Yi Ying TAN Hello.
Karishma YADAV Sorry, we can't hear you, Larry.
Yi Ying TAN me, I mean sorry.
Karishma YADAV yeah I can hear you.
Larry LaRusso Now I would imagine this works.
Karishma YADAV yeah it works.
Larry LaRusso Okay, great.
Sorry, it's interesting.
For some reason my laptop mic doesn't pick me up outside of JMP, so
I keep having to jump back and forth, but I know that every time somebody says, I can't hear.
How are you guys doing?
Karishma YADAV we're doing good.
Larry LaRusso Is it Karishma? Is that how you say it?
Karishma YADAV Yeah, it's Karishma.
Larry LaRusso Karishma, nice.
I recognize that.
Well.
Larry LaRusso And then Yi Ying, is that how you say it?
Yi Ying TAN Yes, yes.
Larry LaRusso awesome well yeah thank you both for doing this.
We certainly appreciate you taking the time to record the talk and and really I'm I'm here to just sort of help with production aspect of it so.
If you notice in the upper left hand corner you'll probably see that we're automatically recording just want to make sure you're both okay with that yep.
Karishma YADAV yeah okay.
Larry LaRusso Excellent and so.
Our intention, obviously, is to make this part of the Discovery conference, and we're going to put it on our Community pages on community.jmp.com.
So it will actually live beyond the conference as well, and people will be able to access it. Do you guys feel comfortable with that as well?
Karishma YADAV yeah good.
Larry LaRusso awesome Thank you very much, those are the two legal questions.
And your backgrounds look excellent. I just want to make sure there's no logo infringement or anything, and everything looks great there. Have you recorded for us before, or is this the first time you're doing the Discovery Summit?
Karishma YADAV yeah, this is the first time.
Okay.
Larry LaRusso Well then, I'll let you know when it comes time for you guys to go. I'm probably going to mute my
headphones and then turn off my video. As long as you're talking and I'm not, I shouldn't pop up on the screen, but if I do that then there's no chance of me mistakenly showing up, so when it's time for us to go I'm going to mute and turn my video off.
It'll feel like you're just speaking to yourself, but I'm listening and just making sure everything's going okay, and that the video looks good and everything in the slides looks fine.
If, for some reason, and I've recorded more than a dozen of these, something glitches,
I'm just going to unmute and say, hey, I need you guys to stop, we're going to have to re-record or whatever, but I don't anticipate that happening, so just truck on and do the talk from beginning to end.
If either one of you just feels like, oh my gosh, I messed that up completely, that's not what I wanted to say, just stop and then say, hey, I'm going to start again, and do it again. Oh yeah.
You should do that.
Yi Ying TAN Okay.
Larry LaRusso But I would encourage you not to worry about it. The intention here is to be pretty casual with this, and it doesn't need to be perfect.
Like I said, the only thing I would worry about is if you were like, wait a second, I said something wrong, and you realize that. We have the ability to just say, I'm going to start again, and it's fine. But, like I said,
people watching us are going to be very, very forgiving. It's a Zoom call. I wouldn't worry about it unless it's something terrible.
Yi Ying TAN 30pm in Singapore, you know, like your plan to do it more than.
Once.
Larry LaRusso The other thing is.
If you would give me a verbal cue. I don't know which one of you is going to start first, but if you'll just say,
thank you for joining me, welcome to our talk, or something like that, because it sort of gives me a starting point to make sure that, okay, I know that she
wants to start it right here. And then at the end, if you'll say, thanks for joining me, if you have any questions, please
feel free to type them into the chat, or something like that, because there'll be a page that's built that has comments at the bottom.
Or if you just want to end it with thanks for joining me, or the contact information, just something to give me an audio starting point and ending point so that we can chop that video and make sure it looks.
And then the last thing I'll say and then really I'll let you guys do the presentation.
is that we're recording three different versions of this.
One with just the slides, so the slides take the entirety of the screen; another one with you speaking, and this is probably how you're seeing me right now, where I'm just taking a little corner and the slide takes up 80% of the screen.
And then the third one is
you full screen, as the speaker only. We're doing that because most of the time we'll have your slide and you.
But if you hang on a slide for more than a couple of minutes, just for variety, we might flip in just your picture, you know, just have you talk for 20 seconds and then go back.
So we have three different versions and we've got some really good
producers that will go in and sort of make some creative decisions about how to show it, so just know that it might jump back and forth, almost like a TV show: sometimes you'll be talking, sometimes it'll be the slide, and most of the time both.
Karishma YADAV Okay sounds good yeah.
Larry LaRusso Okay, do you have any other questions.
Yi Ying TAN yeah.
Karishma YADAV So if we have to do a retake
by chance, then will it resume, or are we going to have to start the entire thing from the beginning?
Larry LaRusso So, you know, we might have a 40-minute recording if you only have a 20-minute talk, and it's fine. I mean, you know how easy it is to just chop off the first however many minutes. And again,
don't be too hard on yourself, but if you do feel that, no, I really do want to restart, just say, Larry, I'm going to start again, and then say,
welcome to my talk, you know, and then we'll know that first version, we'll get rid of it. So yeah, we're not going to have to restart. We're just going to work until we hang up, and
we'll do the chopping.
Yi Ying TAN All right.
Larry LaRusso Thank you. I don't have anything else.
Your sound is great, the backgrounds look awesome.
I think we're good to go, so I'm going to turn this over to you. Again, I'll be listening the whole time, so if you have any questions or you need to stop or something, just let me know. Otherwise, enjoy the talk.
I'll talk to you guys after you're done.
Yi Ying TAN Thank you.
So we can start anytime.
Larry LaRusso Right yeah just start, if you want to take a second get yourself ready, it looks like your slides are already up so just whenever you're ready say thanks for joining me and then just you're off.
Yi Ying TAN and
Larry LaRusso I'm going to mute myself and turn my video off.
Thank you.
Yi Ying TAN Good morning. First of all I'd like to thank the committee for shortlisting our project, it is a great honor to participate in JMP Discovery Summit.
Before I start the presentation, let me introduce our team. We are from SMU's Master of IT in Business program. I am Yi, and we have Karishma and Sue-ann with me today.
Tin Seong Kam is our professor. He is not joining us today because he has other commitments. So as you can see from this cover page,
in this project we study the use of latent class analysis for identifying subclasses of depression using JMP Pro. During the presentation we may use the short form of LCA to refer to latent class analysis.
Okay.
To give you more background on our project, we're looking at depression, a mental illness that is getting more and more common amongst teens and adults of all ages. According to WHO,
the World Health Organization, depression is a leading cause of disability worldwide and a major contributor to the overall global burden of disease.
The ability to get an accurate diagnosis of the severity of depression has always been the key stumbling block in the care of depressed patients.
There is a questionnaire that is commonly used to help with the diagnosis of depression and to determine its severity.
It's called the PHQ-9, the Patient Health Questionnaire.
It is a nine-question instrument
that contains these questions. The first one would be, do you have little interest in doing things? The second, do you feel down, depressed, or hopeless?
The third, do you have trouble sleeping, or are you sleeping too much? The fourth, do you feel tired or have little energy?
The fifth, do you have poor appetite, or are you overeating? The next few questions are, do you feel bad about yourself?
Do you have trouble concentrating on things? The eighth question would be, do you move or speak slowly, or do you move or speak too fast? The last question that we have included in our project is,
do you have suicidal thoughts?
Consider four respondents, A, B, C, and D. On the right there's a table, and the way that you read this table is that A and B have the same score of 10,
and their severity is considered moderate. C and D, on the other hand, have the same score of 20 and are considered severely depressed under PHQ-9.
The interviewer asks them how frequently they experienced each symptom, on a scale of zero to three. As you can see, the responses in this table are just 0, 1, 2, and 3.
Zero means they're not experiencing the symptom at all and three is the most frequent, nearly every day. The overall score is derived by summing up the responses to all nine questions, and it ranges from zero to 27.
Now, if we dig deeper into the results of C and D here,
you can see that they both have a score of 20, but the symptoms that they're experiencing are quite different. For example, C often feels bad about himself,
but D never felt that.
And C rarely has any suicidal thoughts, but D is experiencing them nearly every day. The question we have here is,
do these symptoms have the same contribution to the diagnosis of depression? From a clinical standpoint, PHQ-9 has been assessed to be a valid instrument for measuring the severity of depression. However, our team is of the opinion that the popularly used
five categories of depression severity, based on the sum of responses,
are overly subjective.
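The scoring described above can be sketched in a few lines of Python. This is only an illustration: the cut points are the commonly published PHQ-9 bands, which the talk does not list, and the two response profiles are hypothetical, not the values shown on the slide.

```python
# Commonly published PHQ-9 cut points (an assumption; the talk does not list them).
SEVERITY_BANDS = [
    (0, 4, "none"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 27, "severe"),
]


def phq9_score(responses):
    """Sum the nine item responses, each coded 0-3."""
    if len(responses) != 9 or any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("expected nine responses coded 0-3")
    return sum(responses)


def phq9_category(score):
    """Map a total score (0-27) to one of the five severity labels."""
    for low, high, label in SEVERITY_BANDS:
        if low <= score <= high:
            return label
    raise ValueError("score must be between 0 and 27")


# Hypothetical respondents: same total, different symptom profiles.
# Item order: interest, feeling down, sleep, energy, appetite,
# feeling bad about self, concentration, movement/speech, suicidal thoughts.
respondent_c = [3, 3, 3, 3, 2, 3, 2, 1, 0]
respondent_d = [3, 3, 3, 3, 2, 0, 2, 1, 3]

for name, answers in (("C", respondent_c), ("D", respondent_d)):
    total = phq9_score(answers)
    print(name, total, phq9_category(total))
```

Both hypothetical respondents land in the same category even though their answers to the last items differ sharply, which is the limitation the speakers point to.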
Hence the purpose of our project is to demonstrate how latent class analysis in JMP can be used to provide a data-driven and
objective approach to determining depression severity classes. The analysis results reveal that LCA improves our understanding of the characteristics of depression classes better than the conventional five-category method.
Using the same set of nine questions, taken from the questionnaire data set of the National Health and Nutrition Examination Survey from 2017 to 2018, we use LCA to classify 5,000 respondents
into depression severity classes. We then compare that to the PHQ-9 results, which use mere summation of the
response variables to arrive at the severity levels of depression. This comparison will reveal the differences in the depression severity classes between the two methods.
Yeah, let me go into more details in the next slide.
This is how we prepared the data. We first downloaded the publicly available data set from the National Health and Nutrition Examination Survey for the years I mentioned just now, 2017 to 2018.
We use only these years' data to maintain data quality, because we were slightly worried that there would be a difference in the way respondents give an answer over different time periods. We then tidied up the data in JMP Pro.
We use JMP Pro 16. In the data set, there's a total of 10 columns originally ??? responses.
But we noticed that the final question is like a consolidation question. It asks, how difficult have these problems made it for you to do your work, take care of things at home, or get along with people? So because of the nature of
it being a consolidation question, we have excluded it from our work.
After that we conducted a few data cleaning and preparation steps, such as checking for missing values and renaming some of the columns to reflect the variables better for latent class analysis.
The PHQ-9 score was also tabulated in a new column and recoded into five categories of depression severity, namely none, mild, moderate, moderately severe, and severe. With that, finally we ran the latent class analysis and interpreted the results.
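The preparation steps above were done in JMP Pro; the following Python sketch only mirrors their logic. The column names (`DPQ010` to `DPQ100`) are assumed to follow the NHANES depression screener naming, the rule that any code other than 0 to 3 counts as missing is an assumption, and dropping incomplete respondents is one possible choice rather than necessarily what the team did.

```python
# Assumed NHANES-style names: DPQ010..DPQ090 are the nine PHQ-9 items,
# DPQ100 is the follow-up "how difficult" question that the team excluded.
ITEM_COLUMNS = [f"DPQ{n:03d}" for n in range(10, 100, 10)]
VALID_CODES = {0, 1, 2, 3}  # assumption: any other code is treated as missing


def prepare(rows):
    """Keep the nine items, drop incomplete respondents, add the total score."""
    cleaned = []
    for row in rows:
        items = [row.get(col) for col in ITEM_COLUMNS]
        if all(value in VALID_CODES for value in items):
            record = dict(zip(ITEM_COLUMNS, items))
            record["PHQ9_SCORE"] = sum(items)
            cleaned.append(record)
    return cleaned


# Hypothetical rows for illustration only.
rows = [
    {**{col: 1 for col in ITEM_COLUMNS}, "DPQ100": 0},
    {**{col: 0 for col in ITEM_COLUMNS}, "DPQ090": None, "DPQ100": 0},
    {**{col: 2 for col in ITEM_COLUMNS}, "DPQ010": 9, "DPQ100": 1},
]
prepared = prepare(rows)
print(len(prepared), prepared[0]["PHQ9_SCORE"])
```

Only the first hypothetical row survives, because the other two each contain a value outside 0 to 3.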
So I will run you through the detailed steps of the analysis that we performed. First of all, we used JMP Pro 16 to run the LCA to group the interviewed individuals with distinct patterns into clusters.
The way that we chose the clustering model is based on the models' BIC and AIC values.
From the model reports, we evaluated the effect sizes of the respective variables to identify the separation between latent classes,
because that indicates how accurately individuals are categorized into the correct latent class. We also looked at LogWorth values to determine the influence of each variable on the model.
Next, we carried out a contingency table analysis to compare the LCA classes to the five categories of depression severity derived using the traditional PHQ-9 method.
Our hypothesis is that the popularly used PHQ-9 classification, based on the sum of responses I mentioned, is overly subjective, as it does not consider the statistical weightage of individual variables. In comparison, the LCA results are derived using
statistical methods, which allows us to examine the validity of the results better.
So with this, I'll be handing over to Karishma to tell us more about the findings and conclusions.
Over to you, Karishma.
You are mute, Karishma.
Karishma YADAV Oh sorry.
Thank you, Yi. Hi everyone. I'm Karishma Yadav and I will be taking you forward with our presentation.
So we know that it is important for researchers in the field of mental health to be able to identify subgroups of individuals
within the general population who have similar patterns of mental health symptoms or strengths, so these groups can be studied further to investigate matters such as which mental health classes are most prevalent,
what future outcomes they predict, and whether mental health classes change over time. In addition, identifying these subgroups is important for practitioners
in the field to treat and prevent mental health cases. An increasingly popular method for identifying subgroups is latent class analysis, which will be abbreviated as LCA
from now on in the presentation. LCA is a cross-sectional latent variable mixture modeling approach. Like all latent variable mixture modeling approaches, LCA aims to find heterogeneity within the population.
It does this by analyzing individuals' patterns of behavior, such as mental health indicators, and finding common types, called classes. Each individual is probabilistically assigned to a class,
with a probability that is calculated using the formula that is displayed. The results are subgroups of individuals who are most similar to each other and most distinct from those in other classes. Next slide.
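The formula on the slide is not captured in the transcript. As a hedged illustration, the textbook way to assign a respondent to a class is Bayes' rule under the assumption that items are independent within a class; the sketch below uses hypothetical class shares and response probabilities, and is not taken from the JMP report.

```python
def class_posteriors(responses, class_weights, item_probs):
    """Return P(class | responses) for one respondent.

    class_weights[c]    : share of the population in class c
    item_probs[c][j][r] : P(item j takes response r | class c)
    """
    joint = []
    for weight, probs in zip(class_weights, item_probs):
        likelihood = weight
        for j, r in enumerate(responses):
            likelihood *= probs[j][r]  # items assumed independent within a class
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]


# Hypothetical two-class, two-item model with responses coded 0-3.
class_weights = [0.7, 0.3]
item_probs = [
    [[0.8, 0.1, 0.05, 0.05], [0.7, 0.2, 0.05, 0.05]],  # low-symptom class
    [[0.1, 0.2, 0.3, 0.4], [0.1, 0.3, 0.3, 0.3]],      # high-symptom class
]
posterior = class_posteriors([3, 2], class_weights, item_probs)
print(posterior)
```

A respondent answering "nearly every day" and "more than half the days" is assigned mostly to the high-symptom class, even though that class is the smaller one.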
So this is how we built the LCA model in JMP Pro 16. As mentioned earlier by Yi, in the preprocessing steps we created
the PHQ-9 score variable by summing the nine input variables, the questions that are asked in the PHQ-9 questionnaire, and then the PHQ-9 score, which ranges from zero to 27, is used to arrive at five classes.
The classes are based on cut points, which are widely used by clinicians in medical diagnosis. The classes, as mentioned by Yi earlier, are
none, mild, moderate, moderately severe, and severe. After preprocessing, we built a model in JMP Pro by selecting LCA as the clustering analysis method, and then chose the nine questions of the PHQ-9
as the input features shown here. We ran the LCA model for numbers of clusters ranging from three to 15 to find the best-fit model.
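The team fitted the model with the JMP Pro platform. For readers who want to see what fitting a latent class model involves, here is a minimal expectation-maximization sketch in plain Python. It is a generic textbook-style procedure with a small smoothing constant, not a description of JMP's implementation, and the data are hypothetical.

```python
import math
import random


def fit_lca(data, n_classes, n_levels, n_iter=50, seed=0):
    """Fit a latent class model by EM; returns (weights, item_probs, history)."""
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random starting response probabilities, normalised per class and item.
    item_probs = []
    for _ in range(n_classes):
        per_class = []
        for _ in range(n_items):
            raw = [rng.random() + 0.1 for _ in range(n_levels)]
            per_class.append([v / sum(raw) for v in raw])
        item_probs.append(per_class)

    history = []  # log-likelihood at the start of each iteration
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every respondent.
        posteriors, log_lik = [], 0.0
        for row in data:
            joint = [
                weights[c] * math.prod(item_probs[c][j][r] for j, r in enumerate(row))
                for c in range(n_classes)
            ]
            total = sum(joint)
            log_lik += math.log(total)
            posteriors.append([v / total for v in joint])
        history.append(log_lik)

        # M-step: update class shares and response probabilities.
        for c in range(n_classes):
            mass = sum(p[c] for p in posteriors)
            weights[c] = mass / len(data)
            for j in range(n_items):
                for r in range(n_levels):
                    hit = sum(p[c] for p, row in zip(posteriors, data) if row[j] == r)
                    # Tiny smoothing constant keeps probabilities away from zero.
                    item_probs[c][j][r] = (hit + 1e-6) / (mass + 1e-6 * n_levels)
    return weights, item_probs, history


# Hypothetical responses to three items coded 0-3.
data = [[0, 0, 1]] * 30 + [[3, 2, 3]] * 20 + [[0, 1, 0]] * 10 + [[2, 3, 3]] * 10
weights, item_probs, history = fit_lca(data, n_classes=2, n_levels=4)
print([round(w, 3) for w in weights], round(history[-1], 2))
```

In practice the fit would be repeated for each candidate number of classes (three to 15 in the talk) and from several random starts, since EM can stop at a local optimum.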
Next slide.
So this is the result of the LCA. The latent class models are estimated iteratively, adding potential classes to determine which model is the best fit to the data. Judging the best fit can take a lot of consideration,
including information criteria. So here we have used information criteria.
As we can see, the five-cluster model gives us the lowest BIC, the Bayesian information criterion. We also tried to run the model over an increased number of clusters, but the AIC kept on shifting with the increasing number of classes, so we used BIC to choose the optimum number of clusters.
So therefore, the individuals were divided into five classes, based on the questionnaire responses. Next slide.
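The model comparison can be illustrated with the standard definitions of AIC and BIC. The log-likelihood values below are hypothetical placeholders, not the figures from the JMP report, and the parameter count assumes nine items with four response levels each.

```python
import math


def lca_free_parameters(n_classes, n_items, n_levels):
    """(classes - 1) class shares plus (levels - 1) probabilities per item per class."""
    return (n_classes - 1) + n_classes * n_items * (n_levels - 1)


def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical log-likelihoods for illustration; not the values from the talk.
candidate_log_likelihoods = {3: -41000.0, 4: -40500.0, 5: -40100.0, 6: -40050.0}
n_obs = 5000
scores = {
    k: bic(ll, lca_free_parameters(k, n_items=9, n_levels=4), n_obs)
    for k, ll in candidate_log_likelihoods.items()
}
best_k = min(scores, key=scores.get)
print(best_k, round(scores[best_k], 1))
```

BIC penalizes each extra parameter by the log of the sample size, so with thousands of respondents it tends to favor fewer classes than AIC does, which is one common reason to prefer it when AIC keeps changing as classes are added.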
So these are the charts that show the distribution of individuals in each of the classes and the proportion of responses for each variable.
They also show the response probabilities of each of the nine variables in each of the classes. It can be seen that in cluster one,
all the variables have not at all as the predominant response, and this is the lowest-risk
cluster. In cluster two, there is a decreased proportion of not at all as the response and
an increased proportion of several days as the response. Cluster three has an increased proportion of several days and more than half the days as the response, and clusters four and five have further increased proportions of these high-frequency responses. Next.
The five latent classes can be described as none, mild, moderate, moderately severe, and severe.
So, as we can see, class one has a size of about 48%, and individuals in this class do not experience any depressive symptoms at all, that is, the response is not at all for all nine variables. This group of individuals has no risk of experiencing depression.
Class two has mild symptoms, with a size of about 24%.
The individuals in this class have two predominant features listed here, which are trouble sleeping or sleeping too much
several days a week, and feeling tired or having low energy several days a week. Individuals in class three
have moderate symptoms and a class size of about 11%; 70% of them responded that they feel tired or have low energy several days a week. They also experience increased severity of the symptoms listed in class two.
Additionally, they experience other symptoms, like having no interest in doing things and feeling down, depressed, and hopeless.
Class four has a size of 10.44%, and this group of individuals has even higher severity of depressive symptoms; they experience about seven out of nine of the depressive symptoms
quite frequently. Then we have class five, with a class size of about 6%. They experience all nine symptoms and the severity is
very high. They experience them nearly every day, or more than half the days. Next slide.
So these are the effect sizes of the nine variables and the LogWorth. The effect sizes are, as you can see, values between zero and one,
and higher effect sizes indicate a higher likelihood of individuals being classified accurately in a latent class.
The majority of the input variables have effect sizes higher than 0.75. LogWorth, on the other hand, measures the effectiveness of the variables in predicting the clusters.
The larger the LogWorth, the stronger the variable is in the model. The results indicate that not all variables share the same LogWorth values.
This confirms our suspicion that the variables might not contribute equally to the diagnosis. Summation of PHQ-9 scores would only be accurate and reliable if each variable had the same level of influence in the model, and that assumption is negated. So next.
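LogWorth in JMP reports is the negative base-10 logarithm of a p-value, so larger values mean stronger evidence. A small sketch with hypothetical p-values (the item names and numbers are placeholders, not results from the talk) shows how variables could be ranked by it.

```python
import math


def logworth(p_value):
    """LogWorth: -log10(p-value). A p-value of 0.01 gives a LogWorth of 2."""
    return -math.log10(p_value)


# Hypothetical p-values for three items; not taken from the talk.
p_values = {
    "little interest": 1e-120,
    "trouble sleeping": 1e-80,
    "suicidal thoughts": 1e-30,
}
ranked = sorted(p_values, key=lambda name: logworth(p_values[name]), reverse=True)
for name in ranked:
    print(name, round(logworth(p_values[name]), 1))
```

If every item contributed equally, the LogWorth values would be about the same; a wide spread is what motivates the speakers' doubt about simple summation.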
So finally,
this is the contingency table. As we can see, the
depression severity categories derived from the
PHQ-9 summation and the classes that we derived using
latent class analysis are quite different.
The p-value is about
.001,
which indicates that this result is unlikely to be due to sampling error alone: under the null hypothesis, a result at least this extreme would be expected in only about one out of 1,000 samples.
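A contingency table analysis of this kind is typically based on Pearson's chi-square statistic. The sketch below computes that statistic, its degrees of freedom, and Cramér's V for a hypothetical 3-by-3 table; the counts are invented for illustration, and the code does not reproduce the JMP output or its p-value.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def cramers_v(table):
    """Effect size in [0, 1]: 0 means no association, 1 a perfect one."""
    statistic, _ = chi_square(table)
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return (statistic / (n * (k - 1))) ** 0.5


# Hypothetical counts: rows are LCA classes, columns are PHQ-9 categories.
table = [
    [40, 10, 0],
    [12, 30, 8],
    [1, 9, 25],
]
statistic, dof = chi_square(table)
print(round(statistic, 2), dof, round(cramers_v(table), 3))
```

The off-diagonal cells are where the two classifications disagree, so inspecting them directly complements the overall test.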
Next.
So, finally, we conclude that depression severity levels derived using statistical clustering models like LCA provide a more accurate assessment of depression severity
compared to those derived using mere summation of the PHQ-9 responses.
Further studies need to be done to determine the weightage of the respective variables, to strengthen the PHQ-9 model so that it proves to be more reliable and efficient. This is important
because underestimating the depression severity of individuals and the percentage of the population with depression
poses significant risks to society. Without an accurate diagnosis method, it becomes difficult to determine the adequate resources required to prevent and treat depression.
So that's all for our presentation, and questions are welcome in the comment box that you can see below the video. Thank you.
Larry LaRusso Excellent.
awesome Thank you very much.
I'm very happy with it, I mean I thought it was it sounded great your slides are awesome.
Karishma YADAV Thank you.
Larry LaRusso Did everything go smoothly according to you? I didn't see any problems, and actually it was a very interesting talk.
I'm not quite as technical as you guys, as far as statistics go, but
I'm sure it'll be popular.
Karishma YADAV Thank you.
yeah.
Larry LaRusso Well, get some sleep.
Karishma YADAV yeah.
Larry LaRusso it's approaching 11 now but um.
Thank you very much, and again we'll let you know if there's any issues or whatever, and if you would you guys.
I get a link that I can share with external folks I'll send it to you all, if you just want to look at it and see what it looks like but it's just going to be the raw P so it's going to have all our front end stuff, and all this back end stuff but.
If you're just curious I'll go ahead and send that to you just for your interest and then just know that we will produce it, you know and do some will make it, you know interactive and look nice before because.
Okay, all right.
Yeah, have a great day, evening, whatever the case might be, and thanks again for joining me.
Karishma YADAV Thank you.
Larry LaRusso Okay, good to meet you all take care.
Karishma YADAV Thank you, same here good night and good.
morning to you.
Larry LaRusso bye bye.
Bye.