Hi, my name is Brittany Burlison, and my copresenter is...
I'm Kailey Wilson.
We are both second year master's students at Oklahoma State University,
getting a master's in business analytics and data science.
Today, we are going to present our research on which factors are most important
in determining heart disease and stroke.
We will go over our research overview,
our data overview, the methods that we used, our data analysis,
our results and implications, and what we did in JMP.
Heart disease and strokes are two major diseases
that have been around for years, and there's still no cure for them.
Heart disease is a leading cause of death in the United States.
A person dies every 30 seconds from heart disease.
One in six of these deaths is due to a stroke,
and strokes are the leading cause of long-term disability.
For our research, we are looking to see if these two major diseases
share any common factors that can predict both of them.
We are interested in seeing what factors are most important in determining
whether a person will suffer from stroke or heart disease in their lifetime.
We want to take variables
that relate to the social determinants of health and see which ones play
a bigger role in determining these major health issues.
The data we will be using for our analysis comes
from the Behavioral Risk Factor Surveillance System, or BRFSS for short,
from the CDC.
This is a phone survey that collects data from citizens
on a wide range of topics.
We will be using data from 2016 to 2020.
This contains over 500 fields and over 2 million observations.
Some of the fields contain information about households,
current health conditions, behaviors and demographics.
Additionally, some states have the option to ask more specific health questions,
and those are included too.
We will be looking at the variables that come from the questions people are asked.
Now for the methods and plans that we are going to use.
Our data set contains over 500 variables, as we mentioned,
so we have narrowed that list down to 11 that we have deemed
the most important in determining heart disease or stroke.
We have referenced the social determinants of health
to help us make this decision on which variables we should keep.
And we have determined a few that we'll go over in the next slide.
We're using JMP, specifically the Fit Model platform
and Graph Builder.
The factors that we are considering are a person's sex, age, and race.
For our variable selection,
we have determined that income, housing, education, mental health, health coverage,
overall general health, smoking status, diabetes status, divorce, and medical costs
were the most important variables to look at.
We will be using stroke and heart disease as our response variables.
We will look at these variables by gender using the sex variable.
Then we will concatenate all five years of our data in JMP
and run Fit Model to determine which of our preselected variables
are the most important in determining heart disease and stroke.
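The concatenation and split-by-sex steps can be sketched outside JMP as well. The Python below is only an illustration under stated assumptions: the function names and column names (`sex`, `heart_disease`, `year`) are hypothetical, the rows are made up rather than taken from BRFSS, and the coding of 2 for female is assumed (the talk only states that 1 means male).

```python
from collections import defaultdict


def concatenate_years(tables_by_year):
    """Stack per-year lists of row dicts into one list, tagging each row with its year."""
    combined = []
    for year in sorted(tables_by_year):
        for row in tables_by_year[year]:
            combined.append({**row, "year": year})
    return combined


def split_by_sex(rows):
    """Group rows by the value of their 'sex' field so each group can be modeled separately."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["sex"]].append(row)
    return dict(groups)


# Made-up rows for illustration only; these are not BRFSS records.
# Assumed coding: sex 1 = male, 2 = female; heart_disease 1 = yes, 2 = no.
tables_by_year = {
    2016: [{"sex": 1, "heart_disease": 2}, {"sex": 2, "heart_disease": 1}],
    2017: [{"sex": 1, "heart_disease": 1}],
}

rows = concatenate_years(tables_by_year)
by_sex = split_by_sex(rows)
print(len(rows), sorted(by_sex))
```

Each group in `by_sex` would then be passed to whatever model-fitting step is used; in the talk that step is Fit Model in JMP.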
Kailey will go over our data analysis and what we have found.
Thank you, Brittany.
The first response variable that we looked at is heart disease.
When sex is 1, that means the respondent is male.
As we can see in our output,
the most important variables, based on their LogWorth,
were general health, diabetes, and whether they were a smoker.
The RSquare is pretty low, which means that only about 8% of the variation in heart disease
is explained by these variables, but since the p-value is very small,
the variables that we have selected are still very significant.
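For reference, the LogWorth that JMP reports is -log10 of the p-value, so a larger LogWorth means a smaller p-value. A minimal Python sketch of that relationship follows; the function name `logworth` is ours for illustration, not a JMP API, and the p-values are arbitrary examples rather than values from our output.

```python
import math


def logworth(p_value):
    """Return -log10(p_value), the transformation JMP calls LogWorth."""
    return -math.log10(p_value)


# Arbitrary example p-values, not results from the BRFSS models.
for p in (0.05, 0.01, 1e-10):
    print(p, round(logworth(p), 2))
```

Note that significance and RSquare measure different things. With more than 2 million observations, a variable can have a very small p-value even when the model explains only a small share of the variation.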
We can see the same when the respondent is female.
Similarly, the most important variables are general health, diabetes,
and whether they smoke or not.
We can come to a similar conclusion:
the RSquare is very low,
which makes sense since there are over 500 variables in the data set.
But the variables that were selected are still very significant.
Next, we wanted to look at what heart disease looked like
based on general health.
General health was a variable that was split into nine buckets,
one being excellent health and nine being very poor.
When heart disease is one, that means that they had heart disease,
and when it's two, that means they did not have heart disease.
As we can see, when general health is two or three,
which means very good general health or good general health,
those two groups had the highest number of heart disease cases.
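When reading a chart like this, it helps to keep in mind that the number of cases in a category is not the same as the rate within that category, since a category with many respondents can have many cases even at a low rate. The Python sketch below illustrates the difference. The numbers are hypothetical and chosen only to show the effect; they are not taken from BRFSS or from our graphs, and the function name is ours.

```python
def rate_by_category(counts):
    """Map each category to the share of its respondents who have the condition.

    `counts` maps category -> (cases, respondents).
    """
    return {cat: cases / total for cat, (cases, total) in counts.items()}


# Hypothetical counts for illustration only; not BRFSS data.
counts = {
    "very good": (300, 10000),
    "good": (500, 10000),
    "very poor": (200, 1000),
}

rates = rate_by_category(counts)
most_cases = max(counts, key=lambda cat: counts[cat][0])
highest_rate = max(rates, key=rates.get)
print("most cases:", most_cases)
print("highest rate:", highest_rate)
```

In this made-up example the "good" group has the most cases while the "very poor" group has the highest rate, so the two questions can have different answers.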
Next, we wanted to look at stroke.
For stroke, for females,
the most important variables, out of the variables we selected,
were diabetes, general health, and then education.
Similarly, we have a very low RSquare,
but our p-value is very small,
which means that all of these variables are still very significant.
Similarly, for males,
the most important variables are diabetes and general health.
The RSquare for this one is the smallest RSquare we have seen,
but we still have a p-value of less than...
a very small p-value, which means it is still very significant.
As we did for heart disease, we built a graph
based on general health
to look at where stroke fell in the general health response.
The general health categories it falls into are, again, two and three,
which means people with very good health
or good health account for the largest number of strokes.
Then we created our own variable:
when someone had both heart disease
and stroke, it would return a value of one,
and when they didn't, it would be zero.
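A derived variable like this can be written as a small function. The Python sketch below assumes the coding described earlier for heart disease (1 = yes, 2 = no) also applies to stroke, and it reads "heart disease and stroke" as meaning both conditions together; both points are assumptions, and the names are hypothetical.

```python
YES = 1  # assumed survey code for "has the condition"


def both_conditions(heart_disease, stroke):
    """Return 1 when both responses are coded YES, otherwise 0."""
    return 1 if heart_disease == YES and stroke == YES else 0


# Made-up response pairs (heart_disease, stroke) for illustration.
responses = [(1, 1), (1, 2), (2, 1), (2, 2)]
flags = [both_conditions(h, s) for h, s in responses]
print(flags)
```

Any other response code, such as a refusal, falls into the 0 group under this sketch; a real analysis might prefer to treat those as missing instead.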
So here we can see for heart disease and stroke, the most important variables
are general health, diabetes, income, if they smoke, and education.
This RSquare is our highest RSquare, which is really good.
This means that most of the data is represented in this
and our p-value is still very small,
which means that all of these are significant.
Then again, we made a graph to see where it fell in general health.
We can see that when someone has heart disease and stroke,
it falls in three, which is good general health.
Our conclusion is that the most important variables
in determining whether or not a person will have heart disease
are general health, diabetes, smoking,
and if their parents are divorced, and that was for males.
Then for females, it's general health, diabetes, smoking, and income.
Then looking at stroke.
For a female, it's diabetes, general health and education.
In males, it is diabetes, general health and health insurance.
Then for both of them combined,
the most important ones are general health, diabetes,
income, if they smoke, and their education.
So drawing to a close, our overall implications.
We would say, to help prevent heart disease,
people should improve their overall general health, monitor their diabetes,
decrease their nicotine use, etc.
Then to help prevent stroke,
people should improve their general health,
monitor their diabetes as well,
and think about improving their health care plan.
Then overall, people should just focus on
their general health to prevent heart disease and stroke and any other diseases.
We believe that if doctors and healthcare providers
take into consideration
that these are very important factors in determining whether a person
will suffer from heart disease or stroke in their lifetime,
they will be able to provide better health care options
to their patients.
Additionally, we feel that if the general public
takes these factors into consideration, it can help reduce the risk of stroke
or heart disease overall.
We thank you for listening to our research,
and if you have any questions, please let us know.
Thank you.