Solved: how do I adjust for covariates

Nov 17, 2019 05:57 AM

I have a data set with two cohorts. They differ in outcome (length of stay, a continuous variable) on univariate analysis. However, they are not similar in baseline characteristics: they differ in severity of illness (continuous), age (continuous), and diagnosis (categorical).

How do I adjust the outcome variable for age, disease severity, and diagnosis?

Mark_Bailey · Nov 21, 2019 01:28 PM

A proper model makes all the difference! The choice is at the discretion of the researcher but it should be guided by prior knowledge, theoretical foundations, objective comparison with alternative models, and evaluation of the model assumptions.

The parameter estimates are used in a linear combination to determine the distribution parameters. They can be difficult to interpret on their own. You might click the red triangle at the top and select Profilers > Profiler. You can change factor levels and see the change in the predicted response.
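To make the "linear combination" point concrete: with a log link, the linear combination of the estimates gives the log of the predicted mean, so exponentiating it gives the predicted count. Here is a minimal Python sketch of that arithmetic. The coefficient values and the names `coefs` and `predicted_mean` are hypothetical and are not taken from the analysis in this thread.

```python
import math

# Hypothetical estimates from a Poisson log-link model (NOT the thread's results).
coefs = {"intercept": 1.2, "severity": 0.05, "is_case": 0.30}

def predicted_mean(severity, is_case):
    """Predicted mean count: exp of the linear combination of the estimates."""
    eta = (coefs["intercept"]
           + coefs["severity"] * severity
           + coefs["is_case"] * is_case)
    return math.exp(eta)

# Same severity, different group: only the is_case term changes.
control = predicted_mean(severity=2.0, is_case=0)
case = predicted_mean(severity=2.0, is_case=1)

# With a log link, exp(coefficient) is a multiplicative effect (a rate ratio).
rate_ratio = math.exp(coefs["is_case"])
print(round(control, 2), round(case, 2), round(rate_ratio, 2))
```

This is roughly what the Profiler shows interactively: hold the other factors fixed, change one, and watch the predicted response move.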

I have another idea about the lack of fit. I do not see the test for over-dispersion, which is a common occurrence. The Poisson distribution has a single parameter, which is both the mean and the variance of the distribution. Real distributions of counts often exhibit a variance that is greater than the mean. When you select Generalized Linear Models, you should find check boxes for the over-dispersion tests and intervals and for the Firth bias-adjusted estimates. I recommend selecting both of these options.
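A quick way to see what over-dispersion means is to compare the variance of the counts with their mean. The Python sketch below does this for made-up length-of-stay values. It computes the Pearson chi-square divided by its degrees of freedom for an intercept-only Poisson model, which reduces to variance / mean. The data and the function name `dispersion_ratio` are assumptions for illustration, and this is not the test that JMP itself runs.

```python
from statistics import mean, variance

# Hypothetical length-of-stay counts in days (made-up data).
los_days = [1, 2, 2, 3, 3, 4, 5, 7, 12, 21]

def dispersion_ratio(counts):
    """Pearson chi-square / df for an intercept-only Poisson model."""
    m = mean(counts)
    pearson = sum((y - m) ** 2 / m for y in counts)
    return pearson / (len(counts) - 1)

ratio = dispersion_ratio(los_days)
print(f"mean={mean(los_days):.2f} variance={variance(los_days):.2f} ratio={ratio:.2f}")
```

A ratio near 1 is what a Poisson distribution predicts. A ratio well above 1 suggests over-dispersion, in which case the standard errors from a plain Poisson fit tend to be too small and the p-values too optimistic.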

Otherwise, you might consider adding terms for potential interaction and non-linear effects.

Mark_Bailey · Nov 17, 2019 07:46 AM

Include the covariates as additional predictors in the multiple regression model.

Select Analyze > Fit Model. Select the response data column and click Y. Select the predictor/factor and covariate data columns and click Add. Now click Run. Use the Effect Tests to determine the significance of each term. Use the Parameter Estimates to determine the importance of each term. Click the red triangle and select Factor Profiling > Profiler to understand the contribution of changes in each predictor and covariate to the response.

See Help > Books > Fitting Linear Models for more details and examples.
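For readers who want to see the idea outside JMP, here is a small Python sketch of covariate adjustment by multiple regression. The data are made up so that length of stay depends only on severity, and the cases happen to be sicker. The helpers `solve` and `ols` are illustrative names, and the code is a sketch of the principle, not JMP's implementation.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data: stay depends on severity only, and cases happen to be sicker.
severity = [1, 2, 3, 4, 3, 4, 5, 6]
is_case = [0, 0, 0, 0, 1, 1, 1, 1]
los = [2.0 + 0.5 * s for s in severity]

# Model 1: group only. Model 2: group plus the severity covariate.
unadjusted = ols([[1, g] for g in is_case], los)
adjusted = ols([[1, s, g] for s, g in zip(severity, is_case)], los)

print("unadjusted group effect:", round(unadjusted[1], 3))
print("adjusted group effect:", round(adjusted[2], 3))
```

In the group-only model, the group coefficient absorbs the severity difference between the cohorts. Once severity is in the model, the group coefficient reflects only what severity does not explain.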

Sandeep123 · Nov 17, 2019 09:48 AM

Thank you. That is the most logical way of doing it.

I am not entirely sure that the result I am getting is correct, however.

On univariate comparison of the outcome variables (ICU length of stay, hospital length of stay, and ventilator days) between cases and controls, there is a very significant difference (P value < 0.0001). There are indeed differences in age group and severity of illness score.

The severity of illness scores are higher in the control group than in the cases (mean score of 2.5 in the controls compared to -4.5 in the cases), which would mean that the control group should have a longer length of stay. So when I adjust for the severity of illness score, the difference should become even more prominent.

I'm assuming that the regression model is not taking the negative sign into account. (Patients who are less sick have a negative score; the score becomes more positive as they become sicker.)

Mark_Bailey · Nov 17, 2019 03:59 PM

You are not making sense. "On univariate comparison of the outcome variables (ICU length of stay, hospital length of stay, and ventilator days) between cases and controls, there is a very significant difference (P value < 0.0001). There are indeed differences in age group and severity of illness score" is not a univariate comparison. Also, why would you expect there to be no differences?

Aside from the issue of the assumptions of the regression model, the differences (e.g., higher severity of illness score in control group) should be modeled without difficulty. In fact the larger the difference, the more significant the effect and the more important the parameter estimates.

I have no idea what you mean by "adjust."

I have no idea what you mean by "the regression model is not taking the negative sign into account."

Sandeep123 · Nov 17, 2019 2:18 PM

I am sorry, I am a clinician, not a statistician. Unfortunately, the place where I work does not have a statistician, so I have to try to figure it out myself.

I will try again.

I have retrospective data on cases and controls (required escalation of care or not). The outcome of these two groups, length of stay in the hospital, is significantly different when I use Fit Y by X with the non-parametric Wilcoxon test (cases have a longer length of stay).

The two groups are not similar, however: the controls have a much higher severity of illness than the cases.

Since patients whose severity of illness is higher are also expected to stay in the hospital longer, I would expect that when I include the severity of illness in the model, the difference in length of stay between the two groups would become more significant.

However, when I use Fit Model with ICU length of stay as Y and the severity of illness score and the categorical case/control variable as model effects, I get the following parameter estimates:

Term            Estimate    Prob>|t|
Intercept        3.99       <0.0001
PIM 3           -0.17        0.0039
Case/control    -0.22        0.514

I'm assuming from this that there is no significant difference in length of stay between cases and controls once severity of illness is included in the model.

Mark_Bailey · Nov 18, 2019 09:40 AM

I am inclined to agree with your conclusion but that is only if the analysis indicates that your data meet the model and regression assumptions. This seems to be an important study. Care should be taken in the design of the study and the analysis of the data that was collected in the study. This analysis is not a simple t-test, and a t-test is not that simple.

Is it possible to share more information about the study? For example, repeat the setup for Fit Model but be sure that Emphasis = Effect Leverage. Could you include a picture of the Actual by Predicted plot and the set of Leverage Plots? Also, the Residual by Predicted plot or the Standardized Residual plot? Those plots would go a long way toward assessing whether your analysis is OK, and then you can move on to conclusions with confidence.

Sandeep123 · Nov 20, 2019 10:33 PM

One of my colleagues suggested using the Generalized Linear Model personality with a Poisson distribution and log link to compare the patients' length of stay data. Length of stay is in days, so it is a sort of count data: the number of days.

This model results in a very significant difference, which makes me very suspicious about its validity. Pretty much every variable has a P value of less than 0.0001. When I used the default Standard Least Squares model, the significant difference between case and control very quickly went away as I added more variables to the model, but with the generalized linear model it remains very significant.

Mark_Bailey · Nov 21, 2019 09:58 AM

I agree that a Poisson log-linear regression model is reasonable for counts of days. A better model will show more significance. Can you share the regression results from JMP? Some of the information is the estimates. The rest is about the quality of the model. All of the information is helpful to assess the validity of your model.

Sandeep123 · Nov 21, 2019 10:55 AM

Thank you.

I sent the clip of the analysis to you.

Mark_Bailey · Nov 21, 2019 11:03 AM

First of all, let me explain that we prefer that all exchanges happen in the discussion area, not in private messages. Why? Because other members of the community who might have similar questions or problems are deprived of part or all of the solution. I understand that sometimes the nature of the problem or the data involves privacy issues and cannot be posted publicly.

Per your message, the regression results are encouraging! Is it possible to show me the residual plot? This plot would help answer the question about one or more influential observations that are skewing the estimates and the tests. The plot also helps assess goodness of fit.

The Deviance test is highly significant, too. This test is for lack of fit. Your model is biased and will not provide accurate predictions. It usually indicates that you are either missing important variables or, more likely, are missing terms to address non-linear effects. You might consider adding cross terms (e.g., sex*age) or powers (e.g., age*age).
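To show what adding such terms amounts to, here is a short Python sketch that builds a cross term and a squared term as extra columns next to the main effects. The patient rows, the column names, and the helper `expand_terms` are made up for illustration. In JMP you would add these terms in the Fit Model dialog rather than compute them by hand.

```python
# Made-up patient rows; column names are illustrative only.
patients = [
    {"age": 4.0, "sex": 0},
    {"age": 9.0, "sex": 1},
    {"age": 15.0, "sex": 1},
]

def expand_terms(row):
    """Return the main effects plus a cross term and a power term."""
    return {
        "age": row["age"],
        "sex": row["sex"],
        "sex*age": row["sex"] * row["age"],   # interaction: age effect may differ by sex
        "age*age": row["age"] ** 2,           # curvature: non-linear effect of age
    }

expanded = [expand_terms(p) for p in patients]
for row in expanded:
    print(row)
```

Each new column gets its own parameter estimate, so the model can bend (age*age) or let one factor's effect depend on another (sex*age).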
