## Is this a real effect? Matched pairs - Bivariate - Regression

Occasional Contributor

Joined:

Aug 16, 2017

Hello,

Please kindly help a newbie here: is this a real effect?

I am comparing a new method (Mean 10) against a gold standard (paCO2). I am also examining the effect of another variable to see whether it improves the model.

I started with a Bland-Altman analysis (Matched Pairs), and my method shows a good correlation of 0.69 (small CI). I also tested the independent variable against my gold standard, and even here I find an "effect".

So I ran a bivariate analysis for both Mean 10 (not shown) and SaO2. Mean 10 shows a good correlation with paCO2, although R square is only 0.48. My other variable (SaO2) also shows some effect in the bivariate analysis, with p < 0.001.

Now, analyzing both effects together (picture to the right), both seem to have a significant effect, and the model improved to an R square of 0.67.

But is this a real effect? Or am I doing my analysis wrong here?

Any comments and support much appreciated! Many thanks, Mark

2 ACCEPTED SOLUTIONS

Accepted Solutions

Staff

Joined:

Jun 23, 2011

Solution

I am not sure about the first comparison shown, paCO2 versus SaO2, but the plot reveals that the Matched Pairs analysis is not appropriate. Why? This analysis assumes that the only difference is in the average level. That is, there can be a fixed difference but not a proportional difference. The plot should show a horizontal band of data. Instead you can see that it is tilted down from the upper left corner to the lower right corner. The computations are correct but the results here are meaningless.

The second comparison is Mean 10 versus paCO2. This case is appropriate for Matched Pairs because the data points appear as a horizontal band. The confidence interval for the difference excludes zero, and the two-sided t-test is significant at alpha = 0.05. The point estimate indicates that Mean 10 is lower by 9 units.
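To make the "horizontal band" check concrete, here is a minimal Python sketch (not JMP output). It computes the mean difference and the slope of the pair difference regressed on the pair mean. The function name `bland_altman_summary` and all of the data are made up for illustration; the synthetic values are not the poster's measurements. A slope near zero corresponds to a horizontal band (fixed difference only), while a clearly non-zero slope corresponds to the tilted plot described above.

```python
import random
import statistics


def bland_altman_summary(new_method, reference):
    """Return the bias, 95% limits of agreement, and slope of difference vs. pair mean."""
    diffs = [a - b for a, b in zip(new_method, reference)]
    means = [(a + b) / 2 for a, b in zip(new_method, reference)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    center = statistics.fmean(means)
    # Simple least-squares slope of the differences on the pair means.
    sxx = sum((m - center) ** 2 for m in means)
    sxy = sum((m - center) * (d - bias) for m, d in zip(means, diffs))
    return {
        "bias": bias,
        "limits": (bias - 1.96 * sd, bias + 1.96 * sd),
        "slope": sxy / sxx,
    }


# Synthetic, hypothetical data: one method with a fixed offset, one with a
# proportional difference relative to the reference.
rng = random.Random(0)
reference = [rng.uniform(30, 80) for _ in range(200)]
fixed_offset = [r - 9 + rng.gauss(0, 3) for r in reference]
proportional = [0.6 * r + rng.gauss(0, 3) for r in reference]

fixed = bland_altman_summary(fixed_offset, reference)
tilted = bland_altman_summary(proportional, reference)
print("fixed offset: bias", round(fixed["bias"], 2), "slope", round(fixed["slope"], 3))
print("proportional: bias", round(tilted["bias"], 2), "slope", round(tilted["slope"], 3))
```

In the fixed-offset case the slope stays close to zero, so a single bias estimate summarizes the disagreement. In the proportional case the slope is clearly negative, so the average difference alone would be misleading.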

The third comparison changes the focus. Now it is paCO2 versus SaO2. I can only see the top of the Analysis of Variance report. It shows a high F ratio, > 500, which is probably significant at alpha = 0.05. I cannot see the Parameter Estimates report. These two variables appear to be linearly related, ignoring other variables.

The fourth comparison is a multiple regression of paCO2 versus Mean 10 and SaO2. This analysis also has a different focus. Are you comparing Mean 10 to paCO2 or not? The analysis shows that there is a strong relationship between paCO2 and both predictors. Mean 10 and SaO2 are not correlated enough to impact the significance, though. Mean 10 has a slope of 0.9 (assuming that all the error is in Mean 10). SaO2 has a slope of -0.3, indicating a negative bias on paCO2.
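As a rough illustration of what the multiple regression is doing, the following Python sketch fits paCO2 on one predictor and then on two, and compares R square. Everything here is an assumption for illustration: the helper names `solve_linear_system` and `fit_ols` are hypothetical, and the data are synthetic, generated with slopes of 0.9 and -0.3 only to echo the numbers in the report. It is not a reproduction of the JMP analysis.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, r_squared)."""
    rows = [[1.0] + list(values) for values in zip(*columns)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


# Synthetic, hypothetical data (not the poster's measurements).
rng = random.Random(1)
mean10 = [rng.uniform(30, 70) for _ in range(300)]
sao2 = [rng.uniform(80, 100) for _ in range(300)]
paco2 = [40 + 0.9 * m - 0.3 * s + rng.gauss(0, 3) for m, s in zip(mean10, sao2)]

beta_single, r2_single = fit_ols([mean10], paco2)
beta_both, r2_both = fit_ols([mean10, sao2], paco2)
print("Mean 10 only:   R^2 =", round(r2_single, 3))
print("Mean 10 + SaO2: R^2 =", round(r2_both, 3), "slopes =", [round(b, 2) for b in beta_both[1:]])
```

Note that R square can never decrease when a predictor is added to a least-squares model, so an improvement in R square alone does not establish that the second effect is real.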

Is this effect real? (I assume that you mean SaO2 bias when you ask about 'the effect.') I would first continue the regression analysis with diagnostics so that you can trust the results as far as they go. Have you examined the residuals at all? Have you examined the variance inflation factor for the parameter estimates? Finally, an empirical model would have to be empirically confirmed/verified/validated. Can you make samples with desired levels of Mean 10 and SaO2 to confirm the model prediction for paCO2?
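For the variance inflation factor question, a small Python sketch may help show what the number measures. With exactly two predictors, the VIF of each one equals 1 / (1 - r^2), where r is the correlation between the two predictors. The helper names `pearson_r` and `vif_two_predictors` and the data below are hypothetical; JMP reports VIF directly in the Parameter Estimates table, so this is only meant to illustrate the idea.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Synthetic, hypothetical data: one nearly independent pair, one collinear pair.
rng = random.Random(2)
x1 = [rng.uniform(30, 70) for _ in range(300)]
independent = [rng.uniform(80, 100) for _ in range(300)]
collinear = [v + rng.gauss(0, 3) for v in x1]

vif_independent = vif_two_predictors(x1, independent)
vif_collinear = vif_two_predictors(x1, collinear)
print("VIF, unrelated predictors:", round(vif_independent, 2))
print("VIF, collinear predictors:", round(vif_collinear, 2))
```

A VIF close to 1 means the predictors carry largely separate information, so their estimated slopes can be interpreted individually. A large VIF means the slopes are poorly determined, even if the overall fit looks good.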

Learn it once, use it forever!

Joined:

Jun 5, 2014

Solution

You ask, "...is the effect real?" My colleague Mark Bailey has offered wise advice. I'll just pile onto his second-to-last sentence about confirming/validating empirically, with a bit of a complementary perspective.

What does your knowledge of the process tell you with respect to the POSSIBLE existence of an effect? One of the points I always try to make to those who are new to the world of statistical inference is this: if your p-values, t-values, F-ratios, and any other inferential statistics tell you that you've now 'proven' an effect akin to 'water runs uphill', then something is amiss. (Note the quotes, because statistical inference really proves nothing, but that's another story for later.) Maybe there were lurking variables in the experiment. Maybe there was measurement system bias. Maybe there was (fill in the blank).

If your knowledge of physics, chemistry, biology, or socioeconomic behavior says the effect is unlikely or impossible...then in my book that trumps p-values and their brethren every day.

4 REPLIES

Occasional Contributor

Joined:

Aug 16, 2017

Peter,

Many thanks for the wise advice. It is a surprising physiological result for me as well, but in the end I had to trust my calculations. Many thanks, Marc

Occasional Contributor

Joined:

Aug 16, 2017

Mark,

Yes, I looked at the residuals and found no systematic pattern; they appear normally distributed (see figure 1).

I also looked specifically at our database to create some real time variables, and in fact the prediction of paCO2 fits better when the second variable is included. In a graphical analysis, the prediction formula for paCO2 based on Mean and SpO2 (I replaced SaO2 with SpO2) is perfect, and gives me a perfect correlation coefficient (figure 2).

Can I save the rows which lie outside the 90% density ellipse?

Thanks a lot again for all your help! Marc
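On the density ellipse question, the geometry can be sketched in Python under one stated assumption: that the ellipse is the contour of a fitted bivariate normal distribution, so a row is outside the 90% ellipse when its squared Mahalanobis distance exceeds the chi-square cutoff with 2 degrees of freedom, which is -2 ln(1 - 0.90). JMP's own computation may differ in detail, and this is not a JMP feature or script. The function name `rows_outside_ellipse` and the data are hypothetical.

```python
import math
import random
import statistics


def rows_outside_ellipse(x, y, coverage=0.90):
    """Return indices of rows outside the bivariate-normal ellipse (assumed form)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    n = len(x)
    rho = sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((n - 1) * sx * sy)
    # Chi-square quantile with 2 degrees of freedom has a closed form.
    cutoff = -2.0 * math.log(1.0 - coverage)
    outside = []
    for i, (a, b) in enumerate(zip(x, y)):
        zx = (a - mx) / sx
        zy = (b - my) / sy
        d2 = (zx * zx - 2 * rho * zx * zy + zy * zy) / (1 - rho * rho)
        if d2 > cutoff:
            outside.append(i)
    return outside


# Synthetic, hypothetical correlated data.
rng = random.Random(3)
xs = [rng.gauss(0, 1) for _ in range(2000)]
ys = [0.8 * v + rng.gauss(0, 0.6) for v in xs]

flagged = rows_outside_ellipse(xs, ys)
fraction_outside = len(flagged) / len(xs)
print("rows outside the 90% ellipse:", len(flagged), "of", len(xs))
```

The returned indices could then be used to select or export those rows for a closer look.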

[Figure 1] [Figure 2]