md2021
Level I

Prediction Profiler Confidence Intervals

Hello,

 

Does anyone know how confidence intervals are calculated within the prediction profiler? My results do not make sense. The range of values presented in the confidence interval is above the range of the input data values. See below:

 

Screenshot of data (values in the 9s):

md2021_1-1614119906475.png

Screenshot of prediction profiler with confidence interval, within a Fit Least Squares Model (values in the 10s): 

md2021_0-1614119866578.png

I do not understand the disconnect between the input data and the resulting confidence interval in the profiler. My file is attached if anyone is willing to take a look and find any mistakes. Thank you in advance! 

5 REPLIES
dale_lehman
Level VII

Re: Prediction Profiler Confidence Intervals

I'm not sure what you are looking at that you find inconsistent.  If I look at the model output and the confidence intervals for the regression coefficients, they seem to match the profiler confidence intervals pretty closely, so I don't see an inconsistency.  Perhaps you are looking at something else?  You mention the input data, but I'm not sure what kind of confidence interval you are expecting by looking at the input data directly - maybe my eyes just aren't good enough, but I don't think the input data would give you any guide about the confidence interval that would show up in the profiler (since the latter is a feature of the regression model).

Re: Prediction Profiler Confidence Intervals

Dale is correct. The confidence interval is determined from the regression model. The problem is your regression model. Notice that the prediction is also above the range of the input data. Why is this? I would suspect that you have a poor form of a regression model.

 

A few things to consider:

* With this many rows of data, is each row of data truly independent of the other rows?

* It looks like you have a physical lower bound for your response. A linear regression model cannot model that. Perhaps look at a more advanced type of model that can account for that physical boundary.

* You have some very unusual data points (really high response). Has that data been validated?
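
To illustrate the second point, here is a minimal Python sketch of why a straight-line fit cannot respect a physical lower bound. The data are made up, and the log transform is only one possible alternative, assumed here for illustration (it requires a strictly positive response); it is not a recommendation for this particular data set.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: a strictly positive response that decays toward zero.
x = [0, 1, 2, 3, 4, 5]
y = [8.0, 4.1, 2.0, 1.1, 0.5, 0.26]

# Straight line on the raw response: nothing stops it from crossing zero.
b0, b1 = fit_line(x, y)
raw_pred = b0 + b1 * 8

# Same fit on log(y), back-transformed with exp(): always positive.
c0, c1 = fit_line(x, [math.log(v) for v in y])
log_pred = math.exp(c0 + c1 * 8)

print("raw-scale prediction at x = 8:", raw_pred)
print("log-scale prediction at x = 8:", log_pred)
```

The raw-scale line predicts a value below the bound, while the back-transformed prediction cannot, because exp() is always positive.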

 

Dan Obermiller

Re: Prediction Profiler Confidence Intervals

The prediction confidence interval is the predicted mean response plus or minus a multiple of the standard error of the predicted response. The standard error is the square root of the variance of the predicted response, which is a function of the predictor levels and the covariance matrix for the parameter estimates.

 

Here is a simple example in which I regressed :weight versus :height from the Big Class sample data table. I set the :height = 59, the same as the first observation.

 

predict.PNG

 

Then I held the Alt key on Windows (Option key on Macintosh) and clicked the red triangle at the top of Fit Least Squares to get this dialog from which I selected Mean Confidence Interval Formula.

 

menu.PNG

 

You can examine the column formula to see the calculation.

 

formula.PNG

 

The linear predictor starts the formula to estimate the mean, then the rest is subtracted for the lower confidence bound. The 2.024 multiplier is the t quantile for 95% confidence and the error degrees of freedom.
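
For anyone who wants to check the calculation outside JMP, here is a minimal Python sketch of the same kind of interval for a regression with one predictor. It is not the formula JMP saves to the column. With a single predictor, the variance of the predicted mean reduces to the textbook expression RMSE^2 * (1/n + (x0 - xbar)^2 / Sxx), which is what the sketch uses. The data are made up (they are not the Big Class values) and the function names are hypothetical. The t multiplier is passed in rather than computed; the made-up data have 40 rows, so the error degrees of freedom are 38, which corresponds to the 2.024 multiplier.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))  # error df = n - 2
    return {"n": n, "x_bar": x_bar, "sxx": sxx,
            "slope": slope, "intercept": intercept, "rmse": rmse}

def mean_confidence_interval(fit, x0, t_mult):
    """Return (lower, prediction, upper) for the mean response at x0."""
    pred = fit["intercept"] + fit["slope"] * x0
    se_mean = fit["rmse"] * math.sqrt(
        1.0 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"])
    return pred - t_mult * se_mean, pred, pred + t_mult * se_mean

# Hypothetical data: 40 rows, so error df = 38.
heights = [51 + (i % 20) for i in range(40)]
weights = [-120 + 3.7 * h + ((7 * i) % 11 - 5) * 2.0
           for i, h in enumerate(heights)]

fit = fit_line(heights, weights)
lower, pred, upper = mean_confidence_interval(fit, x0=59, t_mult=2.024)
print("prediction:", pred, "95% mean CI:", (lower, upper))
```

Note that the interval is centered on the model prediction, not on the raw data, and it is narrowest when x0 equals the mean of the predictor.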

 

What quantity did you expect for the interval?

md2021
Level I

Re: Prediction Profiler Confidence Intervals

I was expecting something around [9.6, 9.7]. This is what another program (Simio) calculated, and I cross-checked it with my manual calculations in Excel using this formula:

md2021_0-1614179606141.png

md2021_1-1614179623909.png

where alpha = 0.05 and t = 1.96. I did not expect an exact match, but I did expect some overlap. 

dale_lehman
Level VII

Re: Prediction Profiler Confidence Intervals

I don't understand why you would expect something around 9.6 or 9.7.  The mean for your response variable is 4.2.  I think you are looking at the profiler the wrong way.  It is not a prediction for any particular observation - you can simulate the prediction (and its confidence interval) for any hypothetical set of factors you enter for the factors shown in the profiler.  So, the profiler is showing the interval for the predicted mean response at a single hypothetical setting of the factors.

 

You might just save the confidence intervals (either mean or individual) after running the model (found under Save Columns) and look at the confidence intervals for the entire data set predictions.  They look about right to me (aside from any improvements to the model that might be appropriate).
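
The difference between the mean and individual intervals can be sketched in a few lines of Python. This is a textbook one-predictor illustration, not JMP's saved column formula; the summary statistics below are made up and the function name is hypothetical.

```python
import math

def interval_half_widths(n, x_bar, sxx, rmse, x0, t_mult):
    """Half-widths of the mean and individual intervals at x0."""
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    # Mean interval: uncertainty in the fitted line only.
    mean_half = t_mult * rmse * math.sqrt(leverage)
    # Individual interval: adds the scatter of a single new observation.
    indiv_half = t_mult * rmse * math.sqrt(1.0 + leverage)
    return mean_half, indiv_half

# Hypothetical summary statistics from a fit with 40 rows (error df = 38).
mean_half, indiv_half = interval_half_widths(
    n=40, x_bar=62.5, sxx=700.0, rmse=15.0, x0=59.0, t_mult=2.024)
print("mean CI half-width:      ", mean_half)
print("individual half-width:   ", indiv_half)
```

The mean interval shrinks as the number of rows grows, while the individual interval can never be narrower than the t multiplier times the RMSE, so with many rows the two can differ a great deal.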