I have long struggled to find uncertainty measures for machine learning models that are comparable to the standard errors you routinely get from regression models. Only recently have I become aware of some of the capabilities of the profiler, in particular the bagged predictions. But I don't really understand how to interpret those bagged predictions, or whether I should. When I run a machine learning model (for example, a neural net) and save the bagged predictions from the profiler, I get a bagged mean, the standard error of the bagged mean, and the bagged standard deviation. Comparing these with a regression model (multiple regression, for example), I've observed the following relationships:
- The bagged mean from the NN is very similar to the prediction from the multiple regression model.
- Intervals for the mean prediction from the multiple regression model are much narrower than the individual prediction intervals, as expected (in the example I am looking at, the standard error for the mean prediction is about 1/10 the size of the standard error of the individual predictions).
- The standard error of the bagged mean from the NN is much smaller than the bagged standard deviation (about 1/10 the size in the example I am looking at).
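For reference, the regression side of this comparison can be sketched as follows. This is a minimal illustration using a single predictor and simulated data (the function name, the data, and the point `x0` are all hypothetical, not taken from my attached example). It uses the textbook relationship in which the variance of an individual prediction equals the variance of the mean prediction plus the residual variance.

```python
import math
import random


def simple_ols_prediction_se(x, y, x0):
    """Fit y = b0 + b1*x by least squares and return the prediction at x0
    with its standard errors for the mean and for an individual response."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance estimate with n - 2 degrees of freedom.
    resid_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    mse = resid_ss / (n - 2)

    # SE of the mean prediction: uncertainty in the fitted line only.
    se_mean = math.sqrt(mse * (1 / n + (x0 - x_bar) ** 2 / sxx))
    # SE of an individual prediction: adds the residual noise on top.
    se_indiv = math.sqrt(mse + se_mean ** 2)
    return intercept + slope * x0, se_mean, se_indiv


# Simulated data (hypothetical): 200 rows, true line 2 + 0.5*x, noise SD 1.
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(200)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1.0) for xi in x]

pred, se_mean, se_indiv = simple_ols_prediction_se(x, y, x0=5.0)
print(pred, se_mean, se_indiv)
```

In this sketch the gap between the two standard errors comes entirely from the residual variance term, so the ratio depends on the sample size and the noise level.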
These observations tempt me to think of the standard error of the bagged mean from the NN as analogous to the standard error of the mean predictions from the regression model. Similarly, the bagged standard deviation may be similar to the standard error of the individual predictions from the regression model.
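To check whether that analogy holds, here is a rough sketch of how I imagine the bagged quantities are produced. It rests on two assumptions that I have not confirmed: that each bagged prediction comes from refitting the model on a bootstrap resample of the training rows, and that the standard error of the bagged mean is the bagged standard deviation divided by the square root of the number of bootstrap samples `n_boot`. A straight-line fit stands in for the neural net to keep the example short, and all names and data are hypothetical.

```python
import math
import random
import statistics


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


def bagged_prediction(x, y, x0, n_boot=100, seed=0):
    """Refit on n_boot bootstrap resamples and summarize the predictions at x0."""
    rng = random.Random(seed)
    n = len(x)
    preds = []
    for _ in range(n_boot):
        # Resample rows with replacement (one bootstrap sample).
        idx = [rng.randrange(n) for _ in range(n)]
        b0, b1 = fit_line([x[i] for i in idx], [y[i] for i in idx])
        preds.append(b0 + b1 * x0)

    bagged_mean = statistics.fmean(preds)
    bagged_sd = statistics.stdev(preds)
    # ASSUMPTION: SE of the bagged mean = bagged SD / sqrt(number of resamples).
    se_bagged_mean = bagged_sd / math.sqrt(n_boot)
    return bagged_mean, bagged_sd, se_bagged_mean


# Simulated data (hypothetical): 200 rows, true line 2 + 0.5*x, noise SD 1.
data_rng = random.Random(1)
x = [data_rng.uniform(0, 10) for _ in range(200)]
y = [2.0 + 0.5 * xi + data_rng.gauss(0, 1.0) for xi in x]

bagged_mean, bagged_sd, se_bagged_mean = bagged_prediction(x, y, x0=5.0)
print(bagged_mean, bagged_sd, se_bagged_mean)
```

If these assumptions are right, the 1/10 ratio I see for the NN would follow directly from using 100 bootstrap samples. The bagged standard deviation would then describe how much the fitted model moves between resamples, and neither bagged quantity would include residual noise the way an individual prediction standard error does.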
However, the standard errors from the NN and the regression models do not resemble each other at all! So, my question is whether my interpretation makes any sense and, if not, exactly how the standard error of the bagged mean can be interpreted or used.
Thanks in advance for any insights. I have attached a concrete example in case it helps with my question (this is the validation data set from my modeling example, with the predictions from the multiple regression model and the NN included).