Hi, I am fitting a multiple linear regression to predict E. coli. My independent variables are turbidity and streamflow, and my dependent variable is E. coli. I am training the regression model on 188 observations of turbidity, streamflow, and E. coli. I will then use future turbidity and streamflow values to predict the expected E. coli value.
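In case it helps to see the setup concretely, here is a minimal sketch of the kind of model described above. The data are synthetic placeholders (not my real measurements), and the coefficients, value ranges, and the helper name `predict_ecoli` are all made up for illustration; only the sample size of 188 and the two predictors come from my actual problem.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 188  # number of training samples, as described above

# Synthetic placeholder data (assumed ranges and coefficients, NOT real measurements).
turbidity = rng.uniform(1.0, 50.0, size=n)
streamflow = rng.uniform(0.5, 20.0, size=n)
ecoli = 10.0 + 3.0 * turbidity + 5.0 * streamflow + rng.normal(0.0, 15.0, size=n)

# Design matrix: intercept column plus the two independent variables.
X = np.column_stack([np.ones(n), turbidity, streamflow])

# Ordinary least squares fit; coef = [intercept, b_turbidity, b_streamflow].
coef, *_ = np.linalg.lstsq(X, ecoli, rcond=None)

def predict_ecoli(turb, flow):
    """Point prediction of ecoli from new turbidity and streamflow values."""
    return coef[0] + coef[1] * turb + coef[2] * flow

# Predict for a hypothetical future observation.
new_prediction = predict_ecoli(25.0, 8.0)
print(coef, new_prediction)
```

The fit itself is straightforward; what I am unsure about is how the imprecision in the `ecoli` values used for training should carry through to `new_prediction`.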
My question is this: how do I account for the fact that each of the E. coli measurements in my training dataset (n = 188) can vary by 16% on average due to measurement variation that is unrelated to turbidity and streamflow? (That is, I collected E. coli duplicates for 29 of the 188 samples, and the average variability was +/- 16%.) So the E. coli values used to train the regression model are imprecise. How do I account for this when I use my regression results to predict a future value of E. coli?
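To make the duplicate variability concrete, here is a sketch of one way such a figure could be summarized. The duplicate values are synthetic, and the formula (each duplicate's percent deviation from its pair mean, averaged over the 29 pairs) is an illustrative assumption, not necessarily the exact calculation behind the 16% quoted above.

```python
import numpy as np

rng = np.random.default_rng(1)
n_dup = 29  # number of samples with duplicate ecoli measurements

# Synthetic placeholder duplicates (assumed values, NOT real measurements):
# two noisy readings of the same underlying sample.
true_value = rng.uniform(50.0, 500.0, size=n_dup)
first = true_value * (1.0 + rng.normal(0.0, 0.15, size=n_dup))
second = true_value * (1.0 + rng.normal(0.0, 0.15, size=n_dup))

# Percent deviation of each duplicate from its pair mean:
# |first - mean| / mean = |first - second| / (2 * mean).
pair_mean = (first + second) / 2.0
pct_dev = 100.0 * np.abs(first - second) / (2.0 * pair_mean)

# Average +/- percent variability across the duplicate pairs.
mean_pct_dev = pct_dev.mean()
print(f"average duplicate variability: +/- {mean_pct_dev:.1f}%")
```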
Thanks in advance!