Hi @ragnarl,
I can see how this appear contradictory! All mean squares involve dividing a sum of squared deviations by their degrees of freedom. In your formula working with the results of LinearRegression() you appear to be dividing by n, not df.
rmse_lin_reg=sqrt(sum((z-y)^2)/nrows(y));
The degrees of freedom for the mean squared error in a simple linear regression is n-2 (1 df lost to estimating the intercept, and 1 more is lost to estimating the slope of x). If you adjust your script as below to have rmse_lin_reg=sqrt(sum((z-y)^2)/(nrows(y)-2))
you will find the same value for MSE (and thus RMSE).
I hope this helps!
@julian
x = [1.309, 1.471, 1.49, 1.565, 1.611, 1.68];
y = [2.138, 3.421, 3.597, 4.34, 4.882, 5.66];
{Estimates, Std_Error, Diagnostics} = Linear Regression( y, x, <<printToLog );
z=Estimates[1]+Estimates[2]*x;
rmse_lin_reg=sqrt(sum((z-y)^2)/(nrows(y)-2));
as table(x,y,<<Column Names({"x","y"}));
biv=Bivariate(
Y( :y ),
X( :x ),
Fit Line( )
);
rmse_lin_fit=((biv<< report())["Summary of Fit"][Number Col Box(1)] <<get())[3];
show(rmse_lin_reg,rmse_lin_fit);
returns
rmse_lin_reg = 0.13656988109602;
rmse_lin_fit = 0.13656988109602;