<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Why is the RMSE returned by the Linear Fit deviating the sqrt(sum(residuals^2)/n)? in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Why-is-the-RMSE-returned-by-the-Linear-Fit-deviating-the-sqrt/m-p/332787#M58107</link>
    <description>&lt;P&gt;I am confused regarding the Root Mean Square Error reported in the "Summary of Fit" in the Linear Fit platform.&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;x = [1.309, 1.471, 1.49, 1.565, 1.611, 1.68];
y = [2.138, 3.421, 3.597, 4.34, 4.882, 5.66];
{Estimates, Std_Error, Diagnostics} = Linear Regression( y, x, &amp;lt;&amp;lt;printToLog );
z=Estimates[1]+Estimates[2]*x;

rmse_lin_reg=sqrt(sum((z-y)^2)/nrows(y));

as table(x,y,&amp;lt;&amp;lt;Column Names({"x","y"}));
biv=Bivariate(
	Y( :y ),
	X( :x ),
	Fit Line( )
);
rmse_lin_fit=((biv&amp;lt;&amp;lt; report())["Summary of Fit"][Number Col Box(1)] &amp;lt;&amp;lt;get())[3];
show(rmse_lin_reg,rmse_lin_fit);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The result of the last line:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;rmse_lin_reg = 0.111508840972607;
rmse_lin_fit = 0.136569881096024;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;What is wrong here?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
    <pubDate>Fri, 09 Jun 2023 23:43:03 GMT</pubDate>
    <dc:creator>ragnarl</dc:creator>
    <dc:date>2023-06-09T23:43:03Z</dc:date>
    <item>
      <title>Why is the RMSE returned by the Linear Fit deviating the sqrt(sum(residuals^2)/n)?</title>
      <link>https://community.jmp.com/t5/Discussions/Why-is-the-RMSE-returned-by-the-Linear-Fit-deviating-the-sqrt/m-p/332787#M58107</link>
      <description>&lt;P&gt;I am confused regarding the Root Mean Square Error reported in the "Summary of Fit" in the Linear Fit platform.&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;x = [1.309, 1.471, 1.49, 1.565, 1.611, 1.68];
y = [2.138, 3.421, 3.597, 4.34, 4.882, 5.66];
{Estimates, Std_Error, Diagnostics} = Linear Regression( y, x, &amp;lt;&amp;lt;printToLog );
z=Estimates[1]+Estimates[2]*x;

rmse_lin_reg=sqrt(sum((z-y)^2)/nrows(y));

as table(x,y,&amp;lt;&amp;lt;Column Names({"x","y"}));
biv=Bivariate(
	Y( :y ),
	X( :x ),
	Fit Line( )
);
rmse_lin_fit=((biv&amp;lt;&amp;lt; report())["Summary of Fit"][Number Col Box(1)] &amp;lt;&amp;lt;get())[3];
show(rmse_lin_reg,rmse_lin_fit);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The result of the last line:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;rmse_lin_reg = 0.111508840972607;
rmse_lin_fit = 0.136569881096024;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;What is wrong here?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jun 2023 23:43:03 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Why-is-the-RMSE-returned-by-the-Linear-Fit-deviating-the-sqrt/m-p/332787#M58107</guid>
      <dc:creator>ragnarl</dc:creator>
      <dc:date>2023-06-09T23:43:03Z</dc:date>
    </item>
    <item>
      <title>Re: Why is the RMSE returned by the Linear Fit deviating the sqrt(sum(residuals^2)/n)?</title>
      <link>https://community.jmp.com/t5/Discussions/Why-is-the-RMSE-returned-by-the-Linear-Fit-deviating-the-sqrt/m-p/332808#M58110</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/4675"&gt;@ragnarl&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I can see how this appears contradictory! Every mean square is a sum of squared deviations divided by its degrees of freedom. In your formula, which works from the results of LinearRegression(), you are dividing by n, not by the df.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;rmse_lin_reg=sqrt(sum((z-y)^2)/nrows(y));&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The degrees of freedom for the mean squared error in a simple linear regression are n-2 (1 df is lost to estimating the intercept, and 1 more to estimating the slope of x). If you adjust your script as below to use&amp;nbsp;&lt;CODE class=" language-jsl"&gt;rmse_lin_reg=sqrt(sum((z-y)^2)/(nrows(y)-2)) &lt;/CODE&gt;you will get the same value for MSE (and thus RMSE) as the Summary of Fit report.&lt;/P&gt;
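&lt;P&gt;As a rough sketch of why the two numbers differ: both share the same sum of squared residuals, so they should differ only by the factor sqrt(n/(n-p)), where p is the number of estimated parameters. The names n, p, sse, rmse_n and rmse_df are introduced here purely for illustration and are not part of the original script.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;// Illustrative sketch only: compares the divide-by-n and divide-by-df conventions.
x = [1.309, 1.471, 1.49, 1.565, 1.611, 1.68];
y = [2.138, 3.421, 3.597, 4.34, 4.882, 5.66];
{Estimates, Std_Error, Diagnostics} = Linear Regression( y, x, &amp;lt;&amp;lt;printToLog );
z = Estimates[1] + Estimates[2] * x; // fitted values

n = nrows( y );            // number of observations
p = 2;                     // assumed parameter count: intercept and slope
sse = sum( (z - y) ^ 2 );  // sum of squared residuals

rmse_n = sqrt( sse / n );         // divides by n
rmse_df = sqrt( sse / (n - p) );  // divides by the error degrees of freedom

// The last two values should agree with each other.
show( rmse_n, rmse_df, rmse_n * sqrt( n / (n - p) ) );&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With n = 6 and p = 2, the last two values should both land near the 0.13657 shown in Summary of Fit, while rmse_n stays at the smaller 0.11151.&lt;/P&gt;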
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I hope this helps!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/2026"&gt;@jules&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;x = [1.309, 1.471, 1.49, 1.565, 1.611, 1.68];
y = [2.138, 3.421, 3.597, 4.34, 4.882, 5.66];
{Estimates, Std_Error, Diagnostics} = Linear Regression( y, x, &amp;lt;&amp;lt;printToLog );
z=Estimates[1]+Estimates[2]*x;

// Divide by the error degrees of freedom (n - 2), not by n
rmse_lin_reg=sqrt(sum((z-y)^2)/(nrows(y)-2));

as table(x,y,&amp;lt;&amp;lt;Column Names({"x","y"}));
biv=Bivariate(
	Y( :y ),
	X( :x ),
	Fit Line( )
);
rmse_lin_fit=((biv&amp;lt;&amp;lt; report())["Summary of Fit"][Number Col Box(1)] &amp;lt;&amp;lt;get())[3];
show(rmse_lin_reg,rmse_lin_fit);&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;returns&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;rmse_lin_reg = 0.13656988109602;
rmse_lin_fit = 0.13656988109602;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 12 Nov 2020 12:31:39 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Why-is-the-RMSE-returned-by-the-Linear-Fit-deviating-the-sqrt/m-p/332808#M58110</guid>
      <dc:creator>jules</dc:creator>
      <dc:date>2020-11-12T12:31:39Z</dc:date>
    </item>
  </channel>
</rss>

