<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: When to remove outliers when reducing model? in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/When-to-remove-outliers-when-reducing-model/m-p/849302#M102542</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/49050"&gt;@MetaLizard62080&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you expect high assay variability, do you account for this noise source with blocking?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It's great if you can validate a model with domain expertise, as this should reduce the possibility of errors. Can you perhaps repeat the tests (or only the measurements) that seem strange? This could help you figure out whether it's a "systematic error" or a "random error" and inform your decision-making.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Instead of directly removing points that seem strange and are not described precisely by the model, I would keep them but lower their influence on the model: create a "weight" column with value 1 for "normal" points and a lower value for the "strange" points, and use this column as a Weight variable in the Fit Model dialog:&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/18.1/index.shtml#page/jmp/elements-in-the-fit-model-launch-window.shtml#ww213135" target="_blank" rel="noopener"&gt;Elements in the Fit Model Launch Window&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Removing points just so that the model converges faster will bias the model and probably produce falsely optimistic results.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope some of these points make sense to you,&lt;/P&gt;</description>
    <pubDate>Sun, 23 Mar 2025 16:25:21 GMT</pubDate>
    <dc:creator>Victor_G</dc:creator>
    <dc:date>2025-03-23T16:25:21Z</dc:date>
    <item>
      <title>When to remove outliers when reducing model?</title>
      <link>https://community.jmp.com/t5/Discussions/When-to-remove-outliers-when-reducing-model/m-p/849068#M102489</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When I am reducing a model, I watch the externally studentized residuals to know when outliers are appearing. When should I remove these outliers during the model reduction process?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If I see no outliers when the model contains all terms, but then I remove a term and an outlier appears, do you remove that outlier immediately and continue reducing, or continue reducing and then remove the outliers at the end?&lt;/P&gt;</description>
      <pubDate>Fri, 21 Mar 2025 15:13:37 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/When-to-remove-outliers-when-reducing-model/m-p/849068#M102489</guid>
      <dc:creator>MetaLizard62080</dc:creator>
      <dc:date>2025-03-21T15:13:37Z</dc:date>
    </item>
    <item>
      <title>Re: When to remove outliers when reducing model?</title>
      <link>https://community.jmp.com/t5/Discussions/When-to-remove-outliers-when-reducing-model/m-p/849297#M102539</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/49050"&gt;@MetaLizard62080&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Did you read the responses from&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/4358"&gt;@statman&lt;/a&gt;&amp;nbsp;and me on your previous post&amp;nbsp;&lt;LI-MESSAGE title="Choosing to exclude from 2 equal outliers in DoE" uid="848077" url="https://community.jmp.com/t5/Discussions/Choosing-to-exclude-from-2-equal-outliers-in-DoE/m-p/848077#U848077" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-forum-thread lia-fa-icon lia-fa-forum lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To be clear and to repeat it, &lt;SPAN&gt;Studentized residuals may be a good way to identify outliers&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;based on an assumed model&lt;/STRONG&gt;&lt;SPAN&gt;.&amp;nbsp;&lt;/SPAN&gt;See more information about how the studentized residuals are calculated here:&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/17.2/#page/jmp/row-diagnostics.shtml#ww1673660" target="_blank" rel="noopener noreferrer"&gt;Row Diagnostics (jmp.com)&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;"&lt;EM&gt;Points that fall outside the red limits should be treated as&amp;nbsp;&lt;STRONG&gt;probable&lt;/STRONG&gt;&amp;nbsp;outliers. Points that fall outside the green limits but within the red limits should be treated as&amp;nbsp;&lt;STRONG&gt;possible&lt;/STRONG&gt;&amp;nbsp;outliers, but with less certainty.&lt;/EM&gt;"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You can use them as a diagnostic of whether the model's complexity is adequate for the data. This notably illustrates how and why your studentized residual results are model-dependent: removing or adding a term in the model changes the diagnosis of which points may be model-based outliers. The behavior of these points is not described/predicted well by that particular model, but that does not make them outliers under every other modeling option. &lt;BR /&gt;You should &lt;STRONG&gt;NOT&lt;/STRONG&gt; discard/delete points based on a model-based outlier analysis alone; these tools are great for refining your model and adjusting its complexity, together with other statistical metrics and criteria (R²/adjusted R², RMSE, p-values, information criteria like AICc/BIC, ...).&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;See other related posts :&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Supress the effect of outliers when fitting the model and in predictions" uid="747067" url="https://community.jmp.com/t5/Discussions/Supress-the-effect-of-outliers-when-fitting-the-model-and-in/m-p/747067#U747067" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-forum-thread lia-fa-icon lia-fa-forum lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Outlier Analysis" uid="750807" url="https://community.jmp.com/t5/Discussions/Outlier-Analysis/m-p/750807#U750807" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-forum-thread lia-fa-icon lia-fa-forum lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Identifying and analyzing outliers should be done before modeling, with adequate tools.&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;If you want to investigate whether points in your dataset may be outliers, try multivariate methods based on distances such as&amp;nbsp;Mahalanobis, Jackknife, or T² distances:&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/18.1/index.shtml#page/jmp/outlier-analysis.shtml" target="_blank" rel="noopener"&gt;Outlier Analysis&lt;/A&gt;.&amp;nbsp;You also have a range of other analyses in the menu&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/18.1/index.shtml#page/jmp/explore-outliers.shtml" target="_blank" rel="noopener"&gt;Explore Outliers&lt;/A&gt;. &lt;BR /&gt;In any case, a statistical analysis alone is not sufficient to discard points that may be outliers; you have to investigate these strange points and understand how and why their measured values seem strange compared to the others: measurement error, experimental error, operator change/error, or perhaps something unexpected happening?&lt;/SPAN&gt;&lt;/P&gt;
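&lt;P&gt;As a sketch of such a model-free, distance-based check (in Python/numpy rather than JMP, with simulated data): a point can be unremarkable on each axis yet far from the multivariate correlation structure, which is exactly what Mahalanobis distance picks up:&lt;/P&gt;

```python
# Sketch only: multivariate outlier screening with Mahalanobis distance,
# analogous in spirit to what a multivariate platform reports; data simulated.
import numpy as np

rng = np.random.default_rng(7)
cov = [[1.0, 0.8], [0.8, 1.0]]  # two strongly correlated responses
data = rng.multivariate_normal([0.0, 0.0], cov, size=50)
data[0] = [4.0, -4.0]  # modest per-axis, but against the correlation

center = data.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
diff = data - center
# Quadratic form diff' * cov_inv * diff, row by row
d = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
print(d[0], np.median(d))
```

&lt;P&gt;The injected point stands out with by far the largest distance, even though each of its coordinates alone is not extreme.&lt;/P&gt;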
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Hope this answer may help you,&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 23 Mar 2025 14:30:17 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/When-to-remove-outliers-when-reducing-model/m-p/849297#M102539</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2025-03-23T14:30:17Z</dc:date>
    </item>
    <item>
      <title>Re: When to remove outliers when reducing model?</title>
      <link>https://community.jmp.com/t5/Discussions/When-to-remove-outliers-when-reducing-model/m-p/849299#M102541</link>
      <description>&lt;P&gt;Hi Victor,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I did read the responses. Often in my line of work, we have high assay variability, which can easily explain erratic results that could be deemed outliers.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I usually start my analysis with Jackknife Z using the multivariate platform to assess the general responses. This does not always reveal outliers to the model, though. For example, in my last DoE, Jackknife showed values &amp;lt; 2 for an outlier that was found in every case by externally studentized residuals. I was unable to find a reason why this point was an outlier, but without removing it, my model showed an adjusted R^2 of 0.61, whereas with the point removed, the adjusted R^2 increased to 0.99. Along with this, the model with the point removed also made "scientific sense," whereas the model with the point kept was generally chaotic.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I always like to compare the candidate models when I do remove outliers, to see if there is even a significant impact on the prediction. In this case, depending on when I removed the point (at the beginning, in the middle, or at the end of model reduction), I did get different models, but the practical predictive capability was roughly the same. In one case, for example, I had a very slight quadratic term, but it was not a dominating factor. While all three models were most likely similarly useful in this case, I would like to know the best practice for settling on the most likely model.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I understand there is more to removing points than just following the studentized residual process; however, if I know there is an outlier, or have a strong sense there could be, &lt;STRONG&gt;is it best to remove it before, in the middle of, or after the model reduction&lt;/STRONG&gt;, as that will influence the results the model converges to?&lt;/P&gt;
      <pubDate>Sun, 23 Mar 2025 14:40:49 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/When-to-remove-outliers-when-reducing-model/m-p/849299#M102541</guid>
      <dc:creator>MetaLizard62080</dc:creator>
      <dc:date>2025-03-23T14:40:49Z</dc:date>
    </item>
    <item>
      <title>Re: When to remove outliers when reducing model?</title>
      <link>https://community.jmp.com/t5/Discussions/When-to-remove-outliers-when-reducing-model/m-p/849302#M102542</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/49050"&gt;@MetaLizard62080&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you expect high assay variability, do you account for this noise source with blocking?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It's great if you can validate a model with domain expertise, as this should reduce the possibility of errors. Can you perhaps repeat the tests (or only the measurements) that seem strange? This could help you figure out whether it's a "systematic error" or a "random error" and inform your decision-making.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Instead of directly removing points that seem strange and are not described precisely by the model, I would keep them but lower their influence on the model: create a "weight" column with value 1 for "normal" points and a lower value for the "strange" points, and use this column as a Weight variable in the Fit Model dialog:&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/18.1/index.shtml#page/jmp/elements-in-the-fit-model-launch-window.shtml#ww213135" target="_blank" rel="noopener"&gt;Elements in the Fit Model Launch Window&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Removing points just so that the model converges faster will bias the model and probably produce falsely optimistic results.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope some of these points make sense to you,&lt;/P&gt;</description>
      <pubDate>Sun, 23 Mar 2025 16:25:21 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/When-to-remove-outliers-when-reducing-model/m-p/849302#M102542</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2025-03-23T16:25:21Z</dc:date>
    </item>
  </channel>
</rss>

