<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PLS validation &amp; variable loadings in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/PLS-validation-amp-variable-loadings/m-p/484350#M72865</link>
    <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/24144"&gt;@Moukanni&lt;/a&gt;,&lt;BR /&gt;&lt;BR /&gt;- I am not sure what your objective is behind masking your data manually and/or choosing cross-validation.&lt;BR /&gt;If you want a test set (a set not used to train the model and not seen during validation) to assess how the PLS model performs on "new"/unseen data (and provided you have a large dataset), then yes, you can manually hide a portion of your dataset: hide &amp;amp; exclude the rows, run the model, save the prediction formula, and compare predicted vs. actual responses on those hidden rows. Or, if you have JMP Pro, create a validation column (in "Analyze", "Predictive Modeling", "Make Validation Column"), where you specify the proportion of rows assigned to your training, validation and/or test set.&lt;BR /&gt;If you validate your model through K-fold cross-validation, JMP will automatically split your dataset into K parts (folds), train the PLS model on K-1 folds, validate it on the remaining fold, and repeat the operation so that each fold serves once as the validation set and K-1 times as part of the training set. This is a good validation technique for assessing the robustness of your model (different training and validation sets are compared) on a small dataset.&lt;BR /&gt;&lt;BR /&gt;- I am also not sure about the second question, but if you want to know which variables are the most important in the PLS model, have a look at the variable importance plot and the computed VIP scores. 
See:&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/16.2/#page/jmp/variable-importance-plot.shtml" target="_blank" rel="noopener"&gt;Variable Importance Plot (jmp.com)&lt;/A&gt;&amp;nbsp;and&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/16.2/#page/jmp/vip-vs-coefficients-plots.shtml#" target="_blank" rel="noopener"&gt;VIP vs Coefficients Plots (jmp.com)&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;I hope this helps!&lt;/P&gt;</description>
    <pubDate>Thu, 05 May 2022 07:50:43 GMT</pubDate>
    <dc:creator>Victor_G</dc:creator>
    <dc:date>2022-05-05T07:50:43Z</dc:date>
    <item>
      <title>PLS validation &amp; variable loadings</title>
      <link>https://community.jmp.com/t5/Discussions/PLS-validation-amp-variable-loadings/m-p/484252#M72860</link>
      <description>&lt;P&gt;Hello JMP community,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a couple of questions about PLS:&lt;/P&gt;&lt;P&gt;- Do I have to hide a portion of my data manually, or is it done automatically when I choose cross-validation?&lt;/P&gt;&lt;P&gt;- Is there a threshold of variable loadings on factors that distinguishes the most important variables captured by each factor (latent variable)?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for your assistance!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jun 2023 00:48:57 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PLS-validation-amp-variable-loadings/m-p/484252#M72860</guid>
      <dc:creator>Moukanni</dc:creator>
      <dc:date>2023-06-09T00:48:57Z</dc:date>
    </item>
    <item>
      <title>Re: PLS validation &amp; variable loadings</title>
      <link>https://community.jmp.com/t5/Discussions/PLS-validation-amp-variable-loadings/m-p/484350#M72865</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/24144"&gt;@Moukanni&lt;/a&gt;,&lt;BR /&gt;&lt;BR /&gt;- I am not sure what your objective is behind masking your data manually and/or choosing cross-validation.&lt;BR /&gt;If you want a test set (a set not used to train the model and not seen during validation) to assess how the PLS model performs on "new"/unseen data (and provided you have a large dataset), then yes, you can manually hide a portion of your dataset: hide &amp;amp; exclude the rows, run the model, save the prediction formula, and compare predicted vs. actual responses on those hidden rows. Or, if you have JMP Pro, create a validation column (in "Analyze", "Predictive Modeling", "Make Validation Column"), where you specify the proportion of rows assigned to your training, validation and/or test set.&lt;BR /&gt;If you validate your model through K-fold cross-validation, JMP will automatically split your dataset into K parts (folds), train the PLS model on K-1 folds, validate it on the remaining fold, and repeat the operation so that each fold serves once as the validation set and K-1 times as part of the training set. This is a good validation technique for assessing the robustness of your model (different training and validation sets are compared) on a small dataset.&lt;BR /&gt;&lt;BR /&gt;- I am also not sure about the second question, but if you want to know which variables are the most important in the PLS model, have a look at the variable importance plot and the computed VIP scores. 
See:&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/16.2/#page/jmp/variable-importance-plot.shtml" target="_blank" rel="noopener"&gt;Variable Importance Plot (jmp.com)&lt;/A&gt;&amp;nbsp;and&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/16.2/#page/jmp/vip-vs-coefficients-plots.shtml#" target="_blank" rel="noopener"&gt;VIP vs Coefficients Plots (jmp.com)&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;I hope this helps!&lt;/P&gt;</description>
      <pubDate>Thu, 05 May 2022 07:50:43 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PLS-validation-amp-variable-loadings/m-p/484350#M72865</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2022-05-05T07:50:43Z</dc:date>
    </item>
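    <!-- The K-fold scheme described in the reply above can be sketched in plain Python. This is an illustrative editorial sketch only (JMP performs this split internally; the helper name is hypothetical): rows are dealt into K folds, and each fold serves once as the validation set while the remaining K-1 folds form the training set.

```python
def kfold_indices(n_rows, k):
    """Yield (train, valid) index pairs for K-fold cross-validation.
    Rows are dealt round-robin into k folds; each fold is the
    validation set exactly once, and the other k-1 folds together
    form the training set for that split."""
    folds = [list(range(start, n_rows, k)) for start in range(k)]
    for a in range(k):
        valid = folds[a]
        train = [i for b in range(k) if b != a for i in folds[b]]
        yield train, valid

# Example: 10 rows, 5 folds yields 5 splits of 8 training + 2 validation rows,
# and every row appears in a validation set exactly once.
splits = list(kfold_indices(10, 5))
```
    -->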
    <item>
      <title>Re: PLS validation &amp; variable loadings</title>
      <link>https://community.jmp.com/t5/Discussions/PLS-validation-amp-variable-loadings/m-p/484427#M72871</link>
      <description>&lt;P&gt;Thank you, Victor! this helps a lot!&amp;nbsp;&lt;/P&gt;&lt;P&gt;My objective is to validate the model through K-fold cross-validation.&amp;nbsp;&lt;/P&gt;&lt;P&gt;For the second question, I'm referring to the PLS X loadings on each factor; is there a commonly used threshold that highlights the variables belonging to the same system. For example in exploratory factor analysis, variable loadings (&amp;gt; 0.4) on a given factor suggest that these variables highly likely come from the same system.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you so much!&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 05 May 2022 14:12:41 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PLS-validation-amp-variable-loadings/m-p/484427#M72871</guid>
      <dc:creator>Moukanni</dc:creator>
      <dc:date>2022-05-05T14:12:41Z</dc:date>
    </item>
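    <!-- Rather than a fixed loadings cutoff like the 0.4 used in exploratory factor analysis, the reply points to VIP scores. A minimal editorial sketch in plain Python (hypothetical helper, standard VIP formula, not JMP's implementation): because the squared VIP scores always average to 1, variables with VIP greater than 1 are commonly treated as above-average in importance.

```python
import math

def vip_scores(W, ssy):
    """Variable Importance in Projection (VIP) scores.
    W[a][j] is the weight of X variable j on latent component a;
    ssy[a] is the Y-variance explained by component a.
    VIP_j = sqrt(p * sum_a ssy[a] * (w_aj / norm(w_a))**2 / sum_a ssy[a])."""
    p = len(W[0])          # number of X variables
    total = sum(ssy)
    vips = []
    for j in range(p):
        s = 0.0
        for a, w in enumerate(W):
            norm_sq = sum(x * x for x in w)
            s += ssy[a] * (w[j] ** 2) / norm_sq
        vips.append(math.sqrt(p * s / total))
    return vips

# Two normalized weight vectors for two components; the first component
# explains more Y-variance, so variable 0 (which loads heavily on it)
# gets a VIP above 1 and variable 1 falls below 1.
vips = vip_scores([[0.8, 0.6], [0.6, -0.8]], [3.0, 1.0])
```
    -->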
  </channel>
</rss>

