<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic cross validation using k-fold fit quality in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/855682#M102670</link>
    <description>&lt;P&gt;I am using a Lasso fit with leave-one-out or K-fold cross-validation. Please advise the best way to view R square and other fit quality metrics (e.g., AIC) in the output. It would be helpful to have this for both the training set and the validation set (averaged over all holdouts).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 26 Mar 2025 16:53:38 GMT</pubDate>
    <dc:creator>daniel_s</dc:creator>
    <dc:date>2025-03-26T16:53:38Z</dc:date>
    <item>
      <title>cross validation using k-fold fit quality</title>
      <link>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/855682#M102670</link>
      <description>&lt;P&gt;I am using a Lasso fit with leave-one-out or K-fold cross-validation. Please advise the best way to view R square and other fit quality metrics (e.g., AIC) in the output. It would be helpful to have this for both the training set and the validation set (averaged over all holdouts).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Mar 2025 16:53:38 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/855682#M102670</guid>
      <dc:creator>daniel_s</dc:creator>
      <dc:date>2025-03-26T16:53:38Z</dc:date>
    </item>
    <item>
      <title>Re: cross validation using k-fold fit quality</title>
      <link>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/855694#M102671</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/66673"&gt;@daniel_s&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Welcome to the Community!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To get performance metrics for your LASSO regression on the individual folds and on average, I think it's easiest to launch Generalized Regression from the&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/18.1/#page/jmp/model-screening.shtml#" target="_blank"&gt;Model Screening&lt;/A&gt;&amp;nbsp;platform: check only the options "Generalized Regression" and "Additional Methods", specify the type of terms that can enter the model (for example, interactions and quadratic effects), the number of folds, and a seed for reproducibility (if needed):&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Victor_G_0-1743008501964.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/74275i937736926B92062B/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Victor_G_0-1743008501964.png" alt="Victor_G_0-1743008501964.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;Once the platform is launched, a new window will open with all the information about the individual folds and a summary:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Victor_G_1-1743008541354.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/74276iD0951F60F3B91CD9/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Victor_G_1-1743008541354.png" alt="Victor_G_1-1743008541354.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this answer helps,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Mar 2025 17:02:46 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/855694#M102671</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2025-03-26T17:02:46Z</dc:date>
    </item>
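For readers who want the same per-fold and averaged fit statistics outside of JMP, the idea in the reply above can be sketched in Python with scikit-learn. This is an illustrative analogue of the Model Screening k-fold summary, not JMP's actual output; the synthetic dataset and the `alpha` penalty value are assumptions for the sake of the example:

```python
# Sketch (not JMP): per-fold and average R^2 for a Lasso fit,
# analogous to the k-fold summary produced by Model Screening.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, cross_validate

# Synthetic data standing in for the real table (assumption).
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=1)

# Fixed seed on the fold assignment for reproducibility, as in the reply.
cv = KFold(n_splits=5, shuffle=True, random_state=1)
res = cross_validate(Lasso(alpha=1.0), X, y, cv=cv,
                     scoring="r2", return_train_score=True)

print("train R^2 per fold:", np.round(res["train_score"], 3))
print("valid R^2 per fold:", np.round(res["test_score"], 3))
print("mean valid R^2:", round(res["test_score"].mean(), 3))
```

The training scores answer the "training set" half of the question, and the mean of the validation scores is the "average of all holdouts".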
    <item>
      <title>Re: cross validation using k-fold fit quality</title>
      <link>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/855878#M102686</link>
      <description>&lt;P&gt;Thank you. That is helpful. I gather that I would then select the best fold and "run selected", which would give the Lasso fit results with the best of the K folds used as validation.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Mar 2025 00:30:41 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/855878#M102686</guid>
      <dc:creator>daniel_s</dc:creator>
      <dc:date>2025-03-27T00:30:41Z</dc:date>
    </item>
    <item>
      <title>Re: cross validation using k-fold fit quality</title>
      <link>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/855990#M102689</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/66673"&gt;@daniel_s&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;After fitting your LASSO model with K-fold cross-validation, there are indeed several ways to proceed with the results:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;EM&gt;Choose the best performing LASSO model on the "best" validation fold&lt;/EM&gt;&lt;/STRONG&gt;: &lt;FONT color="#FF0000"&gt;&lt;STRONG&gt;Not recommended&lt;/STRONG&gt;&lt;/FONT&gt;, as this would be "cherry picking" rather than an honest assessment and selection procedure: it amounts to selecting the right data for the model instead of fitting the right model to your data, so you might end up overfitting your validation data.&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;EM&gt;After assessing results consistency and robustness, retrain the model on all data&lt;/EM&gt;&lt;/STRONG&gt;: This approach may seem logical; once you have established that your model is robust and performs similarly across all folds, you could be tempted to use all the data to further improve it. It can be a viable option if you're sure the model's parameters (for example, the terms included and the penalty value) can be kept the same between the cross-validated fit and the fit on all data, so that the model won't overfit the whole dataset. The drawback is that you lose sight of model validation: if anything goes wrong on the test data, it's hard to debug the model without validation data.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;EM&gt;Create a model average of your K models&lt;/EM&gt;&lt;/STRONG&gt;: This is my preferred approach (when possible). Once you have your K models, you can run each one and save its prediction formula using "Publish Prediction Formula" to store the models in the&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/18.1/#page/jmp/formula-depot.shtml#" target="_blank" rel="noopener"&gt;Formula Depot&lt;/A&gt;:&amp;nbsp;&lt;BR /&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Victor_G_0-1743062832794.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/74296iA82E0FC0EEE8AB24/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Victor_G_0-1743062832794.png" alt="Victor_G_0-1743062832794.png" /&gt;&lt;/span&gt;
&lt;P&gt;Once the model formulas are in the Formula Depot, you can click the red triangle next to "Formula Depot" and select "&lt;A href="https://www.jmp.com/support/help/en/18.1/index.shtml#page/jmp/model-comparison.shtml#ww74115" target="_blank" rel="noopener"&gt;Model Comparison&lt;/A&gt;". This creates a short summary of your models' performance, and if you click the red triangle next to "Model Comparison", you can create a &lt;A href="https://www.jmp.com/support/help/en/18.1/index.shtml#page/jmp/model-comparison-platform-options.shtml#" target="_self"&gt;Model Averaging&lt;/A&gt;:&lt;/P&gt;
&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Victor_G_1-1743063059364.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/74297i8C40512B5F2468D1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Victor_G_1-1743063059364.png" alt="Victor_G_1-1743063059364.png" /&gt;&lt;/span&gt;
&lt;P&gt;This option creates a new formula in your data table that corresponds to the average equation of your K models (in my case, the average of my 5 individual cross-validated models):&lt;/P&gt;
&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Victor_G_2-1743063269688.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/74298iC07BD7E163878329/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Victor_G_2-1743063269688.png" alt="Victor_G_2-1743063269688.png" /&gt;&lt;/span&gt;
&lt;P&gt;You can then compare once again the performances of your K individual models and your average model using the same Model Comparison platform.&amp;nbsp;&lt;BR /&gt;Note that this approach may not be easy or feasible if you have a large number of folds and/or complex models (like neural networks). For simple models such as regressions and machine-learning "base models" (decision trees, SVM, kNN, ...), this approach helps avoid overfitting and ensures robustness and generalization, without "losing" any data to validation.&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can read more about cross-validation in the following posts:&amp;nbsp;&lt;BR /&gt;&lt;LI-MESSAGE title="CROSS VALIDATION - VALIDATION COLUMN METHOD" uid="588298" url="https://community.jmp.com/t5/Discussions/CROSS-VALIDATION-VALIDATION-COLUMN-METHOD/m-p/588298#U588298" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-forum-thread lia-fa-icon lia-fa-forum lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;LI-MESSAGE title="k-fold r2" uid="484230" url="https://community.jmp.com/t5/Discussions/k-fold-r2/m-p/484230#U484230" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-forum-thread lia-fa-icon lia-fa-forum lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I also highly recommend the playlist "Making Friends with Machine Learning" from Cassie Kozyrkov to learn more about model training, validation, and testing:&amp;nbsp;&lt;A href="https://www.youtube.com/playlist?list=PLRKtJ4IpxJpDxl0NTvNYQWKCYzHNuy2xG" target="_blank" rel="noopener nofollow noreferrer"&gt;Making Friends with Machine Learning - YouTube&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this response helps and answers your questions,&lt;/P&gt;</description>
      <pubDate>Thu, 27 Mar 2025 08:39:40 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/855990#M102689</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2025-03-27T08:39:40Z</dc:date>
    </item>
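The "model averaging" option described in the reply above (option 3) can also be sketched outside of JMP: fit one model per training fold, then let the averaged "formula" be the mean of the K models' predictions. This is a hedged Python/scikit-learn illustration of the same idea, not JMP's Formula Depot mechanism; the data and `alpha` value are made up:

```python
# Sketch (not JMP): average the predictions of K cross-validated Lasso
# models, analogous to Formula Depot -> Model Comparison -> Model Averaging.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

# Synthetic data standing in for the real table (assumption).
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=1)

# One Lasso per training fold, mirroring the K cross-validated models.
models = []
for train_idx, _ in KFold(n_splits=5, shuffle=True, random_state=1).split(X):
    models.append(Lasso(alpha=1.0).fit(X[train_idx], y[train_idx]))

def predict_avg(X_new):
    # The averaged "formula": the mean of the K models' predictions.
    return np.mean([m.predict(X_new) for m in models], axis=0)

print(predict_avg(X[:3]))
```

Because Lasso predictions are linear in the inputs, averaging the predictions here is equivalent to averaging the K coefficient vectors, which matches the "average equation" wording in the reply.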
    <item>
      <title>Re: cross validation using k-fold fit quality</title>
      <link>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/856180#M102700</link>
      <description>&lt;P&gt;Thank you! Your insights were just right for the Machine learning trajectory I am on.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Mar 2025 14:49:38 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/856180#M102700</guid>
      <dc:creator>daniel_s</dc:creator>
      <dc:date>2025-03-27T14:49:38Z</dc:date>
    </item>
    <item>
      <title>Re: cross validation using k-fold fit quality</title>
      <link>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/866660#M102932</link>
      <description>&lt;P&gt;Just to complete my analysis: if I use the average model from the K-fold models, as described above, I would like to review the importance of each independent variable in the model. I would typically use a VIF. Please advise: how would I do that on the average formula?&lt;/P&gt;
&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Mon, 07 Apr 2025 20:18:07 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/866660#M102932</guid>
      <dc:creator>daniel_s</dc:creator>
      <dc:date>2025-04-07T20:18:07Z</dc:date>
    </item>
    <item>
      <title>Re: cross validation using k-fold fit quality</title>
      <link>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/866681#M102934</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/66673"&gt;@daniel_s&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm not sure I follow you: &lt;A href="https://www.jmp.com/support/help/en/18.1/#page/jmp/parameter-estimates-for-original-predictors.shtml" target="_self"&gt;VIF&lt;/A&gt; values are helpful for assessing the degree of collinearity among your variables, but they do not reflect the importance of these variables on the response.&lt;/P&gt;
&lt;P&gt;You can compare the parameter estimates from the average model formula to gauge the importance of the factors.&lt;/P&gt;
&lt;P&gt;You can also load the average formula into the &lt;A href="https://www.jmp.com/support/help/en/18.1/#page/jmp/profiler-platform-options.shtml" target="_self"&gt;Profiler&lt;/A&gt; (available in the Graph menu; check "Expand intermediate formula" to make sure your response is linked to your original predictors), and use the "&lt;A href="https://www.jmp.com/support/help/en/18.1/#page/jmp/assess-variable-importance.shtml" target="_self"&gt;Assess Variable Importance&lt;/A&gt;" option to calculate variable importance (in terms of main effects and total effects) based on &lt;A href="https://www.jmp.com/support/help/en/18.1/#page/jmp/statistical-details-for-assess-variable-importance.shtml#ww418531" target="_self"&gt;Sobol indices&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this answer helps,&lt;/P&gt;</description>
      <pubDate>Mon, 07 Apr 2025 22:14:03 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/866681#M102934</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2025-04-07T22:14:03Z</dc:date>
    </item>
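The distinction drawn in the reply above can be made concrete with a small Python sketch: VIF is a collinearity diagnostic on the predictors alone, while variable importance measures effect on the response. As a stand-in for JMP's Sobol-based "Assess Variable Importance", the sketch uses scikit-learn's permutation importance, which is a related but different model-agnostic importance measure (an assumption for illustration, not what JMP computes); the data and `alpha` value are also made up:

```python
# Sketch (not JMP's Profiler): VIF vs. variable importance.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Lasso

# Synthetic data standing in for the real table (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=1)
model = Lasso(alpha=1.0).fit(X, y)

def vif(X, j):
    # VIF for predictor j: regress X_j on the other predictors (with an
    # intercept); VIF = 1 / (1 - R^2). Uses X only, never the response y.
    others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
    beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    resid = X[:, j] - others @ beta
    r2 = 1.0 - resid.var() / X[:, j].var()
    return 1.0 / (1.0 - r2)

print("VIFs:", [round(vif(X, j), 2) for j in range(X.shape[1])])

# Importance on the response: how much does shuffling each column hurt R^2?
imp = permutation_importance(model, X, y, n_repeats=10, random_state=1)
print("importances:", np.round(imp.importances_mean, 2))
```

On independent predictors the VIFs all sit near 1 even though the importances differ widely, which is exactly the point of the reply: collinearity and importance are separate questions.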
    <item>
      <title>Re: cross validation using k-fold fit quality</title>
      <link>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/866684#M102937</link>
      <description>&lt;P&gt;Thanks! I still have more to learn, and your response gives me a platform to continue.&lt;/P&gt;</description>
      <pubDate>Mon, 07 Apr 2025 22:38:59 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/cross-validation-using-k-fold-fit-quality/m-p/866684#M102937</guid>
      <dc:creator>daniel_s</dc:creator>
      <dc:date>2025-04-07T22:38:59Z</dc:date>
    </item>
  </channel>
</rss>

