<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: best model in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772673#M95321</link>
    <description>&lt;P&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/48779"&gt;@VictorG&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I guess I'm going to learn some new JMP features.&amp;nbsp; Please tell me where you got the Simulation window with the Columns to switch out that you are showing.&amp;nbsp; I can't find it anywhere.&amp;nbsp; I do see where to run simulation experiments, but I can't find anywhere that prompts saving a simulation formula.&amp;nbsp; Thanks.&lt;/P&gt;</description>
    <pubDate>Fri, 12 Jul 2024 10:43:18 GMT</pubDate>
    <dc:creator>dlehman1</dc:creator>
    <dc:date>2024-07-12T10:43:18Z</dc:date>
    <item>
      <title>best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/771899#M95251</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;My question is: how can I find the best parametric model that fits my dataset well?&lt;BR /&gt;I want to find a parametric model that predicts my response well.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jul 2024 07:51:53 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/771899#M95251</guid>
      <dc:creator>maryam_nourmand</dc:creator>
      <dc:date>2024-07-10T07:51:53Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/771951#M95255</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/56938"&gt;@maryam_nourmand&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your question covers a broad topic, and there may be (a lot of) questions to address before it can be answered:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;EM&gt;&lt;STRONG&gt;How was the data collected?&lt;/STRONG&gt;&lt;/EM&gt; Observational study, experimental design, ...? How representative and complete is the dataset? An Exploratory Data Analysis may help detect patterns and possible pitfalls regarding the &lt;A href="https://www.jmp.com/en_us/statistics-knowledge-portal/what-is-regression/simple-linear-regression-assumptions.html" target="_self"&gt;assumptions in regression models&lt;/A&gt;, such as multicollinearity, which may require an adapted model like PLS or pre-processing steps like PCA.&lt;/LI&gt;
&lt;LI&gt;&lt;EM&gt;&lt;STRONG&gt;Objective of the model(s)?&lt;/STRONG&gt;&lt;/EM&gt; &lt;SPAN&gt;Causal explanations, prediction &amp;amp; optimization, or both (also linked to the available dataset and collection method)?&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;EM&gt;&lt;STRONG&gt;Validation strategy/feature selection?&lt;/STRONG&gt;&lt;/EM&gt;&amp;nbsp;How do you ensure, for example, that the model(s) created have the right level of complexity and still predict well? Do you assess model performance and robustness through a "standard" Machine Learning validation strategy (with cross-validation or train/validation/test splits), or through a "statistically-oriented" approach based on likelihood, information criteria (AICc, BIC), p-values, ...? Note that model complexity is also directly limited by the data collection: if you have factors with 3 different levels, for example, you won't be able to fit terms higher than 2nd order.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;EM&gt;&lt;STRONG&gt;Evaluation/selection metrics and threshold?&amp;nbsp;&lt;/STRONG&gt;&lt;/EM&gt;How do you evaluate the models? What is the selection process/criterion: do you select the ones with the best predictive results on the chosen metric, or all models that perform better than a benchmark or naive model, ...? How do you finally test the model?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Some of these questions and answers were already discussed in previous posts:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://community.jmp.com/t5/Discussions/Statistical-Significance/m-p/765928/highlight/true#M94573" target="_blank"&gt;https://community.jmp.com/t5/Discussions/Statistical-Significance/m-p/765928/highlight/true#M94573&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://community.jmp.com/t5/Discussions/Analysis-of-split-plot-design-with-full-factorial-vs-RSM/m-p/770579/highlight/true#M95183" target="_blank"&gt;https://community.jmp.com/t5/Discussions/Analysis-of-split-plot-design-with-full-factorial-vs-RSM/m-p/770579/highlight/true#M95183&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Creating, comparing, and selecting model(s) requires evaluation metrics linked to your objective and thresholds/criteria to select one or several models. If you simply want the best predictive model, you could:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Create a model with a standard ML validation strategy (cross-validation for example) or a strategy able to control the model's complexity,&lt;/LI&gt;
&lt;LI&gt;Use one or several metrics linked to predictive accuracy, like RMSE, MAPE, ...&lt;/LI&gt;
&lt;LI&gt;Compare models based on the metric(s) and domain expertise :&amp;nbsp;&lt;SPAN&gt;which one(s) is/are the most appropriate/relevant for your topic and which ones have the best performances,&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Choose to estimate individual predictions with the selected model(s) to see how/where they differ, and/or to use a combined model to average out the prediction errors.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Test the model in a "real" situation/production environment.&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/OL&gt;
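&lt;P&gt;As a rough illustration of steps 1 to 3 above, here is a minimal Python sketch. It assumes synthetic data and two toy candidates (a naive mean benchmark and a one-feature least-squares line); every function and variable name is hypothetical and nothing here is a JMP API.&lt;/P&gt;

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error (assumes no true value is zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_mean(xs, ys):
    """Naive benchmark: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Ordinary least squares with a single feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def cross_validate(fit, xs, ys, k=5, seed=0):
    """Return the average RMSE and MAPE over k held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    rmses, mapes = [], []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        truth = [ys[i] for i in held_out]
        preds = [model(xs[i]) for i in held_out]
        rmses.append(rmse(truth, preds))
        mapes.append(mape(truth, preds))
    return sum(rmses) / k, sum(mapes) / k

# Synthetic data (an assumption for this sketch): y = 5 + 2x + noise
rng = random.Random(42)
xs = [rng.uniform(1, 10) for _ in range(200)]
ys = [5.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

candidates = [("naive mean", fit_mean), ("linear", fit_line)]
results = {name: cross_validate(fit, xs, ys) for name, fit in candidates}
best = min(results, key=lambda name: results[name][0])  # lowest cross-validated RMSE
```

&lt;P&gt;Comparing against the naive benchmark shows whether a candidate model adds anything at all; domain expertise still decides between candidates with similar scores.&lt;/P&gt;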
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You might be interested in these resources as well:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Which Model When?" uid="725618" url="https://community.jmp.com/t5/Mastering-JMP/Which-Model-When/m-p/725618#U725618" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Specifying and Fitting Models" uid="738847" url="https://community.jmp.com/t5/Mastering-JMP/Specifying-and-Fitting-Models/m-p/738847#U738847" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Building and Understanding Predictive Models" uid="653896" url="https://community.jmp.com/t5/Learning-Center/Building-and-Understanding-Predictive-Models/m-p/653896#U653896" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Building Predictive Models for Correlated and High Dimensional Data" uid="618888" url="https://community.jmp.com/t5/Mastering-JMP/Building-Predictive-Models-for-Correlated-and-High-Dimensional/m-p/618888#U618888" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Building Better Predictive Models Using JMP Pro - Model Screening" uid="296048" url="https://community.jmp.com/t5/Mastering-JMP/Building-Better-Predictive-Models-Using-JMP-Pro-Model-Screening/m-p/296048#U296048" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;(it can help you screen parametric and Machine Learning model options and compare them simultaneously)&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Predictive Modeling" uid="274837" url="https://community.jmp.com/t5/Statistical-Thinking-for/Predictive-Modeling/m-p/274837#U274837" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="STIPS Module 7: Predictive Modeling and Text Mining" uid="739987" url="https://community.jmp.com/t5/Statistical-Thinking-for/STIPS-Module-7-Predictive-Modeling-and-Text-Mining/m-p/739987#U739987" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="What Model When? (and Which Modeling Type?) - (2023-US-PO-1509)" uid="651698" url="https://community.jmp.com/t5/Discovery-Summit-Americas-2023/What-Model-When-and-Which-Modeling-Type-2023-US-PO-1509/m-p/651698#U651698" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Choosing Models in JMP with Model Selection Criteria - (2023-US-30MP-1456)" uid="651672" url="https://community.jmp.com/t5/Discovery-Summit-Americas-2023/Choosing-Models-in-JMP-with-Model-Selection-Criteria-2023-US/m-p/651672#U651672" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;LI-MESSAGE title="Data Mining and Predictive Modeling" uid="310425" url="https://community.jmp.com/t5/Mastering-JMP/Data-Mining-and-Predictive-Modeling/m-p/310425#U310425" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-tkb-thread lia-fa-icon lia-fa-tkb lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;I hope this first discussion starter will help you,&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jul 2024 12:47:06 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/771951#M95255</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2024-07-10T12:47:06Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772496#M95288</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I want to explain my goal more precisely:&lt;/P&gt;&lt;P&gt;I have an initial dataset related to cancer patient data. I want to find a suitable parametric statistical model that best fits my dataset. Using this model, I aim to simulate data in order to apply a shift in the model's intercept. Ultimately, I want to see how quickly my pre-existing machine learning model can detect this shift in a control chart (i.e., obtaining the ARL). For this purpose, I need the best-fitting model on my data so that I can simulate data from it.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2024 12:05:32 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772496#M95288</guid>
      <dc:creator>maryam_nourmand</dc:creator>
      <dc:date>2024-07-11T12:05:32Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772508#M95291</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/56938"&gt;@maryam_nourmand&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;OK, then you're more likely in a Data Mining setting, with a fixed dataset of observational data on which you try to fit an acceptable predictive model.&amp;nbsp;Using &lt;A href="https://www.jmp.com/support/help/en/17.2/#page/jmp/make-validation-column.shtml#" target="_self"&gt;validation columns&lt;/A&gt;&amp;nbsp;(with stratification on your features) will help validate (to avoid overfitting) and test your model on production data.&amp;nbsp;In terms of modeling strategy, you'll very likely use &lt;A href="https://www.jmp.com/support/help/en/17.2/#page/jmp/stepwise-regression-models.shtml" target="_self"&gt;stepwise approaches&lt;/A&gt;&amp;nbsp;to select features, and &lt;A href="https://www.jmp.com/support/help/en/17.2/#page/jmp/generalized-regression-models.shtml#" target="_self"&gt;Generalized Regression approaches&lt;/A&gt; with the validation column method.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As an example, I used the Cancer_Data dataset from Kaggle to predict whether a tumor is benign or malignant based on individual characteristics:&amp;nbsp;&lt;A href="https://www.kaggle.com/datasets/erdemtaha/cancer-data?resource=download" target="_blank"&gt;https://www.kaggle.com/datasets/erdemtaha/cancer-data?resource=download&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;After a first exploratory data analysis focused mainly on distributions and correlations between features, I created a validation column formula (stratification on the features) with 70/20/10 ratios for the training/validation/test sets. I then used a Generalized Regression model with the validation column method, with all main effects and two-feature interaction terms entered as candidate terms in the model. I chose an adaptive Elastic Net because the features are strongly correlated, but there might be other options as well, like PLS or PCA pre-processing.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
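&lt;P&gt;For readers working outside JMP, the idea behind a stratified 70/20/10 validation column can be sketched in Python as below. This is only an illustration under stated assumptions: it stratifies on a single categorical key (hypothetical labels "B" and "M") rather than on several features, and `make_validation_column` is a made-up helper, not the JMP implementation.&lt;/P&gt;

```python
import random
from collections import Counter, defaultdict

def make_validation_column(strata, ratios=(0.7, 0.2, 0.1), seed=1):
    """Assign each row to Training/Validation/Test, keeping the ratios within each stratum."""
    labels = ("Training", "Validation", "Test")
    groups = defaultdict(list)
    for row, key in enumerate(strata):
        groups[key].append(row)
    rng = random.Random(seed)
    column = [None] * len(strata)
    for rows in groups.values():
        rng.shuffle(rows)
        n_train = round(ratios[0] * len(rows))
        n_valid = round(ratios[1] * len(rows))
        for position, row in enumerate(rows):
            if position >= n_train + n_valid:
                column[row] = labels[2]  # remainder of the stratum goes to Test
            elif position >= n_train:
                column[row] = labels[1]
            else:
                column[row] = labels[0]
    return column

# Hypothetical stratification key: 60 rows labelled "B" and 40 labelled "M"
strata = ["B"] * 60 + ["M"] * 40
validation = make_validation_column(strata)
counts = Counter(zip(strata, validation))
```

&lt;P&gt;Because the split is done inside each stratum, both labels keep the same 70/20/10 proportions in all three sets.&lt;/P&gt;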
&lt;P&gt;You can then save the formula of this model and use the Prediction Profiler to create &lt;A href="https://www.jmp.com/support/help/en/17.2/#page/jmp/simulator.shtml#" target="_self"&gt;simulations&lt;/A&gt;, to assess the impact of feature effects on the response, and possibly to estimate the effect of increasing noise on the predicted response.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I hope these few options may help you for your topic,&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2024 13:50:19 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772508#M95291</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2024-07-11T13:50:19Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772523#M95294</link>
      <description>&lt;P&gt;Your response prompts me to ask:&amp;nbsp; what do you mean by a "parametric statistical model"?&amp;nbsp; I usually think of machine learning models as non-parametric, so are you excluding such models here?&amp;nbsp; It is unclear, since you say you have a pre-existing machine learning model.&amp;nbsp; Do you want to compare a parametric and a non-parametric model?&amp;nbsp; If you use the Model Screening platform, you can build a number of both types of predictive models.&amp;nbsp; Using whatever you find to be the "best fitting," you can then save the prediction formula in order to do simulations.&amp;nbsp; I'm not entirely sure what you mean by a "shift in the model's intercept," but I think you could just put an additive disturbance into the formula to generate the simulated data.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2024 15:06:05 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772523#M95294</guid>
      <dc:creator>dlehman1</dc:creator>
      <dc:date>2024-07-11T15:06:05Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772587#M95302</link>
      <description>&lt;P&gt;Thank you for your response.&lt;/P&gt;&lt;P&gt;But I think for simulation, it is better to go to the section `save columns-&amp;gt;save simulation formula` because it contains a formula for simulation that I can use for writing code as well, right?&lt;/P&gt;&lt;P&gt;And thank you for introducing the dataset.&lt;/P&gt;&lt;P&gt;But do you have a dataset related to a treatment process that includes multiple stages? I mean, after the first treatment, the disease relapses, and the second treatment is performed, and the treatment information such as drug dose, etc., is recorded. If you have such a dataset, I would appreciate it if you could share it.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2024 19:22:47 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772587#M95302</guid>
      <dc:creator>maryam_nourmand</dc:creator>
      <dc:date>2024-07-11T19:22:47Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772589#M95304</link>
      <description>&lt;P&gt;Yes, by "parametric model" I do not mean machine learning models.&lt;/P&gt;&lt;P&gt;My goal in finding the best parametric model is to simulate and generate data with a larger quantity than my initial dataset. I want to use more data for my machine learning model, but I want the simulated data to closely resemble and be similar to my initial dataset.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2024 19:27:36 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772589#M95304</guid>
      <dc:creator>maryam_nourmand</dc:creator>
      <dc:date>2024-07-11T19:27:36Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772611#M95308</link>
      <description>&lt;P&gt;I don't see why you need a parametric model to do that.&amp;nbsp; You can run any predictive model and use the Profiler to simulate any number of additional data points, specifying different values for the independent variables and adding random noise to the predictions.&amp;nbsp; Perhaps I am not understanding what you intend to do, but I don't see the parametric model part of this as necessary.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2024 21:33:29 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772611#M95308</guid>
      <dc:creator>dlehman1</dc:creator>
      <dc:date>2024-07-11T21:33:29Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772633#M95313</link>
      <description>&lt;P&gt;What is the difference between saving simulation formulas through "Save Columns -&amp;gt; Save Simulation Formula" and simulating using the Profiler tool?&lt;/P&gt;&lt;P&gt;The reason I am using a parametric model for simulation is that I want to have the relationship and the simulation formula so that I can write the corresponding code and repeat this process 100 times in a loop (to obtain the ARL of my control chart based on the model I constructed in phase I ).&lt;/P&gt;&lt;P&gt;If I were to do this manually with the software, it would be time-consuming and difficult.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2024 01:45:39 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772633#M95313</guid>
      <dc:creator>maryam_nourmand</dc:creator>
      <dc:date>2024-07-12T01:45:39Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772649#M95317</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/56938"&gt;@maryam_nourmand&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It depends on what your objective is.&lt;/P&gt;
&lt;P&gt;Saving a simulation formula can help you assess the distributions of your model's coefficients, by switching out the diagnosis response and switching in the diagnosis simulation formula:&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/17.2/index.shtml#page/jmp/simulate.shtml#ww262668" target="_blank" rel="noopener"&gt;Simulate&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Victor_G_0-1720768352175.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/66090iCCA1EAAFD0447F4D/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Victor_G_0-1720768352175.png" alt="Victor_G_0-1720768352175.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;The simulation formula is just a condensed version of the prediction formulas you can save from the model fit report: instead of having one probability column for each class and a final classification column, you save a single column with the probability calculation and the classification inside the same formula.&lt;/P&gt;
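&lt;P&gt;Conceptually, switching a simulated response in for the observed one and refitting many times resembles a parametric bootstrap. The Python sketch below shows that general idea for a one-feature linear model on synthetic data; it is an assumption-laden illustration (hypothetical names, Gaussian noise, a rough residual standard deviation), not a description of what JMP does internally.&lt;/P&gt;

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (intercept, slope)."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope

# Synthetic "observed" data (an assumption for this sketch)
rng = random.Random(7)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

# Fit once on the observed response
intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
residual_sd = statistics.stdev(residuals)  # rough noise estimate

# Repeatedly simulate a new response from the fitted model and refit
slopes = []
for _ in range(500):
    simulated = [intercept + slope * x + rng.gauss(0, residual_sd) for x in xs]
    slopes.append(fit_line(xs, simulated)[1])

slope_mean = statistics.mean(slopes)
slope_spread = statistics.stdev(slopes)  # approximates the slope's standard error
```

&lt;P&gt;The spread of the refitted slopes gives a feel for how much the coefficient would vary from sample to sample if the fitted model were the true data-generating process.&lt;/P&gt;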
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you're more interested in the robustness of your model to variations in your inputs, then using the &lt;A href="https://www.jmp.com/support/help/en/17.2/index.shtml#page/jmp/simulator.shtml#" target="_self"&gt;Simulator&lt;/A&gt; from the Prediction Profiler with variations in the inputs (and possibly added noise in the output) and running a &lt;A href="https://www.jmp.com/support/help/en/17.2/index.shtml#page/jmp/simulation-experiment.shtml" target="_blank" rel="noopener"&gt;Simulation Experiment&lt;/A&gt; may help you. You can then generate variations in your inputs, and possibly shift the distributions of your features to see how that affects your model, as well as increase the noise to see how robust your prediction model is.&lt;/P&gt;
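&lt;P&gt;The same kind of experiment can be sketched in Python as a small Monte Carlo run. Everything below is hypothetical: the `predict` function stands in for a saved prediction formula, and the input distributions, coefficients, and noise levels are invented for illustration only.&lt;/P&gt;

```python
import random
import statistics

def predict(weight, tumor_size):
    """Hypothetical stand-in for a saved prediction formula."""
    return -0.7 + 0.02 * weight + 0.4 * tumor_size

def simulate(n, weight_mean=63.0, size_mean=2.5, noise_sd=0.0, seed=0):
    """Draw random inputs, push them through the formula, and add output noise."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n):
        weight = rng.gauss(weight_mean, 10.0)
        tumor_size = rng.gauss(size_mean, 0.8)
        outputs.append(predict(weight, tumor_size) + rng.gauss(0.0, noise_sd))
    return outputs

baseline = simulate(5000)                # inputs vary, no output noise
shifted = simulate(5000, size_mean=3.5)  # shift one input distribution
noisy = simulate(5000, noise_sd=0.5)     # add noise to the output

mean_shift = statistics.mean(shifted) - statistics.mean(baseline)
spread_baseline = statistics.stdev(baseline)
spread_noisy = statistics.stdev(noisy)
```

&lt;P&gt;Using the same seed in each run keeps the random draws aligned, so differences between runs come only from the shift or the added noise.&lt;/P&gt;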
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;On a side note, and related to the points brought up by&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/53879"&gt;@dlehman1&lt;/a&gt;:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;My goal in finding the best parametric model is to simulate and generate data with a larger quantity than my initial dataset. I want to use more data for my machine learning model, but I want the simulated data to closely resemble and be similar to my initial dataset.&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;In order to simulate and generate data that is similar to the real data collected, you have to "mimic" the data generation process. If there are strong non-linearities, correlations, or other patterns found by your ML model and not considered by the parametric model, you should simulate and generate data with the ML model, or else you'll introduce a strong bias in the simulation/generated data. I don't understand the need to have a separate model for data generation and prediction for this use case.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Note that no matter how predictive a model might be, it's only a simplification of a phenomenon you're trying to understand and predict, so the data generation from this model may still be more or less biased (and probably generate less noisy outcomes than real data).&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;But it can still be useful to assess its robustness to variations in the data like you intend to do.&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;And no, sorry, I don't have a dataset that includes multi-stage treatment. I just found this one and used it as an illustrative example that could fit your use case, that's all.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Hope this answer will help you,&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2024 07:34:23 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772649#M95317</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2024-07-12T07:34:23Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772673#M95321</link>
      <description>&lt;P&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/48779"&gt;@VictorG&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I guess I'm going to learn some new JMP features.&amp;nbsp; Please tell me where you got the Simulation window with the Columns to switch out that you are showing.&amp;nbsp; I can't find it anywhere.&amp;nbsp; I do see where to run simulation experiments, but I can't find anywhere that prompts saving a simulation formula.&amp;nbsp; Thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2024 10:43:18 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772673#M95321</guid>
      <dc:creator>dlehman1</dc:creator>
      <dc:date>2024-07-12T10:43:18Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772679#M95324</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/53879"&gt;@dlehman1&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The simulation feature can be accessed by right-clicking on any panel/report of a modeling platform:&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/17.2/#page/jmp/launch-the-simulate-feature.shtml#" target="_blank" rel="noopener"&gt;Launch the Simulate Feature&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is an example on the model summary of an SVM:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Victor_G_0-1720782020605.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/66093i033F8C9313D816BF/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Victor_G_0-1720782020605.png" alt="Victor_G_0-1720782020605.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;Then you'll be prompted to choose the columns to switch in and switch out. In the context of Simulate, you could&amp;nbsp;use it on the parameter estimates of a linear regression model, switching out the response for the simulated response to assess the distribution of the parameter estimates, or on the model summary to assess the consistency of the model's performance:&amp;nbsp;&lt;A href="https://www.jmp.com/support/help/en/17.2/#page/jmp/the-simulate-window.shtml#" target="_blank" rel="noopener"&gt;The Simulate Window&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As for the simulation formula, it can be found and saved as one of the options under "Save Columns" in the Generalized Regression platform:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Victor_G_1-1720782436018.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/66094i660BF6C792A06AED/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Victor_G_1-1720782436018.png" alt="Victor_G_1-1720782436018.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I use this "Simulate" option to assess the robustness of Machine Learning models using a Validation column formula. With this column set as the validation column in the model dialog window, I can then simulate and switch these column values in and out, to represent slightly different training/validation/test sets with the same stratification or grouping method. It can help assess the robustness of the algorithm (as cross-validation would also do: &lt;A href="https://community.jmp.com/t5/Discussions/How-can-I-automate-and-summarize-many-repeat-validations-into/m-p/632192/highlight/true#M83061" target="_blank" rel="noopener"&gt;https://community.jmp.com/t5/Discussions/How-can-I-automate-and-summarize-many-repeat-validations-into/m-p/632192/highlight/true#M83061&lt;/A&gt;), and possibly test the performance improvement (if any) obtained by fine-tuning the model's hyperparameters (&lt;A href="https://community.jmp.com/t5/Discussions/Boosted-Tree-Tuning-TABLE-DESIGN/m-p/609591/highlight/true#M81062" target="_blank" rel="noopener"&gt;https://community.jmp.com/t5/Discussions/Boosted-Tree-Tuning-TABLE-DESIGN/m-p/609591/highlight/true#M81062&lt;/A&gt;).&lt;/P&gt;
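&lt;P&gt;The underlying idea, refitting the same model on many slightly different training/validation splits and looking at how much the validation metric moves, can be sketched in Python as a repeated holdout. This is an illustrative assumption rather than the JMP mechanism: the data are synthetic, the model is a one-feature linear fit, and `holdout_rmse` is a hypothetical helper.&lt;/P&gt;

```python
import math
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (intercept, slope)."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope

def holdout_rmse(xs, ys, seed, train_fraction=0.7):
    """Fit on one random training split and return the RMSE on the rest."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * len(idx))
    train, valid = idx[:cut], idx[cut:]
    intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
    errors = [ys[i] - (intercept + slope * xs[i]) for i in valid]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Synthetic data (an assumption for this sketch)
rng = random.Random(3)
xs = [rng.uniform(0, 10) for _ in range(150)]
ys = [2.0 + 1.5 * x + rng.gauss(0, 1) for x in xs]

# One score per re-drawn split; the spread indicates how stable the model is
scores = [holdout_rmse(xs, ys, seed) for seed in range(50)]
score_mean = statistics.mean(scores)
score_spread = statistics.stdev(scores)
```

&lt;P&gt;A small spread relative to the mean suggests the performance does not depend much on which rows happened to land in the training set.&lt;/P&gt;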
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this answer may help you,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2024 11:17:46 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772679#M95324</guid>
      <dc:creator>Victor_G</dc:creator>
      <dc:date>2024-07-12T11:17:46Z</dc:date>
    </item>
    <item>
      <title>Re: best model</title>
      <link>https://community.jmp.com/t5/Discussions/best-model/m-p/772834#M95341</link>
      <description>&lt;P&gt;Hello again,&lt;BR /&gt;For example, look at this simulation formula:&lt;BR /&gt;(-0.741104921699375) + 0.0237113448295257 * :weight + Match( :sex,&lt;BR /&gt;0, (:age - 33.8085106382979) * 0.0091364700024452,&lt;BR /&gt;1, (:age - 33.8085106382979) * 0,&lt;BR /&gt;.&lt;BR /&gt;) + (:weight - 63.3936170212766) * Match( :family, 0, -0.015643799304455, 1, 0, . ) + (:tumor size - 2.49893617021277) *&lt;BR /&gt;Match( :family, 0, 0.441690642908435, 1, 0, . ) + (:tumor size - 2.49893617021277) * Match( :hyper, 0, -0.346279666197286, 1, 0, . )&lt;BR /&gt;+Match( :hyper, 0, Match( :dissection, 0, 0.354356313550484, 1, 0, . ), 1, Match( :dissection, 0, 0, 1, 0, . ), . ) + (0.15468861367462&lt;BR /&gt;+ 0) * Tangent( Pi() * (Random Uniform() - 0.5) )&lt;BR /&gt;&lt;BR /&gt;I need a formula like this for my project because I have to write Python code and generate data 100 times to calculate the ARL.&lt;BR /&gt;Did I manage to convey my point?&lt;/P&gt;</description>
      <pubDate>Sat, 13 Jul 2024 05:10:34 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/best-model/m-p/772834#M95341</guid>
      <dc:creator>maryam_nourmand</dc:creator>
      <dc:date>2024-07-13T05:10:34Z</dc:date>
    </item>
  </channel>
</rss>

