<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Help choosing JMP Analysis for Multiple Factors in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36964#M21691</link>
    <description>&lt;P&gt;Whenever I start a predictive modeling exercise, especially if I inherited the data from somewhere else with little knowledge of how, where, when, and under what circumstances the data were collected, I spend some time in what I call 'getting acquainted with the data' mode. I look for things like data quality, unusual or suspicious observations, missing values (you have none of these), nonsense values, and any other feature that sticks out at me that might make modeling problematic. I always start with the Distribution platform to just get a feel for "Where's the middle, how spread out is the data, and is there anything odd or unusual going on?" From there, especially with a relatively small set of predictor variables, I just use the Fit Y by X platform to look for relationships between predictors and responses...and compare what I see with my process/domain knowledge. If a scatter plot suggests that 'water runs uphill' (in other words, it runs counter to known laws of physics, chemistry, biology, socioeconomic behavior, etc.) then I start to get suspicious and suspend the modeling work until I get to the bottom of the issues.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Data cleaning and prep is never fun...and takes work...but it's absolutely necessary.&lt;/P&gt;</description>
    <pubDate>Wed, 08 Mar 2017 20:10:04 GMT</pubDate>
    <dc:creator>Peter_Bartell</dc:creator>
    <dc:date>2017-03-08T20:10:04Z</dc:date>
    <item>
      <title>Help choosing JMP Analysis for Multiple Factors</title>
      <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36890#M21642</link>
      <description>&lt;P&gt;Hello! I am using JMP 13 (regular) and learning about predictive and specialized modeling.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a dataset (attached) with over 100,000 observations of a process. 75% of the observations had a duration of 4 days or less. I'd like to know why some observations took over 4 days. I've identified some possible factors: 1 continuous and 16 categorical.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can you please suggest one or more JMP analyses that I could apply to the data?&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jun 2023 00:16:11 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36890#M21642</guid>
      <dc:creator>jb</dc:creator>
      <dc:date>2023-06-09T00:16:11Z</dc:date>
    </item>
    <item>
      <title>Re: Help choosing JMP Analysis for Multiple Factors</title>
      <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36893#M21644</link>
      <description>&lt;P&gt;I take it the Y column is the number of days? If so, one can make an indicator column to flag the observations that are &amp;gt; 4 days and then use that column as the response in a model to see which variables have the greatest impact on predicting the indicator variable.&lt;/P&gt;
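&lt;P&gt;As a rough illustration of the indicator idea outside JMP, here is a minimal Python sketch. The duration values, the name over_threshold, and the variable names are all made up for the example; only the 4-day cutoff comes from the question.&lt;/P&gt;

```python
# Hypothetical durations in days; the real data set is not reproduced here.
durations_days = [1, 3, 4, 5, 2, 9, 4, 6]

THRESHOLD_DAYS = 4  # cutoff taken from the question ("over 4 days")


def over_threshold(days, threshold=THRESHOLD_DAYS):
    """Return 1 when the duration exceeds the threshold, otherwise 0."""
    return 1 if days > threshold else 0


# One 0/1 flag per observation; this plays the role of the indicator column.
indicator = [over_threshold(d) for d in durations_days]

# Share of observations flagged as taking longer than the threshold.
share_over = sum(indicator) / len(indicator)

print(indicator)
print(share_over)
```

&lt;P&gt;A duration of exactly 4 days is coded 0 here, which matches "4 days or less" in the question.&lt;/P&gt;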
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Chris&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2017 18:52:54 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36893#M21644</guid>
      <dc:creator>Chris_Kirchberg</dc:creator>
      <dc:date>2017-03-07T18:52:54Z</dc:date>
    </item>
    <item>
      <title>Re: Help choosing JMP Analysis for Multiple Factors</title>
      <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36896#M21646</link>
      <description>&lt;P&gt;Hi Chris. Yes, the Y column is the number of days. Thanks for the suggestion to create an indicator column based on it.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2017 19:38:27 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36896#M21646</guid>
      <dc:creator>jb</dc:creator>
      <dc:date>2017-03-07T19:38:27Z</dc:date>
    </item>
    <item>
      <title>Re: Help choosing JMP Analysis for Multiple Factors</title>
      <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36897#M21647</link>
      <description>&lt;P&gt;You're welcome.&lt;BR /&gt;Decision trees are a good, quick way to see the possible impact, and you can use an indicator column.&lt;BR /&gt;A simple column formula like this works (substitute the name of your duration column):&lt;BR /&gt;If(:Hours &amp;gt; 4, 1, 0)&lt;BR /&gt;&lt;BR /&gt;You can then change the column's modeling type to Nominal and use it as the response.&lt;BR /&gt;&lt;BR /&gt;Best,&lt;BR /&gt;&lt;BR /&gt;Chris&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2017 18:52:54 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36897#M21647</guid>
      <dc:creator>Chris_Kirchberg</dc:creator>
      <dc:date>2017-03-08T18:52:54Z</dc:date>
    </item>
    <item>
      <title>Re: Help choosing JMP Analysis for Multiple Factors</title>
      <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36898#M21648</link>
      <description>&lt;P&gt;Once your target column is built you might start with a partition or logistic regression.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Partition&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.jmp.com/support/help/Partition_Models.shtml" target="_blank"&gt;http://www.jmp.com/support/help/Partition_Models.shtml&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Webcast: &lt;A href="https://www.jmp.com/en_us/events/ondemand/building-better-models/decision-trees.html" target="_blank"&gt;https://www.jmp.com/en_us/events/ondemand/building-better-models/decision-trees.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Logistic Regression&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.jmp.com/support/help/Logistic_Analysis.shtml#274628" target="_blank"&gt;http://www.jmp.com/support/help/Logistic_Analysis.shtml#274628&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2017 19:45:33 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36898#M21648</guid>
      <dc:creator>ih</dc:creator>
      <dc:date>2017-03-07T19:45:33Z</dc:date>
    </item>
    <item>
      <title>Re: Help choosing JMP Analysis for Multiple Factors</title>
      <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36961#M21688</link>
      <description>&lt;P&gt;I took a quick look at the raw data. Are you aware that a few of the X variables are only at "1"?&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2017 18:31:26 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36961#M21688</guid>
      <dc:creator>Peter_Bartell</dc:creator>
      <dc:date>2017-03-08T18:31:26Z</dc:date>
    </item>
    <item>
      <title>Re: Help choosing JMP Analysis for Multiple Factors</title>
      <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36962#M21689</link>
      <description>&lt;P&gt;One other feature of the raw data: many of the categorical x variables have a high proportion of observations at one level ('1' or '0') and a relatively low proportion at the other level, in the 99% to 1% range or more extreme for many of them. I'd make sure I spend some time pondering my cross validation and model validation schemes. Since you aren't running JMP Pro this is gonna create some work for you...but with very few observations at one of the levels for any one categorical x variable, I worry about cross validation, overfitting, or just not having enough observations at the lower-proportion level for a signal to rise above the noise.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2017 19:17:11 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36962#M21689</guid>
      <dc:creator>Peter_Bartell</dc:creator>
      <dc:date>2017-03-08T19:17:11Z</dc:date>
    </item>
    <item>
      <title>Re: Help choosing JMP Analysis for Multiple Factors</title>
      <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36963#M21690</link>
      <description>&lt;P&gt;Hi Peter, thanks for the feedback on the categorical variables in my data. I didn't realize some had values of only 1, and others had values in very low proportions. I think I'll go back and learn more about these categorical variables to see if I can exclude them from my analysis.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2017 19:58:02 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36963#M21690</guid>
      <dc:creator>jb</dc:creator>
      <dc:date>2017-03-08T19:58:02Z</dc:date>
    </item>
    <item>
      <title>Re: Help choosing JMP Analysis for Multiple Factors</title>
      <link>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36964#M21691</link>
      <description>&lt;P&gt;Whenever I start a predictive modeling exercise, especially if I inherited the data from somewhere else with little knowledge of how, where, when, and under what circumstances the data were collected, I spend some time in what I call 'getting acquainted with the data' mode. I look for things like data quality, unusual or suspicious observations, missing values (you have none of these), nonsense values, and any other feature that sticks out at me that might make modeling problematic. I always start with the Distribution platform to just get a feel for "Where's the middle, how spread out is the data, and is there anything odd or unusual going on?" From there, especially with a relatively small set of predictor variables, I just use the Fit Y by X platform to look for relationships between predictors and responses...and compare what I see with my process/domain knowledge. If a scatter plot suggests that 'water runs uphill' (in other words, it runs counter to known laws of physics, chemistry, biology, socioeconomic behavior, etc.) then I start to get suspicious and suspend the modeling work until I get to the bottom of the issues.&lt;/P&gt;
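&lt;P&gt;For anyone who wants to run the same kind of first-pass screen outside JMP, here is a rough Python sketch. It is not the Distribution platform, just a tally of level proportions per column. The rows, the column names x1 to x3, the helper names, and the 1% cutoff are all assumptions made up for the example.&lt;/P&gt;

```python
from collections import Counter

# Hypothetical rows: x1 never varies, x2 has one rare level, x3 is balanced.
rows = [{"x1": 1, "x2": 1 if i == 0 else 0, "x3": i % 2} for i in range(200)]


def level_proportions(rows, column):
    """Return the share of rows at each level of one categorical column."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}


def screen_columns(rows, columns, rare_cutoff=0.01):
    """Label each column as constant, as having a rare level, or as ok."""
    report = {}
    for col in columns:
        props = level_proportions(rows, col)
        if len(props) == 1:
            report[col] = "constant"    # no variation, so nothing to model
        elif min(props.values()) > rare_cutoff:
            report[col] = "ok"
        else:
            report[col] = "rare level"  # rarest level at or below the cutoff
    return report


report = screen_columns(rows, ["x1", "x2", "x3"])
print(report)
```

&lt;P&gt;Columns labeled constant carry no information for a model, and columns with a rare level deserve a closer look before any validation scheme is chosen.&lt;/P&gt;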
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Data cleaning and prep is never fun...and takes work...but it's absolutely necessary.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2017 20:10:04 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Help-choosing-JMP-Analysis-for-Multiple-Factors/m-p/36964#M21691</guid>
      <dc:creator>Peter_Bartell</dc:creator>
      <dc:date>2017-03-08T20:10:04Z</dc:date>
    </item>
  </channel>
</rss>

