<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic PCA to separate two outcomes in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359229#M60931</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a binary outcome for a set of patients, and a long list of measured variables (i.e. potential predictors/characteristics). I can do a PCA on the whole set, or a PCA on each set separately (i.e. positive patients and negative patients separately). But what I really want is the following: what are the two principal components that best separate the positives from the negatives? I've played a bit with the discriminant analysis but haven't got too far. Any help or suggestions would be welcome.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;uriel&lt;/P&gt;</description>
    <pubDate>Fri, 09 Jun 2023 00:28:59 GMT</pubDate>
    <dc:creator>utkcito</dc:creator>
    <dc:date>2023-06-09T00:28:59Z</dc:date>
    <item>
      <title>PCA to separate two outcomes</title>
      <link>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359229#M60931</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a binary outcome for a set of patients, and a long list of measured variables (i.e. potential predictors/characteristics). I can do a PCA on the whole set, or a PCA on each set separately (i.e. positive patients and negative patients separately). But what I really want is the following: what are the two principal components that best separate the positives from the negatives? I've played a bit with the discriminant analysis but haven't got too far. Any help or suggestions would be welcome.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;uriel&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jun 2023 00:28:59 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359229#M60931</guid>
      <dc:creator>utkcito</dc:creator>
      <dc:date>2023-06-09T00:28:59Z</dc:date>
    </item>
    <item>
      <title>Re: PCA to separate two outcomes</title>
      <link>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359255#M60933</link>
      <description>Hi,&lt;BR /&gt;Have you considered the Discriminant Analysis platform, or the Response Screening platform if you need to narrow down which variables are most relevant to your classification?&lt;BR /&gt;Best,&lt;BR /&gt;TS</description>
      <pubDate>Tue, 16 Feb 2021 16:34:35 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359255#M60933</guid>
      <dc:creator>Thierry_S</dc:creator>
      <dc:date>2021-02-16T16:34:35Z</dc:date>
    </item>
    <item>
      <title>Re: PCA to separate two outcomes</title>
      <link>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359259#M60935</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/5314"&gt;@utkcito&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; PCA might not be the right tool for you. Since you have a binary outcome (on/off, 1/0, dead/alive), it sounds more like a logistic regression problem, much like the Titanic survivors example in the sample data in JMP. The purpose of PCA isn't necessarily the prediction of an outcome, i.e. separating out 0/1 outcomes. You might be better off with a decision tree, neural net, or other tree-based methods like XGBoost. You could even try support vector machines. Many of those options depend on what version of JMP you have (e.g. Pro or standard).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; PCA is really designed to find a set of linearly independent vectors (of the predictor data set -- you would not feed it the response data) that maximize the&amp;nbsp;&lt;EM&gt;variability&lt;/EM&gt; explained in the data along a set of orthogonal principal components. You can read up on it in JMP's help &lt;A href="https://www.jmp.com/support/help/en/15.2/index.shtml#page/jmp/principal-components.shtml" target="_self"&gt;here&lt;/A&gt;. The principal components won't necessarily be good predictors to separate the positives from the negatives. This problem might be much better suited to the SVM platform. If you don't have Pro, you'll probably want to try either the partitioning decision tree method or the neural net.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; Another thing: depending on how many rows you have, you might want to generate a validation column stratified on the outcome column and use that to validate your model.
You might also consider adding a null factor column (see the autovalidation add-in for JMP) to see if the set of possible predictors actually shows up more often in the model than a completely random and orthogonal null factor.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Good luck!&lt;/P&gt;&lt;P&gt;DS&lt;/P&gt;</description>
      <pubDate>Tue, 16 Feb 2021 16:37:57 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359259#M60935</guid>
      <dc:creator>SDF1</dc:creator>
      <dc:date>2021-02-16T16:37:57Z</dc:date>
    </item>
    <item>
      <title>Re: PCA to separate two outcomes</title>
      <link>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359323#M60941</link>
      <description>&lt;P&gt;Hi TS and DS,&lt;/P&gt;&lt;P&gt;I am not looking for predictors with the PCA. I want to characterize and describe. I think the discriminant analysis should do it under the wide linear method, as it shows principal components, but I can't find the details of the principal components it generates to assess whether that really is the case.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Uriel.&lt;/P&gt;</description>
      <pubDate>Tue, 16 Feb 2021 18:07:52 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359323#M60941</guid>
      <dc:creator>utkcito</dc:creator>
      <dc:date>2021-02-16T18:07:52Z</dc:date>
    </item>
    <item>
      <title>Re: PCA to separate two outcomes</title>
      <link>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359548#M60958</link>
      <description>&lt;P&gt;You might consider PLS-DA instead of PCA. PLS ensures that the first components relate to the most variation in your target; in other words, for a single Y you should always see the best separation in score plots of components 1 and 2.&amp;nbsp; You could then look at loadings, VIP vs Coefficients, or use methods you are used to in PCA to assess variable importance.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can launch PLS-DA either using the PLS Personality of the &lt;STRONG&gt;Fit Model&lt;/STRONG&gt; platform, or by creating a variable that is either 1 or 0 depending on the category, and then including that in the 'Y' section of the &lt;STRONG&gt;PLS&lt;/STRONG&gt; platform.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2021 04:44:55 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359548#M60958</guid>
      <dc:creator>ih</dc:creator>
      <dc:date>2021-02-17T04:44:55Z</dc:date>
    </item>
    <item>
      <title>Re: PCA to separate two outcomes</title>
      <link>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359670#M60961</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/5314"&gt;@utkcito&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; I agree with&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/6657"&gt;@ih&lt;/a&gt;&amp;nbsp;: the PLS platform might be better suited to what you want to do, or the SVM platform too.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; One reason why the discriminant analysis might not be working is that it sounds like your data set might not entirely fit the expected format for the DA. The DA requires categorical X's and continuous Y responses, and it is kind of the inverse, using the continuous Y's to ultimately predict the X's, i.e. it predicts a classification variable based on a known continuous Y variable.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; If you want to use the X's to characterize/describe the outcome Y's, then PLS, partitioning, or SVM might be better platforms, I think. And as&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/6657"&gt;@ih&lt;/a&gt;&amp;nbsp;mentioned, you can use the VIP plots along with the VIP threshold to generate a simplified model that doesn't use every single X column. Clustering could also be an option, but it might not make as much sense as the PLS when trying to interpret the results.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; Hope you can use one of those platforms to help your work.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Good luck!&lt;/P&gt;&lt;P&gt;DS&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2021 13:21:29 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359670#M60961</guid>
      <dc:creator>SDF1</dc:creator>
      <dc:date>2021-02-17T13:21:29Z</dc:date>
    </item>
    <item>
      <title>Re: PCA to separate two outcomes</title>
      <link>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359809#M60976</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/5314"&gt;@utkcito&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;While I fully agree with the others on potential options for analyzing your data, I would like to offer up another JMP Pro platform that will be well worth your time to try out.&amp;nbsp; Generalized Regression (GR) is extremely well suited for binomial/binary outcomes.&amp;nbsp; The LASSO and Elastic Net algorithms will allow you to find the best overall combination of variables in the factor space you are working in, and not in a principal component or latent factor (PLS) space.&amp;nbsp; With both PCA and PLS models, it can be difficult to determine what is most important without a lot of extra effort.&amp;nbsp; You also have many options for cross-validation to avoid overfitting.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Neural Nets (NN) might be useful to you.&amp;nbsp; The TanH (hyperbolic tangent) function is very well suited for binomial outcomes, and you will be automatically directed to use cross-validation to avoid overfitting.&amp;nbsp; (JMP and JMP Pro)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bootstrap Forest as a partitioning method will also allow you to find the most significant factors in your factor space. (JMP Pro)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One last thought: if you have access to JMP 16 EA, you can try Model Screening under Predictive Modeling to help guide you to the best overall model for your data.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Model Screening will be one of the great new features in JMP Pro 16.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2021 19:02:14 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/359809#M60976</guid>
      <dc:creator>Bill_Worley</dc:creator>
      <dc:date>2021-02-17T19:02:14Z</dc:date>
    </item>
    <item>
      <title>Re: PCA to separate two outcomes</title>
      <link>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/361129#M61058</link>
      <description>&lt;P&gt;Thanks to all who responded, your comments are helpful. I have in fact used all these options in the recent past to analyze the data. I was particularly interested in what an orthogonal discrimination would look like, but if it is not directly possible I'll move on.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Uriel.&lt;/P&gt;</description>
      <pubDate>Sat, 20 Feb 2021 16:12:57 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/PCA-to-separate-two-outcomes/m-p/361129#M61058</guid>
      <dc:creator>utkcito</dc:creator>
      <dc:date>2021-02-20T16:12:57Z</dc:date>
    </item>
  </channel>
</rss>

