rexneal
Level III

Query about Response Screening

I am studying RNA gene expression in various diseases and am using JMP Pro 14 for modeling. As a control, I want to model a set of housekeeping genes, whose expression does not vary by condition. Using nominal logistic regression, naive Bayes, and partition, I got consistently negative results (ROC areas of about 0.45 to 0.55) in both the training and validation sets. But bootstrap forest gave me high ROC areas (> 0.80) in the training set and about 0.55 in the validation set. This seemed odd to me.

So I created a synthetic data table of 100 rows: Response = yes or no, N = 50 each. I added three random-number columns, plus a validation column stratified so that half of the yes rows and half of the no rows were assigned to training and the other half to validation. Again, nominal logistic regression, naive Bayes, and partition gave ROC areas of about 0.5, as expected. But bootstrap forest gave a training ROC area of 0.82 and a validation ROC area of 0.48.

I used the bootstrap forest settings suggested in the user documentation, and I fiddled with them without obvious effect. Why does bootstrap forest yield a high ROC area for a training set of random numbers? Are my settings incorrect? Thanks. Neal.
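For anyone who wants to try the synthetic experiment outside JMP, here is a minimal Python sketch using only the standard library. It is not JMP's Bootstrap Forest: it is a plain bagged ensemble of unpruned trees, and the function names (`auc`, `grow`, `predict`, `forest_auc`), the seed, and the choice of 50 trees are all assumptions made for illustration. It builds 100 rows (50 yes, 50 no) with three random predictors, holds out half of each class for validation, and compares the ROC area on the training and validation rows.

```python
import random

def auc(scores, labels):
    """ROC area via the rank-sum identity; tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def impurity(part):
    """Size-weighted Gini impurity of one side of a split."""
    p = sum(y for _, y in part) / len(part)
    return len(part) * p * (1 - p)

def grow(rows):
    """Grow an unpruned tree; a leaf is the fraction of 'yes' rows it holds."""
    ys = [y for _, y in rows]
    if len(set(ys)) == 1:
        return sum(ys) / len(ys)
    best = None
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        # Use each distinct value (except the smallest) as a threshold,
        # so both sides of the split are always non-empty.
        for t in values[1:]:
            left = [r for r in rows if r[0][f] < t]
            right = [r for r in rows if r[0][f] >= t]
            score = impurity(left) + impurity(right)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no feature can separate the rows
        return sum(ys) / len(ys)
    _, f, t, left, right = best
    return (f, t, grow(left), grow(right))

def predict(tree, x):
    """Walk from the root to a leaf and return that leaf's 'yes' fraction."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] < t else right
    return tree

rng = random.Random(0)  # arbitrary seed, for repeatability only
# 50 'yes' (1) rows then 50 'no' (0) rows, each with three random predictors.
rows = [([rng.random() for _ in range(3)], y) for y in [1] * 50 + [0] * 50]
train = rows[:25] + rows[50:75]   # half of each class for training
valid = rows[25:50] + rows[75:]   # the other half for validation

# Bagging: every tree is grown on a bootstrap resample of the training rows.
trees = [grow([rng.choice(train) for _ in train]) for _ in range(50)]

def forest_auc(part):
    """ROC area of the averaged tree votes on the given rows."""
    scores = [sum(predict(t, x) for t in trees) / len(trees) for x, _ in part]
    return auc(scores, [y for _, y in part])

train_auc = forest_auc(train)
valid_auc = forest_auc(valid)
print("training ROC area:  ", round(train_auc, 2))
print("validation ROC area:", round(valid_auc, 2))
```

In this sketch each training row falls in the bootstrap sample of most trees, and those trees have effectively memorized it, so scoring the training rows with the full ensemble looks far better than chance even though the predictors are pure noise. The validation rows were never seen by any tree, so their ROC area stays near 0.5. Whether the same mechanism explains the 0.82 reported by JMP depends on JMP's settings (for example, the minimum split size), which this sketch does not reproduce.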

1 REPLY
ian_jmp
Staff

Re: Query about Response Screening

I suggest you send an email to support@jmp.com directing them to this thread. You should also mention the version number of JMP you are using.
