I am studying RNA gene expression in various diseases and am using JMP Pro 14 for modeling. As a negative control, I wanted to model a set of housekeeping genes, whose expression does not vary by condition. Using nominal logistic regression, naive Bayes, and partition, I got consistently negative results (ROC areas of about 0.45–0.55) in both the training and validation sets. But bootstrap forest gave me ROC areas > 0.80 in training and about 0.55 in validation, which seemed odd to me.

So I created a synthetic data table of 100 rows: a Response column (yes or no, N = 50 each), three random-number columns, and a validation column stratified so that half of each response level was assigned to training and half to validation. Again, nominal logistic regression, naive Bayes, and partition gave ROC areas of about 0.5, as expected. But bootstrap forest gave a training ROC area of 0.82 and a validation ROC area of 0.48. I used the bootstrap forest settings suggested in the user documentation and fiddled with them without obvious effect.

Why does bootstrap forest yield a high ROC area on a training set of random numbers? Are my settings incorrect?

Thanks,
Neal
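For anyone who wants to reproduce the effect outside JMP: here is a minimal sketch of the same synthetic experiment, using scikit-learn's RandomForestClassifier as a stand-in for JMP's bootstrap forest (an assumption on my part; the two are closely related ensemble methods but not identical). The seed, column layout, and split scheme are illustrative choices, not anything from my JMP table.

```python
# Sketch of the synthetic experiment: 100 rows, a 50/50 yes/no response,
# three pure-noise predictors, and a stratified 50/50 train/validation split.
# NOTE: scikit-learn's RandomForestClassifier is used here as a stand-in
# for JMP's bootstrap forest (an assumption; they are similar, not identical).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

y = np.repeat([0, 1], 50)             # response: 50 "no", 50 "yes"
X = rng.standard_normal((100, 3))     # three random-number columns

# Stratified split: first 25 rows of each class go to training.
train = np.zeros(100, dtype=bool)
train[:25] = True
train[50:75] = True

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X[train], y[train])

# In-sample (training) ROC area is inflated because each deep tree nearly
# memorizes its bootstrap sample of only 50 rows, even though the
# predictors are pure noise.
train_auc = roc_auc_score(y[train], model.predict_proba(X[train])[:, 1])

# The held-out validation ROC area hovers around chance, as it should.
val_auc = roc_auc_score(y[~train], model.predict_proba(X[~train])[:, 1])

print(f"training AUC:   {train_auc:.2f}")   # inflated, well above 0.5
print(f"validation AUC: {val_auc:.2f}")     # hovers around chance
```

The gap between the two numbers mirrors what I saw in JMP: a fully grown forest scored on the same rows it was fit to will separate even random data, so only the validation ROC area is meaningful.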