Nayimoni
Level I

Predictive Modeling with asymmetric cost / imbalanced data set

Hello,

 

How can I instruct the predictive modeling platforms in JMP Pro (partition, random forests, etc.) to assign a higher cost to misclassifying the minority class? I have a highly imbalanced dataset with a high cost for false negatives (readmission of acute patients). When I use the classifiers, the AUC is low and all positive cases are misclassified as negative.

 

Please advise on the tools available to address this issue.

 

Thanks

10 REPLIES
dale_lehman
Level VII

Re: Predictive Modeling with asymmetric cost / imbalanced data set

As has been pointed out, improving the AUC and incorporating asymmetric costs/benefits are two different things. The AUC measures how well the model separates the classes across all possible cutoffs, while the profit matrix steers the classification errors towards where they do the least damage. If the AUC is not bad but the classifications are making costly mistakes (like failing to predict any of the 1 values), then you should try a different cutoff probability for the classification. That is what the profit matrix does automatically, but you can also change the cutoff probability by hand to alter the misclassifications. If you still don't like what you get, then you want a better model, one that will produce a higher AUC. Nothing about the misclassification costs will raise the AUC; only a better model can do that. So, if you need a better model, try other techniques, transform your variables, change the model settings, etc., but don't use the profit matrix to try to improve the model itself.
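
To make the distinction concrete, here is a rough sketch in Python rather than JMP (it is not JSL and does not call any JMP platform). The data are synthetic, the function names are made up for illustration, and the 1:4 cost ratio is an assumption. The cost-based cutoff uses the standard decision rule "predict 1 when p > cost_fp / (cost_fp + cost_fn)", which assumes reasonably calibrated probabilities; it is meant as an analogue of what a profit matrix does, not a description of JMP's internals.

```python
import random

def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion(labels, scores, cutoff):
    """Return (tp, fn, fp, tn) when predicting 1 for scores at or above the cutoff."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    return tp, fn, fp, tn

def cost_based_cutoff(cost_fp, cost_fn):
    """Cutoff that minimizes expected cost, assuming calibrated probabilities."""
    return cost_fp / (cost_fp + cost_fn)

random.seed(0)
# Synthetic, imbalanced data: roughly 10% positives (assumption for illustration).
labels = [1 if random.random() < 0.1 else 0 for _ in range(2000)]
# Hypothetical model output: positives score higher on average, but rarely above 0.5.
scores = [min(max(random.gauss(0.30 if y else 0.10, 0.08), 0.0), 1.0) for y in labels]

cutoff = cost_based_cutoff(cost_fp=1, cost_fn=4)  # assumed costs: a miss is 4x as bad
default = confusion(labels, scores, 0.5)
tuned = confusion(labels, scores, cutoff)

print("AUC (does not depend on any cutoff):", round(auc(labels, scores), 3))
print("cutoff 0.5 -> (tp, fn, fp, tn):", default)
print("cutoff", cutoff, "-> (tp, fn, fp, tn):", tuned)
```

The AUC is computed from the scores alone, so it is identical for both cutoffs. Only the split between false negatives and false positives moves, which is the point made above: costs change where the errors land, not how good the model is.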