## Set Probability Threshold / Confusion Matrix

The default probability threshold is set to 0.5. That can make it difficult to compare models when one has higher specificity and lower sensitivity but both have similar accuracies and positive predictive values. How do we change the probability threshold to compare model performance?

Thanks!

Aaron

Accepted Solution

## Re: Set Probability Threshold / Confusion Matrix

Aaron,
It is unclear which platform you are working in. If you happen to have JMP Pro and are fitting models with Generalized Regression, one of my favorite JMP 13 features is the ability to interactively change the probability threshold and watch the sensitivity and specificity estimates change. But I digress.

To explore various cut-offs for a model, I might save the model's prediction formula to my data table and set up a "model call" column (use the formula: score > cut-off, which yields a column of 0s and 1s), then use that column to evaluate model performance (thinking in the diagnostic-model world).

If you are in the Logistic platform and have ROC curves for your models, the ROC Table will provide a 2x2 table for each possible cut-off in your data set.

Finally, there is an add-in on the File Exchange that will calculate a series of performance measures and confidence intervals for diagnostic-type models.
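To make the "model call" idea concrete, here is a minimal sketch in Python (not JSL, and not tied to any JMP platform): it applies a chosen cut-off to saved probability scores to produce 0/1 calls, tallies the resulting 2x2 confusion matrix, and reports sensitivity and specificity at each cut-off. The function name and the toy scores/labels are made up for illustration.

```python
def confusion_at_cutoff(scores, labels, cutoff):
    """Return (tp, fp, tn, fn) when the model call is 1 if score > cutoff."""
    tp = fp = tn = fn = 0
    for score, y in zip(scores, labels):
        call = 1 if score > cutoff else 0  # the "model call" column
        if call == 1 and y == 1:
            tp += 1
        elif call == 1 and y == 0:
            fp += 1
        elif call == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

# Toy data: saved probability scores and true 0/1 outcomes.
scores = [0.9, 0.8, 0.65, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]

# Sweep a few cut-offs to see the sensitivity/specificity trade-off.
for cutoff in (0.3, 0.5, 0.7):
    tp, fp, tn, fn = confusion_at_cutoff(scores, labels, cutoff)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    print(f"cutoff={cutoff}: sens={sensitivity:.2f}, spec={specificity:.2f}")
```

Comparing two models is then just running the same sweep on each model's score column and lining up the sensitivity/specificity pairs at matching cut-offs, rather than judging both only at 0.5.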

Hopefully something here is helpful to what you are trying to do.