dale_lehman
Level VII

alternative cut-off confusion matrix

I have used the alternative cut-off probability confusion matrix add-in many times.  Suddenly it is producing nonsensical results on some data sets, although I verified that it still works as expected on an old data set.  Attached is one of the recent data sets on which it does not appear to be working.  The target variable is Stroke, and I have a Validation column and a number of prediction model probability columns, each holding the probability of a stroke and labeled with the name of the modeling method (e.g., see the columns labeled boosted tree and logistic regression for two examples that are not working properly).  Strokes are very rare (around 5%) in the data set, and the probabilities derived from the predictive models are correspondingly small.  But when I use the add-in with a range of cutoff probabilities from 0.1 to 0.9, I keep getting zero predictions of stroke, which does not match the distribution of probabilities from these models (which, as I said, are low, but not zero).  Can anybody shed light on why this is happening?
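One way to cross-check the add-in's output is to tabulate the confusion matrix by hand at each cutoff. The following is a minimal Python sketch, not the add-in's actual logic: it assumes the probability column holds P(Stroke = 1) and that a row is classified as a stroke when its probability is at or above the cutoff (the add-in may use a strict inequality). The function name and the sample values are hypothetical and are not taken from the attached data set.

```python
def confusion_at_cutoff(actual, prob, cutoff):
    """Return (tp, fp, fn, tn), treating prob >= cutoff as a predicted stroke."""
    tp = fp = fn = tn = 0
    for y, p in zip(actual, prob):
        predicted = 1 if p >= cutoff else 0  # assumption: >= rather than >
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Made-up illustrative values: rare strokes, small predicted probabilities.
actual = [0, 0, 0, 0, 1, 0, 1, 0]
prob_stroke = [0.02, 0.05, 0.12, 0.08, 0.35, 0.01, 0.09, 0.15]

for cutoff in [0.1, 0.2, 0.3, 0.4, 0.5]:
    tp, fp, fn, tn = confusion_at_cutoff(actual, prob_stroke, cutoff)
    print(f"cutoff={cutoff}: TP={tp} FP={fp} FN={fn} TN={tn}")
```

If a hand count like this shows predicted strokes at a cutoff where the add-in reports none, the discrepancy lies in how the add-in reads the column rather than in the probabilities themselves.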

Thanks.

5 REPLIES
SDF1
Super User

Re: alternative cut-off confusion matrix

Hi @dale_lehman ,

 

  I can confirm that the alternative cut-off confusion matrix add-in doesn't work on the prediction columns you mentioned in your post. However, I re-ran the Boosted Tree platform using the same predictors and response that you used, and even verified that the equation was the same, and the add-in worked correctly on this new formula. I found the same thing with the nominal logistic fit: if I re-run it (I do end up with a different equation), I can run the alternative cut-off add-in and it works as expected. I'm running JMP Pro 15.2.1. Something might be a bit off with your two fits. Have you checked the ROC curves for the fits to see if they're above the diagonal line? There was no saved script for those two model formulas in your table, so it's hard to assess how well the fits work. Have you tried re-fitting the logistic and boosted tree models to see if the newer equations work? When I run them a second time, the add-in works.
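  For a quick numeric version of the ROC check, the area under the curve can be computed directly from the actual values and a probability column. This is a small Python sketch using the pairwise (Mann-Whitney) definition of AUC; the function name and the sample values are hypothetical. An AUC above 0.5 corresponds to a curve above the diagonal, and an AUC well below 0.5 would suggest the column holds the probability of the other target level.

```python
def auc(actual, prob):
    """Fraction of (stroke, no-stroke) pairs where the stroke row scores higher.

    Ties count as half a win. Equivalent to the area under the ROC curve.
    """
    pos = [p for y, p in zip(actual, prob) if y == 1]
    neg = [p for y, p in zip(actual, prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one row of each target level")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up illustrative values, not from the attached table.
actual = [0, 0, 1, 0, 1]
prob_stroke = [0.02, 0.10, 0.30, 0.05, 0.08]

print(auc(actual, prob_stroke))                   # above 0.5: curve above the diagonal
print(auc(actual, [1 - p for p in prob_stroke]))  # flipped level: below 0.5
```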

 

  On another note, I noticed that you used a random seed in generating your validation column. Since you have very few instances of stroke==1, you might consider making a validation column that is stratified on stroke, so that the training and validation sets each get a proportional share of the stroke cases. Also, when performing the nominal logistic fit, it might be better to model Target Level: 1 rather than 0, since you're probably trying to predict the likelihood of a stroke (stroke==1) rather than the likelihood of no stroke (stroke==0).
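  To illustrate what stratifying on stroke does, here is a minimal Python sketch. It is not how JMP builds its validation column; the function name, the 30% validation fraction, and the seed are all assumptions chosen for the example. The idea is to split each target level separately so both sets keep roughly the same stroke rate.

```python
import random


def stratified_validation(target, validation_fraction=0.3, seed=1234):
    """Label each row "Training" or "Validation", splitting each target level separately."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    labels = ["Training"] * len(target)

    # Group row indices by target level (e.g. stroke == 0 and stroke == 1).
    rows_by_level = {}
    for i, y in enumerate(target):
        rows_by_level.setdefault(y, []).append(i)

    # Within each level, send the same fraction of rows to validation.
    for rows in rows_by_level.values():
        rng.shuffle(rows)
        n_validation = round(len(rows) * validation_fraction)
        for i in rows[:n_validation]:
            labels[i] = "Validation"
    return labels


# Made-up example: 10 strokes among 200 rows (5%).
stroke = [1] * 10 + [0] * 190
labels = stratified_validation(stroke)

n_val_strokes = sum(1 for y, s in zip(stroke, labels) if y == 1 and s == "Validation")
n_val_total = labels.count("Validation")
print(n_val_strokes, n_val_total)
```

  With a purely random split, the handful of stroke rows could by chance land almost entirely in one set; splitting per level avoids that.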

 

  Just some thoughts. Hope this helps.

 

Good luck!

DS

dale_lehman
Level VII

Re: alternative cut-off confusion matrix

Diedrich

Thank you - I would mark your reply as a solution, but I really want to hear if someone can explain why this happened.  Indeed, if I rerun those models, the add-in works as intended, but it did not on the original file I submitted, even though the probabilities produced by those models should have yielded the correct confusion matrices.  So I have no idea why the add-in suddenly stopped working correctly and then worked once again.  I am also running JMP Pro 15.2.1.

I concur with your other point about stratifying the validation set since it is so unbalanced.  I did not do that this time, but I was quite aware of the reason it makes sense to do that.

SDF1
Super User

Re: alternative cut-off confusion matrix

I have no idea why it stopped working; it would be nice if someone could figure that part out. I was able to verify the same behavior you described with the data table you provided. I'm not sure why it worked when re-running the models. I couldn't find any discrepancies in the formulas or anything else that would make it behave oddly.
Jeff_Perkinson
Community Manager

Re: alternative cut-off confusion matrix

I wonder if @Wendy_Murphrey might have some insight about the Alternate Cut-off Confusion Matrix Add-In here.

-Jeff

Re: alternative cut-off confusion matrix

Hi, @dale_lehman .

 

I'm glad to learn that you have found the Alternate Cut-off Confusion Matrix Add-In useful.  It was developed several years ago and was designed around only the predicted (or probability) columns saved from the Partition, Neural (JMP Pro only), or Fit Model reports. This is noted in the description of the Predicted Value column here.  I'll be glad to investigate the behavior you are seeing with other types of predictions and will update this page with my findings.

Wendy