Tarek_Zikry
Staff (Retired)

Exploring Model Classification Thresholds

These are instructions on how to download and run the Model Classification Explorer Add-In that I created as a JMP intern this summer with @KarenC and @mia_stephens. This add-in provides a unified dashboard for visualizing model cutoffs and error trade-offs simultaneously. You can interactively change a model threshold and immediately see the results propagated to the performance measures, confusion matrices, and ROC curves.
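
For readers who want to see what "changing a threshold" does to a confusion matrix outside of the add-in, here is a minimal Python sketch. It is not the add-in's code. The six probabilities and outcomes are made up, and the non-target level name "Other" is an assumption for illustration only.

```python
# Hypothetical predicted probabilities of the target level ("Donut")
# and the observed outcomes. These values are invented for illustration.
probs = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
actual = ["Donut", "Donut", "Other", "Donut", "Other", "Other"]


def confusion_at(threshold, probs, actual, target="Donut"):
    """Count TP, FP, FN, TN when rows with prob >= threshold are called target."""
    tp = fp = fn = tn = 0
    for p, a in zip(probs, actual):
        predicted_target = p >= threshold
        is_target = a == target
        if predicted_target and is_target:
            tp += 1
        elif predicted_target and not is_target:
            fp += 1
        elif not predicted_target and is_target:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


def rates(cm):
    """Return (sensitivity, specificity) for a confusion-matrix dict."""
    pos = cm["TP"] + cm["FN"]
    neg = cm["TN"] + cm["FP"]
    sens = cm["TP"] / pos if pos else 0.0
    spec = cm["TN"] / neg if neg else 0.0
    return sens, spec


# Raising the threshold trades false positives for false negatives.
for t in (0.25, 0.50, 0.75):
    cm = confusion_at(t, probs, actual)
    print(t, cm, rates(cm))
```

Each threshold gives one confusion matrix, and each confusion matrix gives one (sensitivity, specificity) pair, which is one point on the ROC curve.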

 

NOTE: This may be slow or freeze up with very large data sets.

 

Step 1: Download the attached .jmpaddin file and example data table.

 

[Image: Data Selection Dialog]

Step 1: Cast the Prob[Donut] column to X and the Consumption column to Y. These two columns are required.

 

Step 2: Set your target level to “Donut,” as that’s what the model is predicting.

 

Step 3: Set your alpha level for the statistical analysis, which defaults to 0.05.

 

Step 4: In the top right, you have the option to choose the Performance Terminology for a given application field. This is purely a visual option that changes the labels of the different measures. In this case, we’ll leave it on “General.”

 

Step 5: The visual accessibility check box changes the graphical output so that you can interpret the results without needing to distinguish colors on the graphs. We’ll leave it off for now.

 

Step 6: With all these initial parameters set, click “OK” to launch the platform. Depending on the size of your dataset and whether or not you used a validation column, this may take a few seconds to launch.

 

[Animation: Interactively Exploring Cutoffs]

Let me know if you have any comments or suggestions! 

 

Comments
StevePowell

This add-in is a great addition to JMP. I'm running 14.1 and it works fine on the Doughnut data. However, it locks up JMP and then crashes on almost any other dataset I choose. Any suggestions as to what may be going wrong?
Thanks.
Steve Powell

Hi Steve,

 

Yes, it turns out that if your dataset is too large (try N < 1000), the add-in may "freeze" JMP. I haven't had a crash, just JMP locking up as you experienced. We are looking into this performance issue. In the meantime, try a random subset of your data of interest to at least get started on exploring thresholds. Stay tuned!

 

Thanks,

Karen

marxx

Thanks for sharing, Tarek, and for the follow-up, Karen. I will also be interested in using this with larger data sets when that is available.

 

Thanks, marxx

Marxx,

 

If you want to try it on a larger data set, try running it when you are going to be away from your computer for a bit, and then see if it has run when you return. I have found that if I wait, it will run. It just has some initial "work" to do to get going.


Karen

 

marxx

Wonderful! I am often looking for a way to dart away from the screen, so will follow your advice and cruise around while I let it spin for a bit.

 

Much appreciated, marxx

tsnow_

I have tried this several times, and it crashes every time. Any alternative?

Hello, 

I am not sure why you are having trouble. Can you provide more information? What version of JMP are you using? Mac or Windows? Is it crashing or hanging? How many rows are in your data table?

Thanks

tsnow_

Hi KarenC. I am using JMP 14 on Windows. I have 11,221 rows. A message that says 'not responding' pops up. Kindly assist.

Thanks

Hi  tsnow,

 

My guess is that the app is having trouble with the amount of data. We have observed issues with a large number of rows; exactly how many would likely depend on machine configuration. However, all is not lost. With that size of data table, I think you could take a random sample (subset it into another table) and then run the app. I would start with, say, 10% and see if the app runs. If it does, then I would look at 4 - 5 different random samples of 10% and see if you are "landing in a similar spot" regardless of sample. If so, you can probably answer your practical question. If not, then I would either run a few more samples and/or try a bigger sample (20%, 30%) until you run into the size issue again.
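
For anyone who wants to try the repeated-sample check outside JMP, here is a rough Python sketch of the idea. Everything in it is an assumption for illustration: the data are simulated, the helper names `best_cutoff` and `sample_cutoffs` are made up, and "landing in a similar spot" is measured by the cutoff that maximizes sensitivity + specificity - 1 (Youden's J), which is just one possible way to pick a cutoff and not what the app does.

```python
import random


def best_cutoff(rows, cutoffs):
    """Return the cutoff that maximizes sensitivity + specificity - 1.

    rows is a list of (probability, is_target) pairs.
    """
    pos = sum(1 for _, y in rows if y)
    neg = len(rows) - pos
    best, best_j = None, float("-inf")
    for c in cutoffs:
        tp = sum(1 for p, y in rows if y and p >= c)
        tn = sum(1 for p, y in rows if not y and p < c)
        sens = tp / pos if pos else 0.0
        spec = tn / neg if neg else 0.0
        j = sens + spec - 1
        if j > best_j:
            best, best_j = c, j
    return best


def sample_cutoffs(rows, fraction=0.10, repeats=5, seed=1):
    """Pick a cutoff on several random samples (each drawn without replacement)."""
    rng = random.Random(seed)
    k = max(1, int(len(rows) * fraction))
    cutoffs = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    return [best_cutoff(rng.sample(rows, k), cutoffs) for _ in range(repeats)]


# Simulated stand-in for a real table of 11,221 scored rows.
sim = random.Random(0)
rows = []
for _ in range(11221):
    y = sim.random() < 0.4
    p = min(1.0, max(0.0, sim.gauss(0.7 if y else 0.3, 0.15)))
    rows.append((p, y))

picks = sample_cutoffs(rows)
print("cutoff per sample:", picks)
print("spread:", max(picks) - min(picks))
```

If the spread across samples is small relative to the decision you need to make, the 10% samples are probably telling a consistent story. If it is large, more or bigger samples are the next step.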


Hope this helps. You could also build the score vs. state plot with all of your data in Graph Builder as a starting place for understanding the distributions and where a cut-point might make sense for your problem.
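
As a rough text-only stand-in for that score vs. state view, the counts behind such a plot can be sketched in Python as below. This is not Graph Builder output. The scores, the level name "Other", and the helper `binned_counts` are hypothetical.

```python
from collections import Counter


def binned_counts(scores, states, n_bins=10):
    """Count rows per (state, score bin) for scores in [0, 1]."""
    counts = Counter()
    for s, state in zip(scores, states):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the top bin
        counts[(state, b)] += 1
    return counts


# Made-up scores and observed states, for illustration only.
scores = [0.95, 0.85, 0.15, 0.05, 1.0, 0.55]
states = ["Donut", "Donut", "Other", "Other", "Donut", "Other"]

counts = binned_counts(scores, states)
for state in sorted(set(states)):
    print(state)
    for b in range(10):
        # One '#' per row whose score falls in this bin.
        print(f"  {b / 10:.1f}-{(b + 1) / 10:.1f} " + "#" * counts[(state, b)])
```

A cut-point is a candidate wherever the two states' bars stop overlapping much.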


Karen

tsnow_

Thank you Karen! It was a success! Again, many thanks!