What is ROC and how does one interpret the ROC graph?
Oct 16, 2015 7:40 AM
ROC (receiver operating characteristic) curves are one type of ‘goodness of fit’ curve that is commonly used with a categorical response. An ROC curve shows how well a model separates the levels of the response: for every possible cutoff on the model’s predicted probability, it plots sensitivity (the true positive rate) on the vertical axis against 1 - specificity (the false positive rate) on the horizontal axis. The idea is you would like to catch as many true positives as possible while keeping false positives low, and the ROC curve provides a nice visualization of that trade-off. Generally you’d like to see a curve that ascends VERY quickly and nearly vertically from the origin towards the upper left corner of the plotting area, then bends ‘quickly’ to the right into a long flat line running along the top of the charting area.
A common plotting convention is to place a 45 degree line from the graph origin to the upper right corner of the charting area. This 45 degree line is interpreted as the ‘flip a coin’ line. That is, if the ROC curve isn’t far from this 45 degree line, you might as well flip a coin to categorize a binary response. So the further above the 45 degree line the ROC curve trace is, the better the model discriminates between the response levels (better than flipping a coin).
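To make the construction of the curve concrete, here is a minimal Python sketch of how the points on an ROC curve can be computed by sweeping a cutoff over predicted scores. This is only an illustration, not how JMP computes it internally; the function name `roc_points` and the small `labels`/`scores` data set are made up for the example.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per cutoff.

    labels: 1 for the positive level, 0 for the negative level.
    scores: the model's predicted probability of the positive level.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cutoff above every score: nothing called positive
    # Sweep the cutoff from the highest score down to the lowest.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical data: six observations, three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)

for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

Each printed pair is one point on the curve. The trace starts at the origin and ends in the upper right corner, and the faster it climbs before moving right, the better the scores separate the two levels.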
A common statistic that accompanies ROC curves is the AUC (area under the curve), which gives the fraction of the plotting area that lies ‘under the curve’. The idea is that if the model perfectly classified all responses, the AUC would be 1 (100%), while a model sitting on the ‘flip a coin’ line has an AUC of 0.5. AUCs are one measure that is often used to compare one model to another. The idea, again, is that the model with the higher AUC does a better overall job of separating the response levels than models with lower AUCs.
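As a rough illustration of the AUC, the sketch below applies the trapezoid rule to a list of (false positive rate, true positive rate) points. The function name `auc` and the three example curves are assumptions made up for this post, not output from JMP.

```python
def auc(points):
    """Trapezoid-rule area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical curves, each running from the origin to the upper right corner.
perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]    # straight up, then across
coin_flip = [(0.0, 0.0), (1.0, 1.0)]              # the 45 degree line
example = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]

print(auc(perfect))    # 1.0
print(auc(coin_flip))  # 0.5
print(auc(example))    # 0.75
```

The perfect classifier fills the whole plotting area, the 45 degree line covers half of it, and a model in between lands somewhere between 0.5 and 1.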
You’ll see ROC curves deployed widely within many JMP and JMP Pro modeling platforms and personalities whenever there is a categorical response. That includes nominal logistic regression, Partition (Decision Tree, Bootstrap Forest, Boosted Tree, etc.), Generalized Regression, and PLS. Pretty much whenever there is a categorical/nominal response, you’ll see the option to plot an ROC curve in JMP or JMP Pro.