Solved: Re: Understanding the meaning of a lift curve

Jun 8, 2023 5:41 PM

I'm trying to understand the meaning of a lift curve. If it starts very high (>4) but drops to 0 after .1 does that mean there is no predictive power for 90% of the population?

dale_lehman · Nov 1, 2021 08:51 AM

A picture would allow a more thorough answer, but I think you are largely correct. If the lift curve is 4 at an x value of 0.1, then targeting the 10% of the observations with the highest predicted probabilities (of whatever the target variable is) picks up 40% of the actual positive occurrences, since 4 × 10% = 40%. If the curve drops after that, then the lower-ranked observations don't pick up many more occurrences. However, I believe it should only drop towards 1, not 0. When you get to 100% of the highest probabilities (i.e., all of the observations), you must have 100% of the occurrences, which is a lift of exactly 1. A lift curve value of 0 doesn't make any sense to me.
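
To make the arithmetic concrete, here is a minimal Python sketch of a cumulative lift calculation. The function name `cumulative_lift` and the toy data are my own assumptions for illustration, not anything from JMP; the data are contrived so that the top 10% of cases gives a lift of 4.

```python
def cumulative_lift(y_true, y_score, fraction):
    """Lift when targeting the top `fraction` of cases ranked by score.

    Hypothetical helper for illustration; assumes 0/1 outcomes.
    """
    # Rank outcomes from highest to lowest predicted probability.
    pairs = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    ranked = [y for _, y in pairs]
    n_top = max(1, round(fraction * len(ranked)))
    rate_top = sum(ranked[:n_top]) / n_top   # positive rate among targeted cases
    rate_all = sum(ranked) / len(ranked)     # marginal (baseline) positive rate
    return rate_top / rate_all


# Toy data (made up): 20 cases, 5 positives, so the marginal rate is 25%.
y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_score = [(20 - i) / 20 for i in range(20)]  # distinct scores, highest first

lift_10 = cumulative_lift(y_true, y_score, 0.10)    # top 2 cases, both positive
lift_100 = cumulative_lift(y_true, y_score, 1.00)   # every case targeted
captured_10 = 0.10 * lift_10                        # share of positives in top 10%
```

With these made-up numbers the top 10% holds 2 of the 5 positives, so the lift there is 4 and 40% of the positives are captured, while targeting everyone brings the lift back to 1.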

Mark_Bailey · Nov 1, 2021 01:17 PM

To clarify @dale_lehman's example, lift is a factor. A lift of 4 means that 4 times as many targets were predicted by the model, conditional on the predictors, as would be predicted by the marginal probability alone. This slide from our copyrighted JMP training materials provides an example:

The reduced model is the marginal probability, 0.1 in this case. The full model is conditioned on the predictors or factors.

Yes, it only drops to 1 because it is a factor (a ratio), comparing the targets found under the model to the targets expected from the marginal probability.

Also, the x axis is always measured from the top of the ranking, which is the left end of the axis. So a value of 0.4 means the top 40% of the cases ranked by predicted probability.

Note that there are different definitions of lift in use.
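
For reference, the cumulative definition used in this thread can be written as below. The notation is my own shorthand, not taken from the JMP materials: x is the fraction of cases targeted, ranked from highest to lowest predicted probability.

```latex
\mathrm{lift}(x)
  = \frac{P(\text{target} \mid \text{case is in the top } x)}{P(\text{target})},
\qquad
\text{fraction of targets captured at } x \;=\; x \cdot \mathrm{lift}(x),
\qquad
\mathrm{lift}(1) = 1 .
```

Because the captured fraction x · lift(x) can never decrease as x grows, a cumulative lift that starts above 4 cannot later reach 0. A per-bin (non-cumulative) lift, by contrast, can be 0 in a bin that contains no targets, which may be one way a plotted curve ends up at 0.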

dale_lehman · Nov 1, 2021 01:31 PM

Thanks for the clarification. It makes me wonder about something. Take a point near the top of the lift curve, e.g., at 0.1 on the x axis; the y value there is the multiplier (factor). Let's say the lift is 4.0 at an x value of 10%. If we were to draw the curve xy = 40% on the graph, it would represent no additional lift beyond what the top 10% of the probabilities predicted. For example, the xy = 40% curve would show a lift of 2.0 at x = 20%. To the extent that the lift curve at x = 20% lies above a y value of 2.0, the model is adding predictive value beyond the highest 10% of the probabilities. It seems to me that the area between the xy = constant curve and the lift curve provides some sort of measure of the lift over the range of x values (though it occurs to me that at x = 100%, this xy = constant curve would go below 1.0, to 0.4 in my example). I would think there is some way to manipulate these areas into a type of measure, akin to the AUC. Just wondering: do you know of anything along these lines?
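
As a rough sketch of the area idea above (not an established measure that I know of), the calculation could look like this in Python. The function name `area_above_reference` and the lift values are hypothetical, chosen only so that the lift is 4.0 at x = 0.1 as in my example.

```python
def area_above_reference(xs, lifts, x0):
    """Area between a lift curve and the curve x*y = x0*lift(x0), for x >= x0.

    Hypothetical helper: uses the trapezoid rule on the given points, so the
    result is only approximate on a coarse grid.
    """
    points = [(x, y) for x, y in zip(xs, lifts) if x >= x0]
    x_anchor, y_anchor = points[0]
    c = x_anchor * y_anchor  # fraction of positives already captured at the anchor
    # Gap between the lift curve and the "no additional lift" curve y = c / x.
    gaps = [(x, y - c / x) for x, y in points]
    area = 0.0
    for (xa, ga), (xb, gb) in zip(gaps, gaps[1:]):
        area += 0.5 * (ga + gb) * (xb - xa)
    return c, gaps, area


# Made-up cumulative lift curve, for illustration only.
xs = [0.1, 0.2, 0.4, 1.0]
lifts = [4.0, 3.0, 2.0, 1.0]

c, gaps, area = area_above_reference(xs, lifts, x0=0.1)
reference_at_20 = c / 0.2    # reference curve at x = 20%
reference_at_100 = c / 1.0   # reference curve at x = 100%
```

Here the reference curve gives 2.0 at x = 20% and 0.4 at x = 100%, matching the numbers in my example. Since xy is the fraction of positives captured, the xy = constant curve corresponds to a gains curve that stays flat after the anchor point.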