Is there any sample data set for testing Logistic Regression with a large number (10-20) of independent variables?
Mar 24, 2014 1:28 PM
I have a problem with tens of effects (independent variables) and one categorical dependent variable. The probability of the dependent variable taking the value "1" is very small, about 10 in a million. Consequently, I have tens of millions of data points from which I would like to fit a model using Logistic Regression, which would give me the coefficients of the independent variables needed to compute the probability of the dependent variable.
Before I run Logistic Regression and just take whatever coefficients it spits out, I would like to see how well JMP handles a problem of such dimensionality. Is anyone aware of a test data set that I could use to test JMP on a problem of similar size?
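Absent a ready-made data set, one fallback would be to simulate one from a logistic model with known coefficients, so the estimates JMP returns can be compared against the values that generated the data. Below is a minimal Python sketch of that idea, not a definitive recipe. The function names, the coefficient range, the standard-normal predictors, and the CSV layout are all my own assumptions for illustration; nothing here comes from JMP itself.

```python
import csv
import math
import random


def simulate_rare_logistic(n_rows, n_vars=15, base_rate=1e-5, seed=0):
    """Return (intercept, betas, row_iterator) for a simulated logistic model.

    All names and defaults are hypothetical, chosen only for this sketch.
    """
    rng = random.Random(seed)
    # Assumed "true" coefficients, drawn once and kept small so events stay rare.
    betas = [rng.uniform(-0.5, 0.5) for _ in range(n_vars)]
    # Intercept chosen so that P(y = 1) equals base_rate when every predictor is 0.
    intercept = math.log(base_rate / (1.0 - base_rate))

    def rows():
        for _ in range(n_rows):
            # Assumption: independent standard-normal predictors.
            x = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
            eta = intercept + sum(b * xi for b, xi in zip(betas, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            y = 1 if rng.random() < p else 0
            yield x, y

    return intercept, betas, rows()


def write_csv(path, rows, n_vars):
    """Stream simulated rows to a CSV file with columns x1..xN and y."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([f"x{i + 1}" for i in range(n_vars)] + ["y"])
        for x, y in rows:
            writer.writerow([f"{v:.6f}" for v in x] + [y])
```

For example, `intercept, betas, rows = simulate_rare_logistic(20_000_000)` followed by `write_csv("rare_events.csv", rows, 15)` would stream a file of the size described above without holding it in memory (the file name is arbitrary). The intercept fixes the event probability at 10 in a million only when all predictors are zero; the overall event rate in the simulated data will be somewhat higher, so the number of "1" rows should be counted before judging the fit.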