multiple variable logistic regression.

harshgoyalbits0

Community Member

Joined:

Jun 30, 2016

I am new to JMP and have two questions about how to approach a problem.

I have attached an xlsx file containing the variables. I am looking for help with these two simple questions.

1. Use the other variables to predict "passion". Passion would first need to be aggregated, say into high/low, and then we can use logistic regression or decision trees.
I don't know how to recode the column into 0 and 1 or how to use multiple variables in a logistic regression. Right now I am getting a bar-graph kind of plot.

2. Do a segmentation of the states based on all of the variables. This ends up creating groups of states with similar "golfiness". For this, we need to use cluster analysis.

Please help.

Thanks

2 REPLIES
Dan_Obermiller

Joined:

Apr 3, 2013

Multiple logistic regression is done with Fit Model. The Y variable needs to have either the nominal or the ordinal modeling type, depending on whether you want nominal or ordinal logistic regression. In the data you provided, passion has 18 different levels or categories, many with only 1 observation. With this many levels the data is too sparse to develop a good logistic regression model. Collapsing the data into fewer categories will certainly improve the modeling effort, but you need to decide how to collapse it into categories that are meaningful to you.
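To make the collapse-then-model idea concrete, here is a minimal sketch in Python rather than JMP (in JMP the fitting itself would go through Fit Model). Everything in it is an assumption for illustration: the median split into high/low, the two made-up predictors x1 and x2, and the simulated passion scores, none of which come from the attached file. The fit uses plain gradient ascent on the log-likelihood, which is a simple stand-in and not necessarily how JMP estimates the model.

```python
import math
import random
import statistics

def binarize_by_median(values):
    """Label each value 1 (high) if it is above the median, else 0 (low)."""
    cutoff = statistics.median(values)
    return [1 if v > cutoff else 0 for v in values]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit an intercept plus one coefficient per predictor by gradient ascent."""
    n, p = len(rows), len(rows[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for x, y in zip(rows, labels):
            err = y - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def predict(w, x):
    """Predict 1 (high) when the fitted probability is at least 0.5."""
    prob = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return 1 if prob >= 0.5 else 0

# Simulated data: 40 hypothetical states with two hypothetical predictors.
rng = random.Random(0)
rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
passion = [2.0 * x1 + 0.5 * x2 + rng.gauss(0, 0.5) for x1, x2 in rows]

high = binarize_by_median(passion)   # collapse the score into high/low
weights = fit_logistic(rows, high)   # several predictors in one model
accuracy = sum(predict(weights, x) == y for x, y in zip(rows, high)) / len(rows)

print("coefficients (intercept, x1, x2):", [round(v, 2) for v in weights])
print("training accuracy:", accuracy)
```

A median split is only one possible cutoff. Any rule that gives categories you can interpret, with enough observations in each, serves the same purpose.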

JMP can also perform a cluster analysis, which is easily available from the Analyze > Multivariate Methods menu. Again, there are several options you will need to choose among, and describing them all would be too lengthy here.
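As a rough picture of what a segmentation does, here is a minimal k-means sketch in Python, not JMP. K-means is just one clustering method picked for illustration, and the six rows of two made-up variables are hypothetical, not values from the attached file. Each column is standardized first so that no variable dominates the distance calculation purely because of its units.

```python
import random
import statistics

def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(row, centers):
    """Index of the center closest to the given row."""
    return min(range(len(centers)), key=lambda c: dist2(row, centers[c]))

def kmeans(rows, k, iters=50, seed=0):
    """Plain k-means: assign rows to the nearest center, then recompute centers."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    for _ in range(iters):
        assign = [nearest(r, centers) for r in rows]
        new_centers = []
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:
                new_centers.append([statistics.mean(col) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's old center
        if new_centers == centers:
            break
        centers = new_centers
    return [nearest(r, centers) for r in rows], centers

# Hypothetical data: six "states" measured on two made-up variables.
raw = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1],
       [8.0, 9.0], [8.3, 9.2], [7.9, 8.8]]
assign, centers = kmeans(standardize(raw), k=2)
print("cluster for each state:", assign)
```

The number of clusters k has to be chosen by the analyst, which is one of the options referred to above.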

Steven_Moore

Super User

Joined:

Jun 4, 2014

I tried your data in the Logistic platform and got disappointing results using several strategies for dividing your data into High, Medium, and Low Passion. However, in the Neural Network platform, I got some very interesting results! You might want to play around with the various parameters until you get a reasonable model. I like to use 3 and 2 nodes, respectively, in the first and second layers for both the TanH and Linear activation types. I got good models with similar r-squared values for the training and validation sets.
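For anyone wondering why similar r-squared values on the training and validation sets matter, here is a small Python sketch of that check. It assumes a continuous response and uses a least-squares line as a stand-in model, not a neural network, and the data are simulated. The point is only the comparison: when the validation r-squared is much lower than the training one, the model is probably overfitting.

```python
import random

def r_squared(actual, predicted):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Simulated data: 60 points, first 40 for training, last 20 held out.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [3.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
x_train, y_train = xs[:40], ys[:40]
x_valid, y_valid = xs[40:], ys[40:]

# Fit on the training rows only, then score both sets with the same model.
slope, intercept = fit_line(x_train, y_train)
r2_train = r_squared(y_train, [intercept + slope * x for x in x_train])
r2_valid = r_squared(y_valid, [intercept + slope * x for x in x_valid])

print("training r-squared:  ", round(r2_train, 3))
print("validation r-squared:", round(r2_valid, 3))
```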

Steve