TCM (Level IV)

How to do Model Selection from Contingency Analysis

My domain is Consumer Research.  I have 7 explanatory variables and 1 response variable.  All are categorical.

My objective is to replace the response variable with one or a combination of the explanatory variables.

My approach:

  • Perform contingency analyses of each of the explanatory variables with the response.
  • From the contingency analysis, select the explanatory variable with the highest R-square (U).  Because the numbers of levels differ across variables, I don’t think the Likelihood Ratio or Pearson chi-square values would be helpful for model selection.
  • From the Measures of Association table, use Lambda and Uncertainty values.  Choose the explanatory variable with the highest values.
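
For reference, the association measures named in the list above can be computed directly from a two-way table of counts. The following is a minimal sketch, not JMP's implementation: the function names and the example counts are hypothetical, rows are assumed to be levels of the explanatory variable, and columns are assumed to be levels of the response. In the summary results of this post, Rsq(U) tracks the first asymmetric uncertainty coefficient, which is the quantity the first function computes.

```python
from math import log

def entropy(counts):
    """Shannon entropy (natural log) of a list of non-negative counts."""
    total = sum(counts)
    return -sum(c / total * log(c / total) for c in counts if c > 0)

def uncertainty_coefficient(table):
    """U(column | row): share of the column variable's entropy that is
    explained by knowing the row variable. Assumes the column variable
    has at least two observed levels (otherwise its entropy is zero)."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    h_col = entropy(col_totals)
    # Conditional entropy H(col | row) = sum over rows of p(row) * H(col | row)
    h_cond = sum(sum(row) / n * entropy(row) for row in table if sum(row) > 0)
    return (h_col - h_cond) / h_col

def goodman_kruskal_lambda(table):
    """Lambda(column | row): proportional reduction in errors when predicting
    the modal column level within each row instead of the overall mode."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    best_marginal = max(col_totals)
    best_conditional = sum(max(row) for row in table)
    return (best_conditional - best_marginal) / (n - best_marginal)

# Hypothetical counts: 3 explanatory levels (rows) by 2 response levels (columns).
example = [[30, 10], [20, 20], [5, 35]]
print(uncertainty_coefficient(example), goodman_kruskal_lambda(example))
```

Both measures are scaled to the range 0 to 1, which is why they are easier to compare across variables with different numbers of levels than the raw chi-square statistics. Note that neither one penalizes a variable for having many levels.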

Below is a summary describing the variables and results of contingency analysis.

Variable  # levels  Rsq(U)  LR chi-sq  Pearson chi-sq  Lambda asym. (C|R, R|C)  Lambda sym.  Uncertainty coef. (C|R, R|C)  Uncertainty coef. (sym.)
A         4         .08     248        268             .08, .13                 .1           .08, .08                      .08
B         3         .02     76         75              .06, .02                 .04          .02, .03                      .03
C         10        .04     136        153             .05, .04                 .045         .04, .03                      .034
D         10        .31     961        1207            .33, .13                 .22          .3, .2                        .245
E         18        .34     1056       1498            .34, .1                  .21          .34, .19                      .24
F         40        .345    1084       1590            .34, .05                 .17          .35, .15                      .21
G         6         .32     1000       1286            .35, .28                 .31          .32, .26                      .29

The response variable has 6 levels.

The levels of variables D, E, F, and G are ordered by ascending intensity.  It is assumed the low and high boundaries are similar.

Questions:

  1. Is the approach as outlined valid?
  2. Can I combine one of the variables D, E, F, or G (I am inclined to select G) with one or more of A, B, and C to get a better model (i.e., a better replacement for the response)?  If so, how might one do this, and what metrics might be used to select the best model?
Accepted Solution

Re: How to do Model Selection from Contingency Analysis

The two-way contingency table analysis is valid in its own right, but it is not sufficient for your purpose. Logistic regression using a linear predictor that combines all the variables will satisfy your need better.

 

Start with the JMP online documentation on logistic regression to learn what logistic regression is, how to set up your data, how to launch the analysis platform, and which results are available to answer your questions.
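
To make the idea of a linear predictor that combines several categorical variables concrete, here is a sketch of the baseline-category (nominal) logistic model for a 6-level response. The notation (S, z_j, beta, p) is introduced here for illustration and does not come from the JMP documentation. It assumes indicator (dummy) coding and main effects only, and AIC is shown as one common criterion for comparing candidate models, not the only one.

```latex
% Baseline-category logistic model for a response Y with K = 6 levels.
% S is the set of explanatory variables included in the candidate model;
% z_j(x) is the vector of indicator variables coding the level of variable j.
\log \frac{P(Y = k \mid x)}{P(Y = K \mid x)}
  = \beta_{k0} + \sum_{j \in S} \beta_{kj}^{\top} z_j(x),
  \qquad k = 1, \dots, K - 1

% Comparing candidate sets S: \hat{\ell} is the maximized log-likelihood
% and p is the number of estimated parameters in the model.
\mathrm{AIC} = -2\,\hat{\ell} + 2p
```

Under these assumptions, a variable with m levels contributes (m - 1)(K - 1) parameters. G (6 levels) would add 5 x 5 = 25 parameters, while F (40 levels) would add 39 x 5 = 195, so a penalized criterion can favor G even though F has a slightly higher Rsq(U) on its own.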

TCM (Level IV)

Re: How to do Model Selection from Contingency Analysis

Thank you, Mark!
I have used Logistic Regression in the past but only with binary responses (e.g., Stable/Unstable). This instance would be a great learning opportunity. Will get right to it!