learning_JSL
Level IV

Trying to improve multivariate (continuous data) regression fit by also including ordinal data

Hi - Is it possible or advisable to try to improve the fit of my multiple linear regression model by also incorporating an ordinal variable? My current model contains 2 continuous explanatory variables and one continuous response variable. I know that in the real world the ordinal variable DOES matter and clearly affects the response variable...I'm just not sure how to capture that effect.

For context, I am modeling (i.e. predicting) how E. coli (response variable) is affected by turbidity (continuous), rainfall in certain nearby locations (continuous), and other variables that did not improve the model fit. I know from observations, however, that the period when the river level is increasing (referred to as rising stage) typically has higher E. coli due to initial washoff from the watershed, often more than even at peak stage.

So these instances of rising stage should, ideally, be captured in the model, and it seems that the best way to do this is to use an ordinal variable (e.g. rapid rise = 2, slow rise = 1, no rise, very slow rise, or fall = 0). So far so good. But I'm not sure how to incorporate this into my predictive model of E. coli vs. continuous and ordinal variables. Any suggestions or direction on this? Thanks in advance.

6 REPLIES
SDF1
Super User

Re: Trying to improve multivariate (continuous data) regression fit by also including ordinal data

Hi @learning_JSL ,

 

  What version of JMP or JMP Pro are you running?

 

  You can incorporate the ordinal data without a problem. You can even include crossed terms of rising stage with turbidity or rainfall. Just add it in the dialogue window like you normally would. You can practice using the Big Class.jmp data table, using age as an ordinal variable -- I just did it with the PLS model personality to predict "height" based on age, sex, weight and age*weight, just as an example. 
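
As a rough illustration of what "adding the ordinal variable plus crossed terms" means for the model matrix, here is a minimal Python sketch. It is not what JMP does internally (JMP's own coding of ordinal effects may differ from plain indicator columns), and the function name, the 0/1/2 stage codes, and the example values are assumptions taken from or invented around the original question.

```python
# Stage codes as described in the question (assumed here):
# 0 = no rise / very slow rise / fall, 1 = slow rise, 2 = rapid rise.
STAGE_LEVELS = (0, 1, 2)

def design_row(turbidity, rainfall, stage):
    """Build one row of a regression design matrix.

    Columns: intercept, turbidity, rainfall, indicator for slow rise,
    indicator for rapid rise (stage 0 is the baseline level), and the
    two stage-by-turbidity crossed terms.
    """
    if stage not in STAGE_LEVELS:
        raise ValueError(f"unknown stage code: {stage}")
    is_slow = 1.0 if stage == 1 else 0.0
    is_rapid = 1.0 if stage == 2 else 0.0
    return [
        1.0,                   # intercept
        turbidity,             # continuous
        rainfall,              # continuous
        is_slow,               # stage == 1
        is_rapid,              # stage == 2
        is_slow * turbidity,   # crossed term: slow rise x turbidity
        is_rapid * turbidity,  # crossed term: rapid rise x turbidity
    ]

# Made-up observations, for illustration only.
rows = [design_row(12.0, 0.4, 0), design_row(35.0, 1.2, 2)]
```

Each stage level other than the baseline gets its own coefficient, so the model does not have to assume that "rapid rise" has exactly twice the effect of "slow rise".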

 

  That being said, if the data that you're modeling weren't from a DOE (which it doesn't sound like they were), you'll want to use some kind of cross validation technique to make sure that you're not overfitting your data. To me, it sounds like you're trying to predict the levels of E. coli in waterways depending on a number of other factors. Sounds like a cool problem to work on and of big public health importance.
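
For readers unfamiliar with the idea, the following is a minimal sketch of k-fold cross validation in plain Python, using a one-predictor least-squares line as a stand-in model. The helper names and the synthetic data are assumptions for illustration; JMP has its own validation options, which this does not reproduce.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def kfold_rmse(xs, ys, k=5, seed=0):
    """Out-of-fold RMSE: every point is predicted by a model that
    was fit without it."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    squared_error = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_error += sum((ys[i] - (b0 + b1 * xs[i])) ** 2 for i in held_out)
    return (squared_error / len(xs)) ** 0.5

# Synthetic, noise-free data (not real measurements).
xs = [float(x) for x in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
cv_error = kfold_rmse(xs, ys)
```

If the cross-validated error is much larger than the error on the data used for fitting, the model is probably overfitting.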

 

  If you have Pro, you'll want to run models with different platforms to see which one generates the best predictive model on data that you haven't used to train the model (assuming this is what you ultimately want to do). You'll want to try things in GenReg, decision trees, and even neural nets. You can then compare the models on the unused data and see which works best as a prediction model.
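
The comparison step can be sketched generically: score each already-fitted candidate on rows that were held back from training and keep the one with the lowest error. The two "models" below are hypothetical placeholder functions with invented coefficients, and the holdout values are made up; only the comparison logic is the point.

```python
def rmse(predict, rows):
    """Root mean squared error of `predict` on (x, y) rows that were
    not used to fit the model."""
    return (sum((y - predict(x)) ** 2 for x, y in rows) / len(rows)) ** 0.5

def best_model(candidates, holdout):
    """Return (name of lowest-RMSE candidate, dict of all scores)."""
    scores = {name: rmse(predict, holdout) for name, predict in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fitted models standing in for the output of different platforms.
candidates = {
    "mean_only": lambda x: 10.0,
    "linear": lambda x: 1.0 + 2.0 * x,
}

# Made-up holdout rows (x, y), never seen during fitting.
holdout = [(3.0, 7.2), (5.0, 10.9), (8.0, 17.1)]

winner, scores = best_model(candidates, holdout)
```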

 

Hope this helps. Good luck!

DS

learning_JSL
Level IV

Re: Trying to improve multivariate (continuous data) regression fit by also including ordinal data

Thanks very much for the input Diedrich.  I'll give it a try.  By the way, I am using JMP version 12.1.

statman
Super User

Re: Trying to improve multivariate (continuous data) regression fit by also including ordinal data

Why don't you measure the actual rate of rise/fall as a continuous variable instead of categorizing it?

"All models are wrong, some are useful" G.E.P. Box
learning_JSL
Level IV

Re: Trying to improve multivariate (continuous data) regression fit by also including ordinal data

That's actually a good idea and a better measure than what I am using.  Are you thinking delta rise / delta time?     

statman
Super User

Re: Trying to improve multivariate (continuous data) regression fit by also including ordinal data

I'm not exactly sure of your particular situation, but yes, delta change over time.  A continuous variable generally carries more information than a categorized version of it, so it will usually be more useful and require less data.
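
A minimal sketch of the "delta rise / delta time" calculation, assuming the stage readings come as timestamped pairs sorted by time. The function name, the per-hour units, and the example readings are assumptions, not values from the thread.

```python
from datetime import datetime

def rise_rates(readings):
    """Rate of stage change between consecutive readings.

    readings: list of (datetime, stage) pairs sorted by time.
    Returns stage units per hour; positive = rising, negative = falling.
    """
    rates = []
    for (t0, s0), (t1, s1) in zip(readings, readings[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        if hours <= 0:
            raise ValueError("readings must be strictly increasing in time")
        rates.append((s1 - s0) / hours)
    return rates

# Made-up readings, for illustration only.
readings = [
    (datetime(2020, 1, 1, 6, 0), 1.00),
    (datetime(2020, 1, 1, 8, 0), 1.50),
    (datetime(2020, 1, 1, 9, 0), 1.25),
]
rates = rise_rates(readings)
```

The resulting column can then go into the regression as an ordinary continuous predictor in place of the 0/1/2 stage code.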

learning_JSL
Level IV

Re: Trying to improve multivariate (continuous data) regression fit by also including ordinal data

Thanks Statman.  I'll give it a try.