Linear regression with categorical X axis and ANOVA analyses

cchb

Community Trekker

Joined:

Aug 12, 2013

Hola everyone, I am pretty new to JMP Pro and I have a couple of questions:

1. I am having trouble finding out how to do a linear regression when the X axis is categorical. I have a dataset of leaves from 9 age classes (15 to 30 sampled leaves per class, ordered from youngest to oldest: Y1, Y2, Y3, Y/M, M, O, S, O/S1, O/S2), 3 light conditions (SU, SH Mid, SH Low) and 12 trees. Each leaf has 12 physical and chemical traits measured, such as leaf thickness, leaf area, leaf water content and leaf carbon content. I want to find the trend in each leaf trait (Y axis) as the leaves age (X axis), arranged by tree and light condition. I have tried going into the "Analyze" menu and running "Fit Y by X" by tree and light condition, but what I get is a Oneway Analysis of the leaf trait, and I don't know what to do with it.

2. I need to perform one-way and two-way ANOVAs and Tukey HSD tests, but I can't find the platforms for these, and the Help has so far not been very helpful.

I have previously used SPSS to do stats so I am a bit lost as to how to get the above mentioned analyses done in JMP Pro.

Many thanks in advance for the help,

Cecilia

1 ACCEPTED SOLUTION

ms

Super User

Joined:

Jun 23, 2011

Solution

Linear regression, in the sense of estimating the relationship between a dependent stochastic variable and one or more independent variables, is not limited to continuous variables. In regression analysis, categorical variables are typically coded as dummy variables that take the value 0 or 1. It is true that the statistical test procedure is equivalent to ANOVA, but the parameterized model and the estimated prediction formula can also be viewed from a regression perspective. In fact, in JMP the Analysis of Variance report in Fit Model is categorized as one of several "Regression Reports", regardless of whether the independent variables are categorical or not.

Fit Model can be used for both oneway and two-way analyses. Fit Model also allows you to view and save the prediction formula (i.e. the regression function with estimated parameters) and there are different options for parameterization (found under the red triangle -> Estimates...). Tukey HSD is invoked from the red triangle above each independent variable (called "Effects" in JMP).
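
To illustrate the point about dummy variables, here is a minimal sketch in plain Python (not JMP, and not JSL). The data and the level names used as a factor are made up for illustration. It fits a least-squares regression on 0/1 dummies and checks that the result matches a one-way ANOVA: the intercept is the mean of the reference level, each coefficient is a difference in means, and the F statistic is the same either way. JMP's default parameterization may differ from the 0/1 coding assumed here; the fitted means and the F statistic do not depend on that choice.

```python
# Hypothetical data: one leaf trait measured in three leaf age classes.
groups = {
    "Y1": [2.0, 2.4, 2.2],
    "M": [3.1, 2.9, 3.3, 3.1],
    "O": [4.0, 4.4],
}
levels = list(groups)  # the first level ("Y1") acts as the reference level

# Design matrix: an intercept column plus one 0/1 dummy per non-reference level.
X, y = [], []
for level, values in groups.items():
    for value in values:
        X.append([1.0] + [1.0 if level == other else 0.0 for other in levels[1:]])
        y.append(value)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


# Least squares via the normal equations: (X'X) beta = X'y.
p = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * y_i for row, y_i in zip(X, y)) for i in range(p)]
beta = solve(XtX, Xty)

means = {level: sum(v) / len(v) for level, v in groups.items()}
grand_mean = sum(y) / len(y)

# F statistic from the regression fit.
fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
ss_model = sum((f - grand_mean) ** 2 for f in fitted)
ss_error = sum((y_i - f) ** 2 for y_i, f in zip(y, fitted))
f_regression = (ss_model / (p - 1)) / (ss_error / (len(y) - p))

# F statistic from the classical one-way ANOVA decomposition.
ss_between = sum(len(v) * (means[lvl] - grand_mean) ** 2 for lvl, v in groups.items())
ss_within = sum((x - means[lvl]) ** 2 for lvl, v in groups.items() for x in v)
f_anova = (ss_between / (len(groups) - 1)) / (ss_within / (len(y) - len(groups)))

print("intercept (mean of Y1):", beta[0])
print("M minus Y1:", beta[1], " O minus Y1:", beta[2])
print("F (regression):", f_regression, " F (ANOVA):", f_anova)
```

The prediction formula here is simply intercept + coefficient for the observed level, which is the same kind of formula Fit Model lets you save.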

6 REPLIES
paigemiller

Community Trekker

Joined:

Apr 12, 2012

There is no such thing as linear regression with categorical X's, except in the sense that it actually is ANOVA.

One-way ANOVA in JMP is Fit Y by X.

Two-way ANOVA in JMP is Fit Model.

Byron_JMP

Staff

Joined:

Apr 26, 2012

Try using the Fit Model platform (it's under the Analyze menu, below the Fit Y by X platform you've already found).

This platform will perform a multiple linear regression.

It looks like you have multiple X's. These go into the Construct Model Effects box, and your Y's go into the Y box (multiple Y's are OK).

This is a short description of how to set it up:

http://jmp.com/academic/pdf/learning/05_multiple_linear_regression.pdf

There are a lot of other quick guides here:

JMP | Learning Library

paigemiller

Community Trekker

Joined:

Apr 12, 2012


Byron wrote:

"Try using the Fit Model platform (it's under the Analyze menu, below the Fit Y by X platform you've already found). This platform will perform a multiple linear regression."


I wouldn't call it "multiple linear regression" in the case where the original poster said she has categorical X. So I'm not sure the example you give fits.

cchb

Community Trekker

Joined:

Aug 12, 2013

Apologies for not replying to the answers earlier. I am in the process of writing my PhD thesis and am working on several analyses at the same time, so I had to concentrate on a different set for a while. Many thanks for everyone's helpful comments. In the next couple of weeks I will be getting back to these regression and ANOVA analyses and will try the different options mentioned.

Thanks again, Cecilia

cchb

Community Trekker

Joined:

Aug 12, 2013

Further to my original post, I am currently trying to use the Fit Model platform to determine how each of my sampled leaf traits (e.g. leaf thickness, leaf area, leaf water content, leaf carbon content) varies with 3 model effects:

1. tree species (12 different trees)

2. leaf age (9 leaf ages from youngest to oldest: Y1, Y2, Y3, Y/M, M, O, S, O/S1, O/S2)

3. different light conditions (SUn, SHade Mid, SHade Low)

I tried running an analysis using leaf water content as 'Y' and lightCondition*leafAge as model effects, using the 'Cross' option (interactions). However, I got a window saying 'The model is missing an effect', and this made me realise that I had not considered that I have an unbalanced sampling design in 3 different ways:

1. The sample sizes are different for each leaf age, both when all the trees are pooled together and when considered by individual tree.

2. The leaf ages differ between trees: a tree might have only 1 leaf age or up to 6 leaf ages, with each tree having a different combination of the 9 possible leaf ages.

3. All trees have SU leaves, but only 4 trees have SH Mid leaves, and one of these also has SH Low leaves. Furthermore, only the canopy position SH Mid has leaf age O/S2.

Could someone advise me how to deal with this? I tried changing the 'Personality' of the model from 'standard least squares' to 'generalised linear model', with 'distribution' set to 'normal' and the 'link function' set to 'identity' (i.e. no transformation; I think this is correct?). But I still get the 'The model is missing an effect' message, and I think the problem basically comes down to how to deal with missing data. How can I tell JMP Pro to put an 'NA' in all the places where I have missing data? I ignored the 'missing effect' message and ran the analysis anyway, and the results show that JMP Pro seems to be 'filling in' missing data. For example, the results include leaf age O/S2 for the SUn leaves, which does not exist (I think this is the source of the 'model is missing an effect' window), and from looking at the results it seems the internal weighting and balancing are getting messed up because of this. I would be very grateful for any advice on how to deal with this missing data in JMP Pro.

Thanks in advance for the help! Cecilia
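
One way to see exactly which light condition and leaf age combinations were never sampled is to cross-tabulate the two factors before fitting any model. The following is a minimal sketch in plain Python (not JMP); the records are hypothetical and only mimic the pattern described above, where O/S2 occurs only under SH Mid. Whether such empty cells are what triggers the JMP message is not established in this thread, so treat this purely as a diagnostic.

```python
from collections import Counter
from itertools import product

# Hypothetical (light condition, leaf age) pairs, one per sampled leaf.
records = [
    ("SU", "Y1"), ("SU", "Y1"), ("SU", "M"), ("SU", "O"),
    ("SH Mid", "Y1"), ("SH Mid", "M"), ("SH Mid", "O/S2"), ("SH Mid", "O/S2"),
    ("SH Low", "M"),
]

# Number of leaves in each cell of the crossed design.
counts = Counter(records)
lights = sorted({light for light, _ in records})
ages = sorted({age for _, age in records})

# Cells of the full light x age crossing that contain no leaves at all.
empty_cells = [
    (light, age) for light, age in product(lights, ages) if counts[(light, age)] == 0
]

for light in lights:
    row = ", ".join(f"{age}: {counts[(light, age)]}" for age in ages)
    print(f"{light:7s} {row}")
print("Empty cells:", empty_cells)
```

A mean for an empty cell cannot be estimated from the data, so any value a model reports for such a combination is an extrapolation from the other cells and not an observed result.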