SerranoS
Level I

LostDFs for 2-Way Anova

Y = Standardized test growth

X = AP Student, NonAP Student, White, Black, Hispanic, Asian, 2+Race, Pacific Islander, Native American.

 

I needed to turn on JMP's "Exclude missing data" feature so that blank cells are ignored.

 

Each of the X's was inserted and crossed using Macros > Full Factorial. The image shows the result. Why are there LostDFs?

 

My data set is large, roughly 8000 rows. My columns are set up so that students are labeled AP in the AP column (if they are not AP, the cell is blank). I set up the racial and ethnic categories the same way. I want to test the effect of race and ethnicity on score growth for my two populations (AP and NonAP).

Accepted Solutions

Re: LostDFs for 2-Way Anova

See the Singularity Details at the top of the window. They are displayed first. Your model is over-specified. The data do not support estimating all the terms in this model. It is not a matter of sufficient N. It is a matter of combinations of factor levels.
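To make "over-specified" concrete, here is a minimal Python sketch. It is not JMP's actual computation (JMP uses its own effect coding); it assumes a hypothetical six-student table with an intercept plus separate AP and NonAP indicator columns. Because AP + NonAP equals the intercept in every row, the three columns carry only two columns' worth of information, and the gap shows up as one lost degree of freedom. The helper `matrix_rank` and the data are made up for illustration.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below position `rank` with a nonzero entry here.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column adds no new information
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical students: True means the student is marked AP.
is_ap = [True, True, False, False, True, False]

# Columns: intercept, AP indicator, NonAP indicator.
design = [[1, int(ap), int(not ap)] for ap in is_ap]

n_columns = len(design[0])
rank = matrix_rank(design)
lost_df = n_columns - rank

print(f"columns={n_columns}, rank={rank}, lost DFs={lost_df}")
```

Adding more students does not change the result, which is why this is not a matter of sufficient N.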

SerranoS
Level I

Re: LostDFs for 2-Way Anova

Can you please explain what you mean when you say my model is "over-specified"? I am not a statistician. Is it a matter of fixing how my data is structured in my spreadsheet? I have separate columns for each of those X and Y variables.

 

Right now I have two separate columns, one for AP students and a separate one for NonAP students. Is this wrong? Should I have just one column, coded either AP or NonAP?

statman
Super User

Re: LostDFs for 2-Way Anova

It would be helpful if you attached your data table. It is possible the data table is not organized properly. An over-specified model means you have too many terms in the model for the amount of information in the data table. Perhaps you have higher-order terms (e.g., interactions or non-linear terms) and you don't have those combinations in the data set.
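One way to check for missing combinations is to count how many rows fall in each Placement x Race cell. The sketch below is only an illustration in Python with made-up records; the factor names and level labels are assumptions based on this thread. In a two-factor full factorial, every empty cell is a combination the interaction term cannot estimate.

```python
from collections import Counter
from itertools import product

placements = ["AP", "NonAP"]
races = ["White", "Black", "Hispanic", "Asian", "2+Race",
         "Pacific Islander", "Native American"]

# Hypothetical (placement, race) pairs, one per student.
records = [
    ("AP", "White"), ("NonAP", "White"), ("AP", "Asian"),
    ("NonAP", "Black"), ("NonAP", "Hispanic"), ("AP", "Hispanic"),
]

cell_counts = Counter(records)

# Every combination the full factorial model expects to see.
all_cells = list(product(placements, races))
empty_cells = [cell for cell in all_cells if cell_counts[cell] == 0]

print(f"{len(all_cells)} cells expected, {len(empty_cells)} empty")
for placement, race in empty_cells:
    print(f"  no data for {placement} x {race}")
```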

 

It appears you have 2 factors: 

X1: Placement at 2 levels

X2: Race at 7 levels

 

You should have 3 columns: Y, X1 and X2
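As a rough sketch of that three-column layout, the Python below collapses one wide row (a blank-or-filled cell per category) into Y, X1 and X2. The column names `Growth`, `AP`, `Placement` and `Race` are hypothetical, and the code assumes a blank AP cell means NonAP and that exactly one race column is filled per student. Neither assumption has been confirmed for the real table.

```python
RACE_COLUMNS = ["White", "Black", "Hispanic", "Asian", "2+Race",
                "Pacific Islander", "Native American"]

def is_marked(value):
    """A cell counts as marked if it is not missing and not blank."""
    return value is not None and str(value).strip() != ""

def to_long(row):
    """Collapse one wide row into the three columns Y, X1, X2."""
    # Assumption: a blank AP cell means NonAP, not a missing value.
    placement = "AP" if is_marked(row.get("AP")) else "NonAP"
    marked = [name for name in RACE_COLUMNS if is_marked(row.get(name))]
    if len(marked) != 1:
        raise ValueError(f"expected exactly one race column marked, found {marked}")
    return {"Growth": row["Growth"], "Placement": placement, "Race": marked[0]}

# Hypothetical wide rows in the style described in the question.
wide_rows = [
    {"Growth": 12.5, "AP": "AP", "White": "White"},
    {"Growth": 8.0, "AP": "", "Hispanic": "Hispanic"},
]

long_rows = [to_long(row) for row in wide_rows]
for row in long_rows:
    print(row)
```

Rows that raise the `ValueError` are worth inspecting by hand, since they have either no race recorded or more than one column filled.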

 

 

"All models are wrong, some are useful" G.E.P. Box
dlehman1
Level IV

Re: LostDFs for 2-Way Anova

It does appear that you have too many variables in your model. For example, are AP and NonAP the same information but inverted? It seems that anybody who is not marked AP is probably marked NonAP, which would cause a singularity if you put both variables in the model. It also looks like some of your demographic groups may overlap the AP columns in ways that make the information redundant. If you attach a sample of the data table, that would be clearer.
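A quick way to test the "same information but inverted" idea is to check whether exactly one of the two cells is filled in every row. This is a small Python sketch with made-up column values, assuming blanks are stored as empty strings or missing values; the function names are illustrative.

```python
def is_marked(value):
    """A cell counts as marked if it is not missing and not blank."""
    return value is not None and str(value).strip() != ""

def are_complements(col_a, col_b):
    """True if, in every row, exactly one of the two cells is marked."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of rows")
    return all(is_marked(a) != is_marked(b) for a, b in zip(col_a, col_b))

# Hypothetical AP and NonAP columns.
ap_col = ["AP", "", "AP", None, ""]
non_ap_col = ["", "NonAP", "", "NonAP", "NonAP"]

print(are_complements(ap_col, non_ap_col))
```

If the check returns True, the two columns carry the same information and only one of them belongs in the model.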