Abbiegraham
Community Trekker

Multivariate dataset

Hello,

 

I was wondering if I could get some help analysing the attached data set using JMP (not JMP Pro).

 

A management change was applied across a group of 59 different farms. It occurred at the end of May, so April and May were under the old regime ("Treatment 1") and July and August were under the new regime ("Treatment 2").

 

We want to know if this change significantly affected bodyweight, egg production, deviation from target bodyweight and deviation from target egg production, both overall and in two specific time periods (30-45 weeks and 45+ weeks). We would like to know the separate and interactive effects of:

 

Age, breed, treatment, farm "issue"

 

Please note the following:

  • Each row is an average value for a farm
  • Not every farm has bodyweight data available
  • Some farms were measured both before and after the regime change; for others, performance was only measured after the regime change
  • The birds in Treatment 2 will be older, and age affects performance.
  • Breed is not balanced, i.e. there are far fewer replicates in Breed Type 4
  • We would like to know the effect of the "issue farm" status

 

Any help would be appreciated

 

Thanks in advance

Abbie

KarenC
Super User

Re: Multivariate dataset

Hi Abbie,

You should start by doing some "data cleaning". You need to make sure that your numeric data is stored in numeric columns (not as character data). See https://www.jmp.com/support/help/en/15.0/?os=mac&source=application#page/jmp/view-or-change-column-i... for help. Then I would start with Graph Builder and explore your data (https://www.jmp.com/support/help/en/15.0/?os=mac&source=application#page/jmp/graph-builder-2.shtml#). After that you could move on to specific tests for differences if needed.

Karen
phil_kay
Staff

Re: Multivariate dataset

Just to add that you can change the data type and modelling type for multiple columns at once.

 

Select all relevant columns (in this case the nominal ones that should be numeric/continuous)

Cols > Standardize Attributes

 

Then go to Attributes and select Data Type > Numeric and Modeling Type > Continuous.
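
For anyone doing the same clean-up outside JMP, here is a minimal Python sketch of the idea. It is not JSL and it is not what Standardize Attributes does internally. The helper names and the sample rows are hypothetical: the farm code and bird ages echo values quoted in this thread, but the egg production values are made up for illustration.

```python
def to_numeric(value):
    """Return the value as a float, or None when it is blank or not a number."""
    text = str(value).strip().replace(",", "")
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def convert_columns(rows, columns):
    """Return new row dicts with the named columns converted to numbers."""
    return [
        {key: (to_numeric(val) if key in columns else val) for key, val in row.items()}
        for row in rows
    ]


# Hypothetical rows: numbers that were read in as character data.
rows = [
    {"Farm": "V3_0", "Bird Age": "57", "Egg Production %": "91.5"},
    {"Farm": "V3_0", "Bird Age": "58", "Egg Production %": ""},
    {"Farm": "V3_0", "Bird Age": "59", "Egg Production %": "n/a"},
]
cleaned = convert_columns(rows, {"Bird Age", "Egg Production %"})
print(cleaned)
```

Entries that cannot be read as numbers become missing (None) rather than raising an error, so it is worth checking afterwards how many values were lost in each column.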

phil_kay
Staff

Re: Multivariate dataset

Also a good idea to use Cols > Recode to add a "No" value where Problem Farm is not "Yes" (instead of just missing value). Assuming that might be a variable of interest, you will want to have both "Yes" and "No" values to use it.
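
The same recode can be sketched in a few lines of Python, in case it helps to see the logic written out. This is only an illustration of the rule described above (anything that is not "Yes" becomes "No"); the function name and the sample values are hypothetical.

```python
def recode_problem_farm(value):
    """Return "Yes" for a yes entry and "No" for anything else, including missing."""
    text = (value or "").strip().lower()
    return "Yes" if text == "yes" else "No"


# Hypothetical column values: a mix of "Yes", blanks and missing entries.
problem_farm = ["Yes", "", None, "yes ", "Yes"]
recoded = [recode_problem_farm(v) for v in problem_farm]
print(recoded)
```

Note that this treats every non-"Yes" entry as "No", which assumes that a blank really means "no known issue" rather than "not recorded".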
phil_kay
Staff

Re: Multivariate dataset

I'm a bit confused because you said that each row is a farm but I extracted just the data on 1 farm (anonymised as "V3_0"):

 

Treatment    Month  Farm  House Number  Problem Farm  Bird Age  Breed
Treatment 1  April  V3_0  H2            no            57        Lohmann
Treatment 1  April  V3_0  H2            no            58        Lohmann
Treatment 1  April  V3_0  H2            no            59        Lohmann
Treatment 1  April  V3_0  H2            no            60        Lohmann
Treatment 1  May    V3_0  H2            no            61        Lohmann
Treatment 1  May    V3_0  H2            no            62        Lohmann
Treatment 1  May    V3_0  H2            no            63        Lohmann
Treatment 1  May    V3_0  H2            no            64        Lohmann
Treatment 2  July   V3_0  H2            no            70        Lohmann
Treatment 2  July   V3_0  H2            no            71        Lohmann
Treatment 2  July   V3_0  H2            no            72        Lohmann
Treatment 2  July   V3_0  H2            no            73        Lohmann

 

I can see multiple rows for the same farm, month and house. It does seem that bird age is different though. So is each row a bird? Or a group of birds of the same age?

phil_kay
Staff

Re: Multivariate dataset

So now I understand that each row is for 1 week within each month.

 

I think the best approach is to fit a model of response (I have looked only at Egg Production % but the same would go for any other responses) versus Bird Age (you know that this has an effect on your responses so you would consider this a "covariate") and Month. If there is a significant difference between April, May (treatment 1) and July, August (treatment 2) then that is consistent with there being an effect of Treatment.
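
To show what "adjusting for a covariate" means here, this is a rough Python sketch of an ordinary least-squares fit of a response on Bird Age plus a period indicator. It is a deliberate simplification and not the JMP model: it collapses Month into a single 0/1 indicator (April/May vs July/August) and ignores differences between farms. The bird ages are the ones quoted for farm V3_0 earlier in the thread; the response values are invented from a known formula purely so that the fit can be checked, and all function and variable names are hypothetical.

```python
def fit_least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Clear this column in every other row.
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Bird ages (weeks) as listed for farm V3_0; 0 = April/May, 1 = July/August.
ages = [57, 58, 59, 60, 61, 62, 63, 64, 70, 71, 72, 73]
new_regime = [0] * 8 + [1] * 4

# Made-up, noise-free response: declines 0.5 per week of age, shifts -2 in the later period.
egg_pct = [95.0 - 0.5 * a - 2.0 * t for a, t in zip(ages, new_regime)]

# Design matrix columns: intercept, bird age, period indicator.
X = [[1.0, float(a), float(t)] for a, t in zip(ages, new_regime)]
intercept, age_slope, period_shift = fit_least_squares(X, egg_pct)
print(intercept, age_slope, period_shift)
```

The point of the sketch is that the period coefficient is estimated after the age trend has been accounted for, so older birds in the later months do not by themselves produce a period difference.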

 

However... important caveat!...

 

This would not prove that Treatment is an important effect. The observed effect could also be due to some seasonal effect (e.g. egg production is generally lower in July and August because it is hotter). From this data there is no way to separate Treatment from a seasonal effect or some other uncontrolled change between May and July. You would need data covering multiple years to understand the seasonal effect.

 

With all that said, I looked at fitting models. I also added a random effect for farm to account for uncontrolled differences between farms. (I anonymised Farm.) Some farms have multiple houses, and there could be uncontrolled variation between these, so I added a random effect of house, nested within farm (House [Farm]).

 

Nested random effects example: https://www.jmp.com/support/help/14-2/two-factor-nested-random-effects-model.shtml

 

I also added Problem Farm and Breed as factors to estimate the effects of these potentially important factors.

 

Models with both fixed and random effects are known as "mixed" models. Using JMP Pro I also explored a more complex mixed model that allows random variation of the effect of bird age between farms.

 

The attached data table has both models as scripts. It will only be possible to run the more complex Mixed Model ("Fit Mixed") in JMP Pro. 

 

The standard mixed model does indicate a significant effect of Month: egg production is lower in July and August vs April, May.

 

The more complex mixed model is a better fit. It does not change the conclusion about the effect of Month. However, in this model Breed and Problem Farm are not significant.

 

...I hope all this helps. Mixed modeling is not an easy method to use.

 

In addition to these modeling approaches I would also encourage using exploratory data analysis to look at the quality of the data. Some of the recorded results for egg production are very different from the others (I also added a Distribution analysis script on the table). It would be a good idea to check that these results are correct.
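
As one way of picking out such values for checking, here is a small Python sketch using the common 1.5 x IQR rule. This is an assumption about how one might screen the data, not a description of what the Distribution script on the table does, and the egg production values below are invented for illustration.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical egg production percentages with one suspicious entry.
egg_pct = [91.0, 92.5, 90.8, 93.1, 91.7, 92.2, 45.0, 90.5]
suspicious = flag_outliers(egg_pct)
print(suspicious)
```

A flagged value is only a prompt to go back to the farm records; it may be a genuine result rather than a data entry error.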

 
