Gabriel
Level III

Random effect test

Hi everyone, I would appreciate a little clarification or guidance on my experimental design.

My experiment has two main treatments: gas contamination (2 levels) and species (2 species). The experiment is set up in rings: each ring receives one contamination level and contains both species. Each contamination level is replicated in 3 rings, making 6 rings in all in which to assess my parameter of interest.

I guess the basic approach would be to treat Ring as a random effect, but I am unsure how to do this. How can I correctly estimate the effects of gas contamination, species, and days, and their interactions, once I have satisfied my normality assumption?

Some guidance is needed; I have attached a sample Excel data file.

 

Thank you.

Gabriel Mulero
Victor_G
Super User

Re: Random effect test

Hi @Gabriel,

 

From the look of your table and your description, your design looks like a split-plot design, with "Ring" as your whole plot and "contamination" as your hard-to-change factor. "Species" would be an easy-to-change factor, and Dyn is your response, measured at two time points (50 and 56 days; time could be used as a continuous factor in the design).

 

To set up a relevant analysis with a random effect, some column properties need to be set first:

As a random block, "Ring" is entirely tied to the factor "contamination" (each ring receives a single contamination level), and should have two column properties:

  • "Design role" set as "Random Block",
  • "Value Order" set to know the ordering of the rings.

As a hard-to-change factor, "contamination" should have three column properties :

  • "Design role" set as "Categorical",
  • "Factor Changes" set as "Hard",
  • "Value Order" set to know the ordering between low and high levels.

The same properties apply to "species", except that "Factor Changes" should be set to "Easy".

 

For "days", you can either analyze Dyn by day and build two separate models, one for each time point, or use "days" directly as a continuous factor and specify a "factorial to degree 2" model (main effects and two-factor interactions).
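To make the "factorial to degree 2" idea concrete, here is a minimal Python sketch (not JSL, and not taken from the attached data table) that lists the terms such a model would contain for three factors. The factor names are placeholders for the columns discussed above.

```python
from itertools import combinations

# Placeholder factor names standing in for the columns in the data table.
factors = ["contamination", "species", "days"]

# "Factorial to degree 2" = every main effect plus every two-factor interaction.
main_effects = list(factors)
two_factor_interactions = [f"{a}*{b}" for a, b in combinations(factors, 2)]

terms = main_effects + two_factor_interactions
print(terms)
```

With three factors this gives three main effects and three two-factor interactions; the three-factor interaction is left out.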

 

Once the properties are set, go to "Fit Model" and specify your model (you may have to choose the "Mixed Model" personality so that "Ring" can be added and used as a random effect):

[Screenshots: Victor_G_0-1681199699812.png, Victor_G_1-1681199726504.png]


EDIT: You can also fit a Least Squares model instead: enter Ring in the model, then use the red triangle next to "Attributes" to specify it as a Random Effect. This performs the same test as the Mixed Model personality (look at the REML Variance Component Estimates and the Wald p-value to assess whether your random effect is statistically significant). This model has been added to the scripts of the data table.
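As a rough illustration of why Ring matters for the contamination test, the sketch below (plain Python, with made-up numbers that are not from the attached data) averages the response within each ring and compares the three Low rings with the three High rings. In a balanced split-plot layout, the whole-plot factor is in effect judged against ring-to-ring variation, with 6 - 2 = 4 degrees of freedom here, rather than against all 24 individual measurements. This is a simplification for intuition, not a replacement for the REML fit in JMP.

```python
from math import sqrt
from statistics import mean, variance

# Made-up data: ring -> (contamination level, four Dyn values from
# 2 species x 2 days). Each ring receives only one contamination level.
rings = {
    "Ring1": ("Low",  [10.1, 10.9, 12.0, 12.6]),
    "Ring2": ("Low",  [9.4, 10.2, 11.1, 11.9]),
    "Ring3": ("Low",  [10.8, 11.4, 12.5, 13.3]),
    "Ring4": ("High", [12.2, 13.0, 14.1, 14.7]),
    "Ring5": ("High", [11.5, 12.1, 13.6, 14.0]),
    "Ring6": ("High", [12.9, 13.7, 14.8, 15.4]),
}

# Collapse each ring to a single mean: the ring is the experimental unit
# for contamination.
ring_means = {ring: (level, mean(values)) for ring, (level, values) in rings.items()}
low = [m for level, m in ring_means.values() if level == "Low"]
high = [m for level, m in ring_means.values() if level == "High"]

# Pooled two-sample t statistic computed on the ring means.
diff = mean(high) - mean(low)
df = len(low) + len(high) - 2
pooled_var = ((len(low) - 1) * variance(low) + (len(high) - 1) * variance(high)) / df
se = sqrt(pooled_var * (1 / len(low) + 1 / len(high)))
t_stat = diff / se

print(f"High - Low = {diff:.2f}, t = {t_stat:.2f} on {df} df")
```

Species and days vary within each ring, so their effects are compared against the within-ring variation instead; that distinction is what the random Ring effect encodes.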

 

Once you have done the analysis on your own data, and depending on the outcome of the test for the random effect, you may try other models as well (for example Generalized Regression without the random effect if it is not significant, or other modeling platforms).

 

I hope this first answer will help you.
Please find attached the JMP datatable used with all column properties set and the "Mixed Model" script I used on this dataset.

 

Victor GUILLER
L'Oréal Data & Analytics

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
Gabriel
Level III

Re: Random effect test

Thanks @Victor_G

Just before I work through your suggestions: what should be done with the Value Order if my rings, although called Ring1 to Ring6, don't need to follow any particular order?

Gabriel Mulero
Victor_G
Super User

Re: Random effect test

Hi @Gabriel,

 

By default, when you create a split-plot design in JMP, the whole plots are ordered as 1, 2, 3, ... and this information is saved in the column property.

 

As mentioned in the JMP Help (Value Order (jmp.com)) : "In designs created using most DOE platforms, categorical factors, including the constructed factors Whole Plots and Subplots, and blocking factors are assigned the Value Order property. This property orders the levels according to the order in which they appear in the Factors section. The levels of constructed factors are consecutive integers and the Value Order property specifies this natural ordering. You can modify the Value Order specification for any factor to meet your needs."

 

The order itself is not so important (it reflects the way you specified your factors), so you can change it if needed, since all blocks are similar in how the runs are distributed within them.

 

I hope this answer will help you,

Victor GUILLER
Gabriel
Level III

Re: Random effect test

Thank you so much for the very detailed explanation, it helped a lot.

I would like to clarify how Generalized Regression differs from fitting Least Squares directly. I noticed that Generalized Regression lets me explore different estimation methods, but what is the advantage or difference between fitting least squares directly and fitting it through the Generalized Regression platform?

When I run both separately for my parameter of interest, I see some differences and some similarities (see attached). Could you please explain the differences, especially for Cv in the effect tests?

 

 

Gabriel Mulero
Victor_G
Super User

Re: Random effect test

Hi @Gabriel,

 

It seems you only have nominal variables in your model.

I may have found part of the answer to your question about why the least squares estimates differ between the Generalized Regression and Least Squares platforms:

 

"The parameterization of nominal variables used in the Generalized Regression personality differs from their parameterization using other Fit Model personalities. The Generalized Regression personality uses indicator function parameterization. In this parameterization, the estimate that corresponds to the indicator for a level of a nominal variable is an estimate of the difference between the mean response at that level and the mean response at the last level. The last level is the level with the highest value order coding; it is the level whose indicator function is not included in the model."

From: Launch the Generalized Regression Personality

Whereas in the other Fit Model personalities:

"When you enter a column with a nominal modeling type in the Fit Model launch window, JMP represents it internally as a set of continuous indicator variables. Each variable assumes only the values –1, 0, and 1. (Note that this coding is one of many ways to use indicator variables to code nominal variables.) If your nominal column has n levels, then n–1 of these indicator variables are needed to represent it. (The need for n–1 indicator variables relates directly to the fact that the main effect associated with the nominal column has n–1 degrees of freedom.) Full details are covered in Nominal Factors."

From: Statistical Details for Nominal Effects Coding

 

So the difference you see might be linked to this difference in nominal effects coding, which gives different estimates and calculated p-values.
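One way to see how the coding can change what a term means is a small 2 x 2 example with an interaction. The sketch below is plain Python with made-up cell means and hypothetical level names ("Sp1", "Sp2"); it does not reproduce the attached output. Under effect coding, the species term is averaged over both contamination levels. Under indicator coding with the last levels as reference, the same term is the species difference at the reference contamination level only. Both codings describe the same fitted cell means, but when an interaction is in the model the individual terms may answer different questions.

```python
from statistics import mean

# Made-up cell means for a balanced 2 x 2 layout (hypothetical values and labels).
cells = {
    ("High", "Sp1"): 14.0, ("High", "Sp2"): 13.0,
    ("Low", "Sp1"): 10.0, ("Low", "Sp2"): 12.0,
}

grand = mean(cells.values())

# Effect (sum-to-zero) coding: the species term is the Sp1 mean, averaged
# over both contamination levels, minus the grand mean.
sp1_average = mean(v for (contam, sp), v in cells.items() if sp == "Sp1")
species_effect_coding = sp1_average - grand

# Indicator coding with the last levels ("Low", "Sp2") as reference: the
# species term is the Sp1 - Sp2 difference at the reference contamination only.
species_indicator_coding = cells[("Low", "Sp1")] - cells[("Low", "Sp2")]

print("species term, effect coding:   ", species_effect_coding)
print("species term, indicator coding:", species_indicator_coding)
```

In this made-up case the two species terms differ markedly because the species difference changes sign between Low and High. Without an interaction in the model, a two-level term would differ only by a factor of two between the codings.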

 

I hope this answer will help you understand the difference,

Victor GUILLER
Gabriel
Level III

Re: Random effect test

Thank you @Victor_G. I see now.

The bottom line would then be: when should I use which? Is there something specific about one's nominal variables that requires using one over the other, given a normally distributed dataset?

For instance, Cv in this case has to remain nominal, as there is no particular order. CO2 contamination is either high or low, but not in any particular order. Of course, days has to enter the model as a continuous variable.

I am just not clear about this.

 

Thanks.

 

Gabriel Mulero
Victor_G
Super User

Re: Random effect test

Hi @Gabriel,

 

There may be no definitive answer to your question, as it depends heavily on your factors, topic and objectives.

As far as I understand the differences between these two nominal effects codings, if you want to evaluate the levels of a nominal factor by comparing each level to one specific level, then the nominal coding from Generalized Regression might be more appropriate. You can use the "Value Order" column property to specify the order so that it fits your study design. The last level of the factor is the one all other levels are compared to.
For example, with the JMP sample data table "Cholesterol.jmp", I would personally prefer to build the model in the Generalized Regression platform with the Standard Least Squares estimation method, because its nominal coding compares each treatment level to the "Placebo" level (the default, because of alphabetical ordering; by adding a "Value Order" column property I can make another level, such as "Control", the reference). This makes more sense, as we want to know the effects of the various treatments compared to Placebo (or Control), not compared to the average over all levels.

 

But when you want to know how each level shifts the response relative to the average response over all levels, then Standard Least Squares from the Fit Model platform may be more useful.
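The two kinds of comparison can be sketched for a single nominal factor with four levels. The numbers below are made up and are NOT the values in Cholesterol.jmp; only the level names echo that example, and the helper functions are hypothetical. Indicator coding reports each level minus a reference level (the last one in the value order), while effect coding reports each level minus the unweighted average of the level means.

```python
from statistics import mean

# Made-up level means (NOT the values in Cholesterol.jmp).
level_means = {"A": 5.0, "B": 6.0, "Control": 4.0, "Placebo": 3.0}

def indicator_estimates(means, reference):
    """Each level minus the reference level (the reference itself is omitted)."""
    return {lvl: m - means[reference] for lvl, m in means.items() if lvl != reference}

def effect_estimates(means):
    """Each level minus the unweighted average of all level means."""
    grand = mean(means.values())
    return {lvl: m - grand for lvl, m in means.items()}

# Alphabetical ordering makes "Placebo" the last level, hence the default reference.
last_level = sorted(level_means)[-1]
vs_placebo = indicator_estimates(level_means, last_level)

# Changing the value order so that "Control" comes last changes the reference.
vs_control = indicator_estimates(level_means, "Control")

# Effect coding: deviations from the average of the level means. A report
# would typically list only n - 1 of these, the last being implied because
# the deviations sum to zero.
vs_average = effect_estimates(level_means)

print(vs_placebo)
print(vs_control)
print(vs_average)
```

The fitted level means are identical in every case; only the question each estimate answers changes.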

 

Your CO2 contamination seems to be an ordinal factor, since there is a natural order between high and low (high > low). You may therefore specify this factor as ordinal, as this information is taken into account in the ordinal factor coding: https://www.jmp.com/support/help/en/17.1/index.shtml#page/jmp/ordinal-factors.shtml#ww96069

If you specify it as an ordinal effect with only 2 levels, the effect estimate is the difference between the average responses of the two levels (High - Low), whereas with the "normal" nominal coding and 2 levels, the effect estimate is the difference between the average response of one level and the overall average of the two levels (for example: High - Mean(High, Low)).

 

I hope this answer will bring a little more clarity to this complex topic,

Victor GUILLER
Gabriel
Level III

Re: Random effect test

Thank you @Victor_G. I guess in my case CO2 contamination is more of a nominal factor, since high and low are just the environments within which my data are collected. One chamber is given high CO2 contamination and the other is given low, and my measurement of the plants' "A-Dyn" response is taken in the various open chambers.

But just like you said the topic is quite complex to navigate.

Gabriel Mulero