SaraA
Level III

Reporting DOE results for publication

Hi, 

I am writing up a manuscript that includes Design of Experiments (Plackett-Burman DOE and Full Factorial DOE). 

However, I am unsure what is important to report in the publications to be as transparent as possible. 

 

So far, I have included the model metrics (R2, R2 adjusted, PRESS, model p-value, lack of fit p-value) as well as the significant terms and interactions and their p-value. I believe reporting parameter estimates for the factors/interactions does not make sense when you have a regression model that includes significant interaction terms (since these parameter estimates change due to interactions). In that case, it is just best to use the prediction profiler. However, the prediction profiler is very difficult to include for publication. So how can I report the results of the factors and their interactions in a meaningful way? 

 

I would appreciate any advice. 

Thank you

Sara 

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Reporting DOE results for publication

Hi @SaraA,

 

Unfortunately, it can be very hard to know how to write up DoE work for publication because there is so much variation in the quality of what appears in academic publications - I've got a few thoughts on this from my own experiences of publishing DoE works and I'm sure others will add on.

 

The most important rule: Assume your readers know nothing about DoE or statistics!

Even though it's a popular tool for experimentation, a lot of readers may not have been introduced to it, which means you need to use tools like visualisation to convey what a DoE is doing and to help the readers understand. This is where a lot of authors fall down, with my biggest pet peeve being authors who include the whole formula in the paper: does that really convey the results of the DoE? Similarly, things like parameter estimates (as you've mentioned) don't really mean much when you've got other tools like your Pareto Plots.

 

I'm going to grab a few examples of how I showed my results to convey some of the ways to show your DoE (shameless plug: you can also find the article here).

 

Ben_BarrIngh_0-1752221100233.png

 

Setting the scene - use simple visualisations to show the structure of the DoE and how the points are being explored - you can do this with Graph Builder or the Scatterplot Matrix when you have loads of factors - this helps readers understand that the DoE is a structured approach to experimentation. Similarly, having a table where you show the coding of your factors (if you're using -, 0, + and axials (a, A)) will really help.
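The factor coding table mentioned above can be sketched in a few lines. This is a minimal illustration in Python rather than JMP, and the factor names and ranges are hypothetical (only Nitrogen is named later in this thread). It builds a two-level full factorial with one centre point and lists each run as a pattern string next to its coded and real-unit settings.

```python
from itertools import product

# Hypothetical factors with (low, high) settings in real units.
factors = {
    "Temperature": (20.0, 40.0),
    "pH": (5.0, 8.0),
    "Nitrogen": (1.0, 5.0),
}

def full_factorial(n_factors):
    """All 2^k combinations of the coded levels -1 and +1."""
    return list(product((-1, 1), repeat=n_factors))

def decode(coded, low, high):
    """Map a coded level in [-1, +1] back to real units."""
    centre = (high + low) / 2
    half_range = (high - low) / 2
    return centre + coded * half_range

def pattern(run):
    """Pattern string: '-' for low, '0' for centre, '+' for high."""
    return "".join({-1: "-", 0: "0", 1: "+"}[level] for level in run)

design = full_factorial(len(factors))
design.append((0, 0, 0))  # one centre point

rows = []
for run in design:
    real = [decode(c, *factors[name]) for c, name in zip(run, factors)]
    rows.append((pattern(run), run, real))

# Print the coding table: pattern, coded levels, real-unit settings.
for pat, run, real in rows:
    print(pat, run, real)
```

A table like this in the methods section (or supplementary information) lets readers map a label such as '+-+' back to the actual settings.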

 

Ben_BarrIngh_1-1752221250885.png

 

Showing what you achieved - before you jump into the regression model and the greater complexities of the DoE approach, you should highlight the real values that you've got. A simple plot in Fit Y by X using something like a Tukey's test is a really clear way to show 'Look how much more stuff I made' using the DoE approach - one of the great strengths of DoE is that it's a structured way of exploring an experimental region, and the modelling approach afterwards is another plus! In the example above you can see my '00a' is a lot more productive than the other points - in reality I could stop there because I've got what I wanted!

Ben_BarrIngh_3-1752221420098.png

 

 

Introducing the model - as you've mentioned, you're showing the basic information required to prove that your model is sufficient - depending on your target journal, that should be more than enough to prove that you've built a good model and you might not need to dive deeper (unless you're targeting a very statsy journal). If you have material like your studentised residuals and actual by predicted plot - make sure to include them, but maybe in your supplementary information.

 

Ben_BarrIngh_2-1752221362492.png

Introducing the modelling concept - The Pareto Plot is a great way for a reader to look and really quickly pull apart the significance of your terms and also a great way to introduce more complex concepts like quadratic curvature (X1*X1) and interactions (X1*X2) - the reader can quickly see that 'Blue Line is good' and understand where terms have less importance - this is a great point to link this into your understanding of the system, for example, Nitrogen is highly important to production because of the reliance of the organism used in this paper on it for growth. This helps to show that you're not just running a statistical experimentation method, but that you're using it to understand your system better.
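The ordering idea behind a Pareto-style plot can be sketched as follows. This is a rough illustration in Python, not JMP's implementation: the responses are hypothetical, the design is an unreplicated two-level full factorial in coded units, and terms are ranked by the absolute size of their estimates (effect size), not by p-value. Because the coded columns are orthogonal, each estimate is simply the column-weighted average of the response.

```python
from itertools import combinations, product
from math import prod

names = ["X1", "X2", "X3"]
runs = list(product((-1, 1), repeat=3))  # coded 2^3 full factorial

# Hypothetical responses, one per run, in the same order as `runs`.
y = [12.0, 14.0, 11.0, 13.0, 20.0, 30.0, 19.0, 29.0]

def term_columns(runs, names, max_order=2):
    """Coded columns for main effects and interactions up to max_order."""
    cols = {}
    for order in range(1, max_order + 1):
        for idx in combinations(range(len(names)), order):
            label = "*".join(names[i] for i in idx)
            cols[label] = [prod(run[i] for i in idx) for run in runs]
    return cols

def estimates(cols, y):
    """Least-squares coefficients; valid because the columns are orthogonal."""
    n = len(y)
    return {
        label: sum(c * yi for c, yi in zip(col, y)) / n
        for label, col in cols.items()
    }

est = estimates(term_columns(runs, names), y)

# Largest absolute estimate first, as in a Pareto-style bar chart.
ranked = sorted(est.items(), key=lambda kv: abs(kv[1]), reverse=True)

for label, value in ranked:
    bar = "#" * round(abs(value) * 4)
    print(f"{label:<6} {value:+6.2f} {bar}")
```

With replicated runs or a reduced model, the same ranking could be done on t-ratios instead of raw estimates; that step is left out here to keep the sketch short.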

 

Ben_BarrIngh_4-1752221640031.png

Ben_BarrIngh_5-1752221658912.png

 

 

Combine the many ways to show your surface - Personally, I love the Prediction profiler and it can be a great way to show off your system and how it operates, but as you mentioned it can be stunted when you have to present it statically - if you have a simple system, you can show different shots of the prediction profiler with different settings. In my case, I showed the prediction profiler at the optimum (which showed the shape of my factors) and combined it with the Contour Plots to give more understanding to my system. As a tip - if you set your Contour Plot to the same factor settings as is in your Prediction Profiler, the area where the tall grid 'intersects' with the surface (which I added a white highlight to) reflects the surface on the Prediction Profiler!

 

Consider sharing your results in JMP Public - More and more publications are providing access to the data that formed the results sections (which is great!), I personally really liked publishing my results to JMP Public and including it in my articles - this gives the readers a chance to actually play with the Prediction Profilers and download the available data (here as an example of how I shared my results).

 

I hope this helps you out and good luck with the review stage!

 

Thanks,

Ben

 

 

 

“All models are wrong, but some are useful”

SaraA
Level III

Re: Reporting DOE results for publication

@Ben_BarrIngh 

Where can I find more information on how to make the first 2 graphs you are showing here? 

Re: Reporting DOE results for publication

Hi @SaraA,

For the first one I used a graph builder to plot the points, then I used PowerPoint to draw the shape of the squares and circles in the same colour.

For the second one, you can use Fit Y by X where the X is your 'Pattern' and Y is your response - then I did a means comparison with a Tukeys test to display the comparison circles.

Hope that helps!
Ben
“All models are wrong, but some are useful”
SaraA
Level III

Re: Reporting DOE results for publication

Hi Ben, 

 

Thank you very much. One last question: how can I generate a scaled estimates plot after REML analysis? JMP does not generate this plot for mixed model analysis somehow (although it does provide the parameter estimates, but I think the scaled estimates or Pareto plot is more intuitive). 

 

Thank you

Sara

Re: Reporting DOE results for publication

Hi @SaraA ,

 

Because of the nature of the method, REML focuses on variance components and (unlike methods like Least Squares) does not perform standardisation (which is a pre-processing step in the workflow for modelling) - there's a lot of additional complexity with mixed models and trying to standardize them (do you standardise globally or within groups? how do you standardize a variable that is different between two groups?).

 

Thanks,

Ben

“All models are wrong, but some are useful”
Victor_G
Super User

Re: Reporting DOE results for publication

Hi @SaraA,

 

It's difficult to help you without knowing which journal or audience your publication is intended for. Also, what is your objective with this publication, and how confidential or open-source can it be?

 

Depending on the responses, the best situation you can provide to the readers is to give them (in annexes) the whole dataset, with all inputs and responses, as well as predicted responses, and any processing or decision you may have taken on your data (transformations, handling or exclusion of outliers, etc ...). This way, you're giving all the info so that your findings and work can be reproducible.

 

About your comment : 


@SaraA wrote:

I believe reporting parameter estimates for the factors/interactions does not make sense when you have a regression model that includes significant interaction terms (since these parameter estimates change due to interactions).


I strongly disagree. Parameter estimates help give you practical importance (effect size) of the factors on your response(s). Statistical significance is not enough to understand if a model is adequate, reliable and practically useful. You need both to interpret the model and results.
See Which one to define effect size: Logworth or Scaled Estimates ? discussion for more explanation about the difference between practical and statistical significances. 

On a side note, it is also good practice to ensure the reproducibility of your modeling results.

 

About the Prediction Profiler, you could maybe provide an interactive HTML file or embed the Profiler on a web page (website, or JMP Live) if you want to share it.

I would still consider providing useful "static" visualisations of your models, like effect size plot (in bar chart), heatmap (if you want to see the influence of your factors on several responses), correlation maps (for your responses), surface plot, scatterplots... There are many things you can do with JMP to visualize your model results.

 

Hope this answer may help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
SaraA
Level III

Re: Reporting DOE results for publication

@Victor_G 

Say you have an interaction between X1 and X2 in your model and you report the parameter estimates for X1 and X2. Because of the interaction, these estimates (i.e. slope coefficients) can change, for example from a positive slope to a negative slope, depending on the level of the other factor with which it interacts. Then why would you report the parameter estimates?

Victor_G
Super User

Re: Reporting DOE results for publication

I don't understand your point.
The interaction effect still has the same coefficient value no matter the levels of X1 and X2 (if these two factors are continuous).

If one (or several) of your factors is categorical/ordinal, you can provide the interaction effect value for each categorical level by displaying the Expanded Estimates of your model :

Victor_G_0-1752228217118.png

 

Hope this clarifies my point,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
SaraA
Level III

Re: Reporting DOE results for publication

@Victor_G 

Sure, the interaction term has the same coefficient value no matter the levels of X1 and X2, but the coefficient values of the main terms (X1 and X2) will have different values, depending on the levels of the other factor, right? 
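A small worked example may help separate the two things being discussed here. It is only a sketch under stated assumptions: a coded 2x2 full factorial with hypothetical responses and the model y = b0 + b1*X1 + b2*X2 + b12*X1*X2. The fitted coefficients are fixed numbers, while the slope of y with respect to X1 at a given X2 is b1 + b12*X2, and that conditional slope is what changes (and can flip sign) with the level of X2.

```python
# Coded 2x2 full factorial with hypothetical responses.
runs = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
y = [10.0, 20.0, 14.0, 8.0]
n = len(y)

# The coded columns are orthogonal, so each least-squares coefficient
# is the column-weighted average of the response.
b0 = sum(y) / n
b1 = sum(x1 * yi for (x1, _), yi in zip(runs, y)) / n
b2 = sum(x2 * yi for (_, x2), yi in zip(runs, y)) / n
b12 = sum(x1 * x2 * yi for (x1, x2), yi in zip(runs, y)) / n

def slope_x1(x2):
    """Slope of the fitted model with respect to X1 at a fixed X2."""
    return b1 + b12 * x2

print("coefficients:", b0, b1, b2, b12)
for x2 in (-1, 0, 1):
    print(f"slope of X1 at X2={x2:+d}: {slope_x1(x2):+.1f}")
```

In this example the reported estimate b1 is the slope of X1 at the centre of X2 (X2 = 0 in coded units), which for a balanced design is also the average slope over the design. Reporting b1 together with b12 therefore lets a reader reconstruct the slope at any level of X2.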
