BayesKnight
New Member

Verifying Random Run Order and Design Validity in JMP DOE

Hi everyone,

I’m currently working with a Design of Experiments (DOE) created in JMP, and I’m interested in evaluating the quality of the randomized run order. Although JMP automatically randomizes the run sequence during design generation, I would like to objectively assess how effective that randomization is.

In particular, I’d like to know whether JMP provides built‑in functionality or established workflows to:

  • detect trends, systematic patterns, or drift across the run sequence,
  • identify clustering or non‑uniform distribution of factor levels,
  • perform formal statistical tests of randomness (e.g., runs tests, autocorrelation or independence tests),
  • or otherwise quantify how closely the run order approximates true randomization.

If JMP does not offer a direct method for this type of assessment, are there recommended JSL scripts, add‑ins, or best‑practice approaches for evaluating randomness in DOE execution order?
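
For concreteness, the kind of pre-execution check I have in mind looks something like the sketch below (written in Python rather than JSL; the levels vector is made up and stands in for one factor's coded column of the design table, in proposed run order):

```python
import numpy as np
from scipy.stats import norm

def runs_test(x):
    """Wald-Wolfowitz runs test on a two-level (or dichotomized) sequence."""
    x = np.asarray(x, dtype=float)
    signs = x > np.median(x)                     # dichotomize around the median
    n1, n2 = signs.sum(), (~signs).sum()
    runs = 1 + np.sum(signs[1:] != signs[:-1])   # number of runs in the sequence
    mu = 2 * n1 * n2 / (n1 + n2) + 1             # expected runs under randomness
    var = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (runs - mu) / np.sqrt(var)               # normal approx; rough for small n
    return z, 2 * norm.sf(abs(z))                # two-sided p-value

levels = np.array([-1, 1, 1, -1, 1, -1, -1, 1])    # hypothetical factor, run order
z, p = runs_test(levels)
lag1 = np.corrcoef(levels[:-1], levels[1:])[0, 1]  # lag-1 autocorrelation
print(f"runs test: z = {z:+.2f}, p = {p:.3f}; lag-1 r = {lag1:+.2f}")
```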

Additionally, I am interested in whether JMP provides tools to verify that the integrity and statistical properties of the experimental design are preserved after manually reordering the runs.

Thank you in advance for your insights.

BayesKnight

5 REPLIES
Victor_G
Super User

Re: Verifying Random Run Order and Design Validity in JMP DOE

Hi @BayesKnight,

Welcome to the Community!

I see at least two options in the Fit Model platform that can help you detect response trends caused by a possible lack of randomization or by time sensitivity:

  • Graphical option: Plot Residual by Row. This plot can help you detect patterns that result from the row ordering of the observations.
  • Statistical testing option: the Durbin-Watson test, a statistic that tests whether the residuals have first-order autocorrelation. It is only appropriate if the rows are in time order (experiments done in the same order as listed in the data table); a hand-computed sketch follows this list.
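
For reference, the Durbin-Watson statistic JMP reports can be sketched by hand as below (a Python illustration outside JMP; the resid values are made up, whereas in practice you would take the residuals from Fit Model in run order):

```python
import numpy as np

def durbin_watson(resid):
    """DW statistic: values near 2 mean no first-order autocorrelation."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

resid = np.array([0.4, -0.1, 0.3, -0.5, 0.2, 0.1, -0.3, -0.1])  # made-up residuals
print(f"DW = {durbin_watson(resid):.2f}")
```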

You also have other graphical and statistical options: Control Charts (to assess whether the distribution of factor levels follows a trend, with the help of control charts and Westgard rules) and the Time Series platform. You can also check through Hierarchical Clustering that the clustering of points is not imbalanced and does not create ordered groups of rows. Finally, you can check whether there are significant correlations between the factors and the row order using the Multivariate platform; a minimal sketch of that last check follows.
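
Here is what the factor-versus-row-order correlation check could look like outside JMP (a Python sketch; the two-factor, 8-run design values are made up for illustration):

```python
import numpy as np

factors = {                      # made-up 8-run, two-factor coded design
    "X1": np.array([-1, 1, 1, -1, 1, -1, -1, 1]),
    "X2": np.array([1, 1, -1, -1, -1, 1, 1, -1]),
}
order = np.arange(1, 9)          # run index 1..8
for name, col in factors.items():
    r = np.corrcoef(col, order)[0, 1]
    print(f"{name} vs run order: r = {r:+.2f}")  # |r| near 0 is what you want
```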

The design evaluation in the Evaluate Design platform does not take the ordering of the rows into account. If you suspect a possible time trend when running your design and would like to make the distribution of factor levels robust to it, you can check the following posts:

https://community.jmp.com/t5/R-D-Blog/How-to-create-an-experiment-design-that-is-robust-to-a-linear/...

Covariates in defined order in custom design 

Hope this answer helps,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
BayesKnight
New Member

Re: Verifying Random Run Order and Design Validity in JMP DOE

Hi @Victor_G,

Thank you very much for the warm welcome and for the information provided. I would like to clarify that my primary interest is in assessing the quality of the randomized run order before the experiment is executed. Once responses are collected, it is too late to address potential issues caused by non-random sequencing. Is there a way to evaluate the run order at that earlier stage?

Re: Verifying Random Run Order and Design Validity in JMP DOE

From your original post, you wanted to "quantify how closely the run order approximates true randomization".

You tell me what true randomization looks like and I could probably find a way to quantify how close I am to it.

Ultimately, randomization of the DOE runs is like an insurance policy against unknown or unexpected error sources that could be related to time or to the order in which the experiments are conducted. Those error sources, if present, would manifest themselves in the response data that is collected, which is why @Victor_G provided that insight.

As long as a reputable random number generator is used (and JMP does have reputable random number generators), any random pattern is typically appropriate. Specific situations may indicate that it is NOT, but you would need to know about those specific situations in advance.

For example, suppose a piece of equipment will always make a mistake on the fourth item that is produced. In that situation, a valid random pattern just might put one of the factors at the high setting every fourth time. For an 8-run design, that would be very plausible (see the simulation sketch below). But it would certainly influence the results, and you would not know it until conducting the analysis and then, most importantly, VERIFYING the results. I have seen something similar happen in the real world in spite of a "valid" random pattern.
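
To put a rough number on that plausibility, here is a small simulation sketch (my own illustration, assuming a balanced two-level factor in an 8-run design):

```python
import numpy as np

rng = np.random.default_rng(1)
levels = np.array([1, 1, 1, 1, -1, -1, -1, -1])  # balanced factor, 8 runs
n_sim = 100_000
hits = sum(
    perm[3] == 1 and perm[7] == 1                # high level on runs 4 and 8
    for perm in (rng.permutation(levels) for _ in range(n_sim))
)
print(f"P(high on every fourth run) ~ {hits / n_sim:.3f}")  # exact: (4/8)*(3/7) ~ 0.214
```

So roughly one random order in five puts the high level on both the fourth and eighth runs, which is exactly the kind of coincidence randomization cannot rule out.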

Finally, the statistical analysis and properties that you ask about assume that the error terms from the model are independent and identically distributed. Manually reordering the runs of a DOE should not affect that. In fact, you do not even need to randomize, if you know that each run is truly independent. Most people randomize to act as an insurance policy against those unknown error sources (as mentioned above). If manually moving design runs affects the independence, then your design (and analysis) should take those time features into account.

I hope this information helps. And remember, all experiments should be verified and even the sequence 1, 2, 3, 4, 5 is a possible random pattern! 

Dan Obermiller
statman
Super User

Re: Verifying Random Run Order and Design Validity in JMP DOE

I offer a different perspective. There are plenty of "techniques" to evaluate the effectiveness of the experimental strategy post experiment; I suggest spending more time on planning instead. I believe randomization in experimentation is a technique to prevent some unidentified, untested factor (I'll call this noise) from being confounded with a factor in the experiment. I think we can be more effective. So where does that noise effect go? Do you want to know about that noise effect? Using randomization prevents assignment (Shewhart). I want to know about the noise: What noise is significant? How do I introduce noise into the experiment? How can I be robust to the noise? I agree with G.E.P. Box:

"Block what you can, randomize what you cannot".

By this he means that if you can identify the noise, confounding the noise with the block is more effective than randomization, because the effect remains assignable. For noise you can't identify (and it is worth asking why you can't), randomize.

I suggest a couple of papers:

Youden, W. J. (1972), "Randomization and Experimentation," Technometrics, Vol. 14, No. 1.

Hwan, Marilyn (2000), "To Randomize or Not to Randomize, That is the Question," ASQ Statistics Division Newsletter, Vol. 17, No. 1, p. 26.

"All models are wrong, some are useful" G.E.P. Box
SDF1
Super User

Re: Verifying Random Run Order and Design Validity in JMP DOE

Hi @BayesKnight ,

  In addition to what has already been well stated by others, I would offer a couple of other thoughts to consider.

1. How do you plan to quantify the DOE order BEFORE you run the experiment? If the runs depend on the state of something (sample condition, equipment, oven temperature, or something else you aren't taking into account in the DOE), then this will show up in the data, but only AFTER you do all the experimental runs, and it can be evaluated as previously discussed.

2. If you really want to test the randomness of a DOE beforehand, you could generate hundreds or thousands of DOEs and, for a design with runs 1 through 20, gather statistics on how frequently each unique run lands in each position (see the sketch below). If you generate enough DOEs, you shouldn't see any pattern in the selection process. But then, which DOE do you choose? Which one is the most random? You could set up a JSL script to do this, and it shouldn't take very long. You should of course be very careful about how you define your DOE so that you don't have any hidden factors that can mess it up. And even if you did this analysis, it still would never account for a hidden variable that does influence the run order.
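
As a rough illustration of that tally (a Python sketch with assumed values; the chi-square figure is only a crude summary, since the cells of the table are not independent):

```python
import numpy as np

n_runs, n_designs = 20, 10_000
rng = np.random.default_rng(42)
counts = np.zeros((n_runs, n_runs), dtype=int)   # counts[run_id, position]
for _ in range(n_designs):
    order = rng.permutation(n_runs)              # one randomized run order
    counts[order, np.arange(n_runs)] += 1        # tally where each run landed
expected = n_designs / n_runs                    # flat table if truly uniform
chi2 = ((counts - expected) ** 2 / expected).sum()
print(f"cell counts range: {counts.min()}..{counts.max()}, "
      f"chi-square vs uniform: {chi2:.0f}")
```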

3. After the DOE is done, there are several ways to evaluate the randomness of the runs, as mentioned already (the residual-by-row plot is a great place to start). I would also point out that you could add a "run order" column and include it in the analysis to determine whether there is any statistically important dependence of the results on the run order. You could even include a "null factor" (read up on autovalidation in JMP; there is JSL code available, but it is not a built-in feature), which is a completely random factor: any factor that comes in as statistically no more important than this null factor can be ignored. A sketch of this idea follows.
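
Here is a hedged sketch of the run-order-plus-null-factor idea on simulated data (all values are made up, and this is my own Python illustration rather than JMP's actual autovalidation workflow):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 16
x1 = rng.choice([-1.0, 1.0], size=n)           # hypothetical two-level factor
order = np.arange(n, dtype=float)
order = (order - order.mean()) / order.std()   # standardized run-order column
null = rng.normal(size=n)                      # completely random "null factor"
y = 3.0 * x1 + rng.normal(size=n)              # simulated response: only x1 is real

X = np.column_stack([np.ones(n), x1, order, null])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
s2 = resid @ resid / (n - X.shape[1])          # residual variance estimate
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
for name, b, s in zip(["intercept", "x1", "run order", "null"], beta, se):
    print(f"{name:>9}: t = {b / s:+.2f}")      # terms weaker than 'null' are suspect
```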

  Pre-planning is key, and it often ends up taking the larger share of the discussion time. A well-planned DOE will be less time-consuming to analyze than a poorly planned one, and you will gain more insights faster as well. In this regard, consider things like: Is there reason to think that order matters, and if "yes", why? Where is it coming from, and how can we account for it in our design? What other factors might affect the results, and can we account for them? Where does the noise in our signal come from? How can we minimize the noise and/or take it into account in the DOE?

Hope this helps!

DS
