SDF1
Super User

Split Plot DOE discussion time

Hello JMP Community,

 

  Running JMP Pro 18.1.1 on Windows 11. Apologies ahead of time for the long post, but I want to provide as much information up front as possible. The references to factor and response columns are for the attached anonymized data table, for those interested in looking in more detail (hence some vagueness in my description and the anonymized data table).

 

  I have some general concerns and questions to bring up regarding a split-plot DOE that I'm helping my colleagues analyze. Some are not too dissimilar from those in the recent post here. Unlike the "clean" and "nice" examples in the JMP help data, this is a real-life industrial example where things don't always work out in an ideal way.

 

Background: I have some chemist colleagues who wanted to run a DOE testing six factors and measuring four different responses. The purpose of the DOE was to determine what combination of those six factors, and at what levels, could/should be set in order to achieve the desired results. One of the responses is what I would call, for lack of a better term, the "primary" response, Y__1 (the other responses have dependencies on this primary response). The dependency is NOT linear for any of the other responses.

 

DOE design: We have a 30-run custom DOE (the chemists determined 30 to be the allowable number of runs) with six factors, one of which (oven temperature) is hard to control, turning what would otherwise be a completely randomized design into a split-plot design.

  • X__5 is the hard-to-change oven temperature factor.
  • Runs are grouped into 4 whole plots with 7 or 8 runs within each whole plot.
  • Three of the four responses have lower design limits, with a goal to maximize their response (Y__1, Y__2, Y__3).
  • One response has an upper design limit, with a goal to minimize its response (Y__4).
  • One response has a lower detection limit of 10 and absolute lower limit of 0 (Y__3). 
  • One response has an upper detection limit of 280 (Y__4).
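
For reference, here is one standard way to write the split-plot structure above; the notation is mine, not from the post, with run j nested in whole plot i:

```latex
\[
  y_{ij} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + w_i + \varepsilon_{ij},
  \qquad w_i \sim N(0,\sigma_w^2), \quad \varepsilon_{ij} \sim N(0,\sigma^2)
\]
```

Here w_i is the whole-plot error shared by all runs at one X__5 setting, and eps_ij is the run-to-run error; keeping w_i in the model is exactly what the split-plot analysis preserves.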

 

Observations of data: The DOE was conducted and here are some observations of the data.

  • Y__1 is normally distributed. Recall this is also the "primary" response.
  • Y__2 is NOT normally distributed, but is best fit by a SHASH distribution. JMP does have a SHASH transform function to turn it into a normally distributed data set as well as an inverse SHASH transformation. Y__2 is not normal because it strongly depends on Y__1, and as Y__1 changes (decreases), Y__2 increases, goes through a maximum and then begins to decrease.
  • Y__3 is NOT normally distributed because of the physical limit of 0 and detection limit of 10. This data is best characterized by a log-normal distribution. If using that as the Distribution in the GenReg platform, the 0s need to be recoded as something very small and close to 0, but not exactly 0 (1e-12 or so). (A censored-likelihood alternative is sketched after this list.)
  • Y__4 is NOT normally distributed and is also better characterized by a log-normal distribution (using the updated continuous fit functions that can deal with detection limits, vs. the legacy fitters, which suggest a normal 2-mixture instead).
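
As a side note on the recoding issue above: a detection limit can also be handled directly in the likelihood rather than by recoding 0s, which is what GenReg's Censor column does. Purely as a hedged illustration outside of JMP, here is a minimal Python sketch (toy data, invented values) that fits a lognormal to a left-censored response like Y__3 by treating every value below the limit of 10 as "known only to be below 10":

```python
# Hypothetical sketch: maximum-likelihood fit of a lognormal with left-censoring
# at the detection limit, instead of recoding 0s as 1e-12. Toy data, invented.
import numpy as np
from scipy import stats, optimize

DL = 10.0                                   # lower detection limit for Y__3
y = np.array([0.0, 0.0, 14.2, 35.7, 88.1, 120.4, 12.9, 41.0])
censored = y < DL                           # True where we only know y < DL

def neg_log_lik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)               # keep sigma positive
    # Observed runs contribute the lognormal density; censored runs
    # contribute P(Y < DL), the CDF evaluated at the limit.
    ll_obs = stats.lognorm.logpdf(y[~censored], s=sigma, scale=np.exp(mu)).sum()
    ll_cen = censored.sum() * stats.lognorm.logcdf(DL, s=sigma, scale=np.exp(mu))
    return -(ll_obs + ll_cen)

start = [np.log(np.median(y[~censored])), 0.0]
res = optimize.minimize(neg_log_lik, x0=start)
print("mu =", res.x[0], "sigma =", np.exp(res.x[1]))
```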

 

Discussion points/things to consider: Below are just some topics and discussions points that have either been brought up to me about the DOE results, or that I'm wondering how to manage/work with as I analyze the data.

  • As was mentioned in the post linked above, because this is a split-plot DOE, it's important to keep the whole plot & random effects in the model because it's not a truly random DOE. Removing them can potentially lead to erroneous conclusions.
    • Other platforms like Generalized Regression and Generalized Linear Mixed Models can't handle the random effects, but they can treat the whole plot as a fixed factor.
      • Does this mean I should stay away from using those platforms as a way to generate alternative models?
    • Unfortunately, the only modeling platform that can handle the problem of detection limits for two of my responses is GenReg by using a Censor column. The others can't manage this, nor can they manage the non-normal distributions of 3 of my responses.
      • NOTE: the censor column for Y__3 is different from Y__4 because the detection limits are different.
    • This results in a dilemma: use the Mixed Model (or SLS) approach, which handles the whole plot & random effects but can't manage the non-normally distributed data or the detection-limit problem; or use an alternative platform like GenReg, which can handle the censored data and the detection-limit problem but can't handle the whole plot & random effects of the split-plot design. (A sketch of the random-effects side of this trade-off follows this list.)
  • When running either the standard least squares or mixed model platforms with the data (or even GenReg and GLMM), all model profilers suggest that, in order to maximize Y__1, Y__2, and Y__3 and minimize Y__4, the factors should be set to extreme values (either low or high; which one changes depending on the platform). This doesn't make sense from a domain-knowledge perspective. Based on the ranges chosen for the DOE, we anticipated some or all of the factors to be between the extremes for optimal responses.
    • At present each response is being treated equally (25%, 0.25) in the set/maximize desirability options. This can change, but doesn't have a large effect on the profiler outcome -- it still suggests setting the factors at extreme values most of the time.
  • The chemists have proposed augmenting the DOE by adding center points. I am not sure this would solve the problems we are facing in the analysis, but regardless:
    • I have tried to see if this is a possibility, but when I try, I have to include the whole plots as a factor, and my only Augmentation Choice is "Augment"; all the others are grayed out.
      • Why do I not have the other options available?
      • Is there a way to add center points or do a space filling without adding some kind of bias to the DOE?
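
On the random-effects side of the dilemma above, and purely for illustration: the same split-plot structure can be fit outside JMP as a linear mixed model with a random whole-plot intercept. This is a hedged sketch, not the JMP workflow; the file name and the WholePlot column name are assumptions mirroring the anonymized table.

```python
# Hedged sketch: split-plot analysis as a linear mixed model with a random
# intercept per whole plot. File and column names are assumed.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("doe_runs.csv")            # assumed export of the data table

# The random whole-plot intercept encodes the restricted randomization on
# X__5; dropping it would treat the design as completely randomized.
model = smf.mixedlm("Y__1 ~ X__1 + X__2 + X__3 + X__4 + X__5 + X__6",
                    data=df, groups=df["WholePlot"])
fit = model.fit(reml=True)                  # REML, as in JMP's Mixed Model
print(fit.summary())
```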

 

Questions/issues I'm struggling with: Overall, here are the issues I'm trying to manage when analyzing the data, particularly in trying to figure out how to keep the profiler from suggesting extreme settings for the factors.

  • Should I stick with the Mixed Model platform because of the split-plot design no matter what?
  • Is it at all useful to try and use other platforms (GenReg, GLMM, etc.) to try and analyze the data?
    • If it is worthwhile, how best to try and include the whole plot & random effects inherent in the DOE?
  • What are some best practice methods for managing the strong dependence Y__2, Y__3, and Y__4 have on Y__1 in the analysis?
  • What are some best practice methods for managing the non-linear responses of Y__2, Y__3, and Y__4? Transforming them, or is there some other/better way?
  • I'd prefer to get a model for all responses at once, like how SLS can do it, but I don't think that platform is the correct platform to use given the non-normal distributions as well as dependence of the other responses on Y__1. I can save the prediction formulas for the responses from whatever platform I use and then use the Graph > Profiler to generate a prediction profiler, but I'm still stuck with the profiler suggesting extreme settings for the factors, which doesn't make sense.
    • Why is the profiler suggesting extreme values? (One possible explanation is sketched after this list.)
    • Does this indicate a problem with the DOE? If so, what is/are the problem(s)?
    • Can the extreme profiler suggestions be resolved somehow?
  • If augmenting the design by performing more experiments is one way to go, how can I access the other augmentation options? Right now, all I can do is change the upper/lower factor settings and define the number of additional runs; I can't do anything else.
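
One plausible (hedged) explanation for the extreme settings, worth checking against the model specification: if a fitted model contains only main effects and two-factor interactions, its prediction is monotone in each factor once the others are fixed, so maximizing the desirability of that response always drives the factors to a boundary of the design space; an interior optimum generally requires curvature (quadratic) terms, which connects to the center-point/augmentation idea above. The toy Python sketch below, with entirely invented coefficients and limits, shows an optimizer driving a first-order model to a corner:

```python
# Toy sketch (all numbers invented): with a purely first-order model, the
# desirability optimum always sits at a vertex of the coded factor space.
import numpy as np
from scipy.optimize import minimize

beta = np.array([1.0, 0.4, -0.7, 0.2])     # invented intercept + main effects

def predicted(x):
    return beta[0] + beta[1:] @ x           # first-order model, coded units

def neg_desirability(x):
    L, T = 0.0, 2.5                         # invented lower limit and target
    d = np.clip((predicted(x) - L) / (T - L), 0.0, 1.0)
    return -d                               # maximize d => minimize -d

res = minimize(neg_desirability, x0=np.zeros(3), bounds=[(-1, 1)] * 3)
print(res.x)                                # lands on a corner: [1, -1, 1]
```

Note that with several competing responses the combined desirability can have an interior compromise even for first-order models, so consistently extreme suggestions may mean the responses aren't trading off strongly across the chosen ranges, or that curvature terms are missing.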

 

  Thank you for taking the time to read through this post. Any feedback/thoughts/suggestions are much appreciated.

 

Thanks!

DS

SDF1
Super User

Re: Split Plot DOE discussion time

Hi @Victor_G ,

 

  Thanks for taking the time to write your detailed response and provide some thoughts on the problems at hand. I've been working through some of the suggestions you provided as well as pursuing some other avenues and ideas.

 

  In response to your thoughts, here is some feedback:

 

  1. Identify patterns and anomalies: Yes, I have looked into that, and one thing I noticed is that Whole Plots 2 & 3, where X__5 is at its "low" setting, show generally poor performance across the 4 responses. This is not too surprising for us. It's good that we had this in the factor settings so we could see a large enough "signal" change in the responses. As mentioned before, 3 of the 4 responses are correlated (non-linearly) and dependent on the first response, Y__1, so seeing those correlations in the Multivariate platform is not new for us. One of the difficulties here is that one of the responses, Y__2, is non-monotonic as a function of Y__1, meaning that Y__2 can have the same value at different Y__1 values. For example, Y__2 can have a value of 24 when Y__1 is either ~27 or ~31. Y__3 and Y__4 don't have this behavior, yet they are still non-linearly dependent on Y__1. There is one potential outlier based on the Y__3 response, but so far we have no special-cause reason to eliminate that run or to try and re-do it. However, it is something I will ask my colleagues to look further into.
  2. First model iteration:
    • I agree that the Whole Plot effect is random -- ideally, the DOE could have been done as completely randomized, but because X__5 is hard to change (at least when trying to run this DOE as efficiently as possible), it ended up being a split-plot design. Anyway, based on what we know about some of the responses and how they can be influenced by X__5, we can't really drop the Whole Plots effect. And although Y__2 doesn't really need it to be included, it is easier to fit the entire Mixed Model as a whole.
    • Thanks for the thoughts on the transformations. In testing the SHASH transformation for Y__2, it actually did not fit as well as fitting Y__2 without transformation. I have been evaluating the residuals as well as other model assessment methods, like a Bland-Altman analysis (sketched after this list) and the 95% CI for the actual-by-predicted plots. So far, the Mixed Model seems to perform the best across the model assessment methods.
  3. Models/exploration/comparison/selection
    • While keeping the random effect in the model, the Mixed Model still seems to do a little better than Least Squares. Both predict slightly non-realistic values for one of the responses (a different one for each model). They're not drastically unrealistic, but given what we know, especially about response Y__1, those values should not be achievable.
    • Thanks for the suggestion on the weighted column based on how far from the detection limit the response is. I like that and have adopted it when comparing different models while trying to account for the detection limit issue.
    • Based on the censored/weighted responses as well as that some require the random effect and one doesn't, I might end up blending prediction columns from different platforms, but I need to further evaluate model performance.
  4. Validation & Augmentation:
    • Definitely will want to validate the model -- that's always what I recommend (I can't force them to do anything, just recommend).
    • As far as the profiler goes, it's just strange and unusual that, according to the optimization criteria, it always goes to extreme values.
    • As far as augmentation goes, I was hoping to just take the original DOE and do some space filling runs, but it won't let me do that because of the hard to change factor. Instead, if I take your approach and actually change the model by adding in quadratic effects, then I can have JMP generate runs that aren't just repeating the corners/edges of the design space and actually put runs inside the space, which I hope will help.
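
For anyone following along, here is a minimal sketch of the Bland-Altman-style check mentioned in point 2 above: plot actual minus predicted against the pair average, with the mean bias and approximate 95% limits of agreement. The data below is invented stand-in output, not the DOE results.

```python
# Hedged Bland-Altman sketch: difference vs. average of actual and predicted,
# with mean bias and ~95% limits of agreement. All data here is simulated.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
actual = rng.normal(30, 3, 30)              # invented measurements
predicted = actual + rng.normal(0, 1, 30)   # stand-in model predictions

diff = actual - predicted
avg = (actual + predicted) / 2
bias = diff.mean()
loa = 1.96 * diff.std(ddof=1)               # limits of agreement half-width

plt.scatter(avg, diff)
for level in (bias, bias - loa, bias + loa):
    plt.axhline(level, linestyle="--")
plt.xlabel("(actual + predicted) / 2")
plt.ylabel("actual - predicted")
plt.show()
```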

Your other considerations:

  • Sorry, I should have been a bit clearer on this one. All the responses can be measured independently of one another -- they could theoretically be done simultaneously (in parallel) or sequentially (in serial); the order of the testing is actually irrelevant. What I was trying to explain is that Y__2, Y__3, and Y__4 all depend on Y__1, each in a non-linear way. It would be better to consider using the factors plus the Y__1 prediction formula to predict the other responses (a minimal two-stage sketch follows this list). But yes, there is the risk of inflating the prediction error.
  • I had not thought of using the Partition platform to analyze the data. While not taking the results too seriously, I think it is helpful to look at that analysis and use the information to our advantage. Thanks! Some of the other methods don't work as well, but I think that might be partly because we don't have that many runs.
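
A minimal sketch of that two-stage idea, under assumed file and column names: fit Y__1 from the factors, then regress Y__2 on the Y__1 prediction, with a quadratic term so Y__2 can rise through a maximum and fall again as described earlier. As noted above, this stacks the stage-1 prediction error into stage 2, so it's one illustration rather than a recommended estimator.

```python
# Hedged two-stage sketch; file and column names are assumed.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("doe_runs.csv")                     # assumed export

# Stage 1: predict the "primary" response from the factors.
stage1 = smf.ols("Y__1 ~ X__1 + X__2 + X__3 + X__4 + X__5 + X__6",
                 data=df).fit()
df["Y1_hat"] = stage1.predict(df)

# Stage 2: quadratic in the Y__1 prediction to allow the non-monotonic
# Y__2 behavior described above.
stage2 = smf.ols("Y__2 ~ Y1_hat + I(Y1_hat**2)", data=df).fit()
print(stage2.summary())
```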

 

  I fully agree about models being useful and not perfect -- but that is the whole crux of what I'm trying to do with this data set: generate a useful model. When the prediction profiler only suggests extreme settings (and inconsistently so between the SLS and Mixed Model approaches), I wanted to look for ideas on how to learn as much as I can from the data.

 

  Yes, your response was very helpful and provided a lot of ideas for how to evaluate things. Thank you. I appreciate the discussion and effort to engage in the data and topic!

 

Thanks!

DS

SDF1
Super User

Re: Split Plot DOE discussion time

I don't have easy access to the article, so I have not been able to read it.

 

I am reading this article, which has been somewhat helpful: Split-Plot Designs: What, Why, and How, B. Jones and C. J. Nachtsheim, Journal of Quality Technology, Vol. 41, No. 4, p. 340.

 

 

MRB3855
Super User

Re: Split Plot DOE discussion time

Hi @SDF1: I think we need some clarity around exactly what is meant when we talk about the normality of the response (we'll call it Y for now). To simplify, let's just consider the OLS (Ordinary Least Squares) multiple regression model. We can talk about normality in at least two different ways (for brevity, I'll leave out the subscripts and the iid assumption):

 

     1. Y = a + b*x1 + c*x2 + d*x3 + e, where e is normally distributed with mean 0 and variance sigma^2.

 

          Equivalently, we could say:

 

     2. Y is normally distributed with mean a + b*x1 + c*x2 + d*x3 and variance sigma^2.

 

Discussions about the distribution of the response implicitly refer to the model as described in (2) above. But that is very different from saying that your Y column should show a normal distribution in the Distribution platform (it has to be corrected for a mean that varies with the X's first). Whichever way the model is described, the way we assess normality is the same: by exploring the residuals. So the normality (or not) of the response matters (in fact, parameter estimation in your mixed model via Fit Model/REML assumes normality), and it shouldn't be confused with how we assess normality.
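
To make that distinction concrete, here is a small Python demonstration with simulated data (nothing here comes from the actual DOE): Y is generated from a two-level design with normal errors, so the raw Y column fails a normality test because its mean moves with the X's, while the regression residuals pass.

```python
# Toy illustration: the raw response looks non-normal because its mean shifts
# with the X's, while the OLS residuals are normal. All data simulated.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
X = rng.choice([-1.0, 1.0], size=(200, 3))  # two-level factor settings
y = 2 + X @ np.array([5.0, -3.0, 1.5]) + rng.normal(0, 0.5, 200)

fit = sm.OLS(y, sm.add_constant(X)).fit()

print(stats.shapiro(y).pvalue)              # tiny: raw Y looks non-normal
print(stats.shapiro(fit.resid).pvalue)      # large: residuals look normal
```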

SDF1
Super User

Re: Split Plot DOE discussion time

Hi @MRB3855 ,

 

  Thanks for your response, and I completely agree with you. I will try to be a little more clear about my questions regarding normality of the responses.

 

  I actually don't care that the responses aren't normal -- they are what they are, for several reasons that are known and not surprising. I don't believe I indicated that the non-normal responses should be normal. If I did, my apologies; I don't think they need to follow any particular distribution. They are what they are, and there is currently no indication of any special cause that would remove data from the DOE.

 

  Y__2 is non-normal because it is highly dependent (in a nonlinear way) on Y__1. Y__3 is non-normal in part because it physically cannot have a value < 0, and due to the design of the DOE, there are some measurements near this physical limit (it's also highly dependent on Y__1). Y__4, on the other hand, is non-normal in part because it is also highly dependent on Y__1 and has an upper detection limit of 280 -- past that, the results are maxed out, and because of the design there are some runs where the response is maxed out. This is all well and good; so far, to me, the DOE has done exactly what it was designed to do.

 

  My questions/concerns regarding the normality of the responses are more about how to manage the non-normal responses in the modeling step. Some resources in the JMP Community & Help that I've read seem to give conflicting recommendations, especially when it comes to split-plot designs where at least one factor is hard to change. Some of the resources recommend keeping the whole plot & random effects in the analysis (hence you must use the mixed model method, but then you can't account for the non-normal response distributions or the detection limits), and some suggest using the GenReg platform, where you can account for the non-normal distributions (and detection limits) but lose the random effects in the model.

 

  There are obviously trade-offs to the different approaches, and I'm mostly concerned with minimizing how those trade-offs affect the analysis of the DOE when it comes to the normality of the responses. If I use the GenReg platform, I can't account for the random effects inherent in the split-plot design and am therefore susceptible to both Type 1 and Type 2 errors: I could conclude something is there when it's not, or conclude something is not there when it is. On the other hand, the mixed model can't handle the non-normal response distributions (but can handle the whole plot & random effects), which can sometimes lead to predictions that aren't physically real, like values < 0, which don't make sense.

 

  So, how does one handle this in a real example? The sample data tables in JMP are nice and ideal, providing a clear analytical path, but they don't really address gray areas like this, where the analytical path isn't so straightforward.

 

  Ultimately, it would be helpful to have a model with low error whose residuals are normally distributed and centered around 0. The mixed model platform gives this (but also gives non-physical response predictions), whereas the GenReg platform does not, but does provide physically valid response predictions. Again, how does one handle/manage this in a real, non-ideal situation?

 

  Keep in mind that I have several other more broad questions/concerns as well. 

 

  I hope this has helped to clear up some points, and also to direct the discussion to the other, more general questions: augmentation, best-practice approaches to managing analysis when your data doesn't fit nicely into one model or another, and why the profiler always suggests extreme settings for optimal responses -- is this an indication that something larger is wrong?

 

DS

Rily_Maya
Level II

Re: Split Plot DOE discussion time

Personal Perspective on Split-Plot Experimental Design:
Before conducting the formal designed experiment, fix all factors at their normal production levels and collect data in blocks, then analyze the significance of the random block effect. If the random block effect is not significant, proceed with conventional experimental design methods for the formal experiment, but impose randomization restrictions on some factors during execution and use standard analytical methods. If the random block effect is significant, adopt a split-plot design. (A sketch of such a pre-test follows.)
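
For what it's worth, one way such a pre-test could look outside JMP (a hedged sketch; the file and column names are assumed): compare an intercept-only model with and without a random block effect via a likelihood-ratio test, halving the chi-square p-value because the null hypothesis (block variance = 0) sits on the boundary of the parameter space.

```python
# Hedged sketch of the block-effect pre-test; file and column names assumed.
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

df = pd.read_csv("pilot_blocks.csv")        # assumed pilot-run data

ols_fit = smf.ols("Y ~ 1", data=df).fit()   # no block effect
mix_fit = smf.mixedlm("Y ~ 1", data=df,
                      groups=df["Block"]).fit(reml=False)  # random block

lr = 2 * (mix_fit.llf - ols_fit.llf)        # likelihood-ratio statistic
p = 0.5 * stats.chi2.sf(lr, df=1)           # boundary-corrected p-value
print("LR =", lr, " p =", p)
```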
