Hi Statman,
thanks again for your explanations. I do understand the classical DOE perspective: in a three-factor RSM the model is predefined, the focus is on optimization rather than on model hunting, and things like adjusted R-square, RMSE, and practical relevance are important for adequacy. I am aligned with that.
My problem is not the statistics. It is the practical workload. In my field (protein stability work), I often have ~25 responses per dataset—aggregation levels, charge variants, hydrodynamic size, Tm, etc.—grouped by treatments. The DOE setup itself stays the same (three factors), but the actual response behavior changes with each protein. So I cannot simply reuse a Workflow Builder template; each protein dataset has to be evaluated individually, even though the quadratic model structure is identical.
I also understand contour or surface plots for RSM. These are helpful for intuition. But they are essentially qualitative, because they show the predicted surface in 2D:
y_hat(x1, x2 | x3 = constant)

where

- y_hat is the predicted response from the model,
- x1, x2, x3 are the three factors,
- and “| x3 = constant” just means x3 is held fixed.
This is fine for a single response, but with 20–25 responses it becomes hard to base decisions on contour plots alone.
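For a single response this view is also easy to get in JSL. A minimal sketch, assuming a prediction formula has already been saved to a column that I am hypothetically calling "Pred Formula Y1":

```jsl
// Sketch only: contour view of one saved prediction formula,
// with the remaining factor held at its current setting.
// "Pred Formula Y1" is a placeholder for a column created via
// Prediction Formula from a fitted model.
Contour Profiler(
	Y( :Name( "Pred Formula Y1" ) )
);
```

But repeating this 20–25 times per dataset is exactly the workload I am describing.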
In practice, I rely more on the Prediction Profiler (and desirability) because it gives me a quantitative way to evaluate and optimize multiple responses at once. For example, for k responses I have predicted values y1_hat, y2_hat, …, yk_hat, and each has its own desirability function d1(y1_hat), d2(y2_hat), …, dk(yk_hat). The overall desirability that I try to optimize is:
D(x) = (d1 * d2 * … * dk)^(1/k)
with x = (x1, x2, x3).
This allows me to numerically optimize the factor settings for multiple stability readouts at once. That is something contour or surface plots cannot easily do when the number of responses is high.
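As a quick numeric illustration of the geometric mean (the values here are made up): with k = 3 and individual desirabilities 0.8, 0.9, and 0.7,

```jsl
// Hypothetical desirability values for three responses
d = {0.8, 0.9, 0.7};
D = (d[1] * d[2] * d[3]) ^ (1 / N Items( d ));
Show( D ); // about 0.80: one weak response pulls the overall score down
```

which is why a single poorly met response visibly lowers the overall score.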
So the core issue is not methodology. It’s the time spent clicking. To get from raw data to profiler and desirability-based decisions, I currently need more than an hour per dataset just to run the same quadratic model for each response. Across several datasets, this is four or five hours of mostly repetitive clicking—even though the model definition doesn’t change.
Workflow Builder doesn’t solve this for me, because even if the factors stay identical, each protein dataset yields different responses and I still need to review model adequacy individually.
That’s why I’m looking for a more efficient way—ideally via JSL—to:

- loop over a list of response columns,
- apply the same quadratic RSM model in Fit Model to each response and evaluate the fit, for example by minimizing p-values or maximizing adjusted R-square,
- and automatically generate profilers (and desirability functions),

without having to press “Exclude” again and again.
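To make concrete what I mean, here is the rough, untested skeleton I have in mind. The factor columns X1, X2, X3 and the response list are placeholders for my real column names, and the report navigation for adjusted R-square may need adjusting per JMP version:

```jsl
// Untested sketch: the same quadratic RSM fit looped over many responses.
dt = Current Data Table();
responses = {"Tm", "Monomer", "HMW"}; // placeholder response column names

For( i = 1, i <= N Items( responses ), i++,
	fm = dt << Fit Model(
		Y( Column( dt, responses[i] ) ),
		Effects(
			:X1 & RS, :X2 & RS, :X3 & RS,
			:X1 * :X2, :X1 * :X3, :X2 * :X3,
			:X1 * :X1, :X2 * :X2, :X3 * :X3
		),
		Personality( "Standard Least Squares" ),
		Invisible,
		Run
	);
	// Pull adjusted R-square from Summary of Fit to flag inadequate fits
	// (display-box subscripting may differ between JMP versions)
	r2adj = Report( fm )["Summary of Fit"][Number Col Box( 1 )] << Get( 2 );
	Show( responses[i], r2adj );
	fm << Prediction Formula; // saves a "Pred Formula <name>" column
	fm << Close Window;
);

// Then one Profiler over all saved prediction formulas, with desirability:
prof = Profiler(
	Y(
		:Name( "Pred Formula Tm" ),
		:Name( "Pred Formula Monomer" ),
		:Name( "Pred Formula HMW" )
	),
	Desirability Functions( 1 )
);
prof << Maximize Desirability;
```

If there is a cleaner or more robust pattern—especially for extracting the fit statistics—that is exactly what I am hoping to learn.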
So my question is simply: is there a way in JMP Pro to batch this process? If you know a JSL pattern or example script that fits this scenario, it would help me a lot. I want to stay aligned with DOE best practice—I just want to reduce manual clicking in the process.