Discussions

CurseOfCluster6

Dear JMP community,

I would like to characterize and optimize a biochemical reaction (enzyme assay). Therefore, I'm planning to do a screening first to scan for relevant factors. For now, I've decided to screen 8 factors (2x two-level categorical, 1x three-level categorical, 5x continuous).

I am familiar with classical two-level screening designs, but have never used a custom design before. My problem is the three-level categorical factor. It is the reason I chose a custom design, and it seems to increase the number of required experiments considerably: around 20 experiments for main effects only and over 50 experiments for main effects plus two-factor interactions (with power over 0.8 for each factor).

Is there another option to find a compromise? I thought about trying to remove one level of the three-level categorical factor by pretests, only focusing on main effects for now, or dealing with a lower power of the test.

Thank you very much!

(JMP Version: Student Edition 18)

statman · | Posted in reply to message from CurseOfCluster6 11-10-2025

Of course, you have provided virtually no information on the situation (e.g., response variables, factors), so it is impossible to give specific advice. My first question is: why a 3-level categorical factor (which seems to be your question too)? Anytime you choose more than 2 levels, I ask myself: are you trying to "pick a winner", or are you interested in the factor effect for further study? Of the 3 levels, can you pick the two that are extremes to expose that factor's effect? I wouldn't do a pre-test, because conclusions from that may not be "best" given potential interaction effects (inference space issues). To a small extent, you are biasing the study toward the 3-level factor (more DFs), and why estimate a quadratic effect for a categorical factor...nonsensical. Test it at 2 extreme levels. If that is significant, fine-tune the levels in subsequent experiments.

My partial advice: Design multiple options (easy to do with JMP) and, for each option, consider: What do you get (e.g., What effects can be estimated? Which are confounded? Which are not in the study?)? What are the resource requirements? Compare the potential for knowledge gained vs. resource requirements. Predict ALL possible outcomes (not just what you think you'll get). Predict what you will do for each possible outcome. Use this thought process to choose an experiment and run it. Analyze, iterate.

"The best design you'll ever design, is the design you design after you run it." Ross

No one knows the right design...if you do you probably don't need to run it.

BTW, power calculations assume you have actual knowledge of the variance of the response. Do you?

"All models are wrong, some are useful" G.E.P. Box

CurseOfCluster6 · | Posted in reply to message from statman 11-10-2025

Dear @statman,

thank you! Sure, I can provide more information:
Enzyme activity is my response variable, basically meaning how well the reaction goes.
All factors are reaction-influencing parameters:

- 5x continuous with high and low levels: pH, temperature, concentration of component A, B and C
- 1x two-level categorical: type of component A (levels: type A.1 or A.2)
- 1x three-level categorical: type of component B (levels: type B.1, B.2 or B.3)
- 1x two-level categorical: presence of another component D (levels: yes and no)
Hope this is understandable.

So I chose the three-level categorical factor because I have to check all three types of component B. I would like to get information about whether the type of component B is relevant at all and which one I want to pick for studying the effect in more detail (the "winner"). In addition, I would like to determine the effect of the concentration of components A and B, and I did this by choosing categorical factors for the types and continuous factors for the concentrations.

Ok, I will design multiple options and compare the results. Yes, I was able to determine the variance of the response experimentally.

statman · Nov 11, 2025 6:36 AM

Does changing the concentration of one component (e.g., A, B or C) impact the concentration of another component? If so, you may need a mixture design. Again, the first experimental strategy, as you suggest in your first post, is screening. You don't need to test all of the levels of B in order to ascertain if that variable should be studied further. Do the extremes of the 3 levels in the screening phase.

CurseOfCluster6 · | Posted in reply to message from statman 11-11-2025

No, changing the concentration of one component does not impact the concentration of other components. Regarding component B: there are no extremes. The levels are just different types of B, which cannot be "ranked".
Thanks!

statman · Nov 11, 2025 7:21 AM

So you have no hypotheses (or predictions) about which level would be best or which would be worst?...Perhaps you should study the theoretical chemical/physical mechanisms of this factor and the various levels prior to experimentation?

Victor_G · Nov 12, 2025 6:30 AM

Hi @CurseOfCluster6,

Welcome to the Community!

I wanted to reply to your topic to offer a different yet complementary view of your situation and possible options.

First, regarding the 3-level categorical factor: if you're in a screening phase, I would recommend, like @statman, reducing the number of levels to 2 (if possible) by keeping only the most different types of component B.
Do you already have preliminary tests that could help you discriminate between the levels? Or are these components characterized well enough to select the two most dissimilar ones (based on chemical/physical parameters, molecular descriptors, etc.)? Very often in a 3-level scenario, two levels end up having a similar influence on the response; it is quite rare to find three levels with very different impacts on the response. Depending on whether you can reduce the number of levels to 2, you will have more or fewer design options.
Given your factors and levels and your screening objective (focusing on main-effect estimation), you have 9 terms to estimate (or 10 if you keep all 3 levels of the categorical factor). Depending on your experimental budget and the degree of certainty you want in your results, screening designs give you options ranging from a minimum of 10 runs to 20 or more.
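To make the term count concrete, here is a minimal Python sketch (not JMP output) that counts model parameters for the eight factors described in this thread. The factor names are shorthand chosen for illustration, and the only assumption is the usual bookkeeping: one parameter per continuous or two-level factor, k - 1 parameters for a k-level categorical factor, and the product of the two factors' parameters for each two-factor interaction.

```python
from itertools import combinations

# Parameters (degrees of freedom) per factor: 1 for a continuous or
# two-level factor, k - 1 for a k-level categorical factor.
# Names are illustrative shorthand for the factors listed in the thread.
factors_3_level_b = {
    "pH": 1, "temperature": 1, "conc_A": 1, "conc_B": 1, "conc_C": 1,
    "type_A": 1, "type_B": 2, "presence_D": 1,
}
factors_2_level_b = {**factors_3_level_b, "type_B": 1}


def main_effect_terms(df_by_factor):
    """Intercept plus one parameter per main-effect degree of freedom."""
    return 1 + sum(df_by_factor.values())


def two_factor_interaction_terms(df_by_factor):
    """Each pairwise interaction needs df_i * df_j parameters."""
    return sum(a * b for a, b in combinations(df_by_factor.values(), 2))


for label, dfs in [("B at 3 levels", factors_3_level_b),
                   ("B at 2 levels", factors_2_level_b)]:
    me = main_effect_terms(dfs)
    full = me + two_factor_interaction_terms(dfs)
    print(f"{label}: {me} terms (main effects), {full} terms (with all 2FIs)")
```

The number of terms is a lower bound on the number of runs, before any runs are left over to estimate error. Counting all two-factor interactions shows why the interaction model from the original question becomes so expensive.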
Here are some design possibilities to consider:
- If you can reduce the number of levels to 2 for all your categorical factors, Mixed-Level Screening Designs can be a very interesting option. With three 2-level categorical factors and 5 continuous factors, you can create a design with 10, 16 or 32 runs.

  The 10-run option may not be recommended, as some interaction effects may be aliased (completely correlated) with some main effects. The 16-run design is interesting, as it allows the estimation of main effects, as well as the detection of interactions and quadratic effects (for continuous factors), provided these effects are large enough.
- If you can't reduce the number of levels to 2 for the 3-levels categorical factor, you have several options left:
  - You can use an L18 Hunter classical screening design with 18 runs, using the Screening Designs platform and choosing the option "Choose from a list of fractional factorial designs".
  - You can create a Custom Design with your 8 factors, with a recommended number of runs of 18, or more if you can afford more experiments. Take advantage of the Design Explorer to try different run sizes and compare how each added run improves the design's performance.

Regarding your need to reach a certain power, power is only meaningful when/if:

- You know the size of the signal that you need to detect
- You have good estimates of the experimental and response measurement noise (RMSE)

Otherwise, it may be best to use power (and other design diagnostics) comparatively between designs.
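As a rough illustration of why both inputs matter, the sketch below approximates the power to detect a single coefficient in an orthogonal two-level design with factors coded -1/+1. It uses a normal approximation, not JMP's calculation: it ignores the error degrees of freedom, so it is optimistic for small designs. The coefficient and RMSE values are placeholders, not values from this study.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_runs, coefficient, rmse, alpha=0.05):
    """Approximate two-sided power for one coefficient.

    Assumes an orthogonal two-level design with -1/+1 coding, so the
    standard error of a coefficient is rmse / sqrt(n_runs). Uses the
    normal distribution instead of the noncentral t (an approximation).
    """
    std_normal = NormalDist()
    se = rmse / sqrt(n_runs)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(coefficient) / se
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Placeholder signal and noise: replace with your own estimates.
for n in (12, 16, 20, 32):
    print(n, round(approx_power(n, coefficient=1.0, rmse=2.0), 3))
```

Changing the assumed coefficient or RMSE moves every number in the table, which is why power is most useful for comparing candidate designs under the same assumptions.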
Here are some posts related to power questions:
- Should I consider power analysis in DOE?
- Doe and power

Power is only one of the metrics and diagnostics available for your design. I would also strongly recommend comparing the designs by looking at the Color Map on Correlations, to see which effects could be correlated or aliased, and trying to find a design with the best Estimation Efficiency, the lowest prediction variance (Fraction of Design Space Plot & Prediction Variance Profile) and the best D-Optimality Design Diagnostics.
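To show what a correlation map reveals, here is a small Python sketch on a toy example, not one of the designs discussed above: a 4-run half fraction of a 2^3 design in which factor C is deliberately set equal to A*B. It computes the correlations between all main-effect and two-factor-interaction columns and prints the pairs that are correlated.

```python
from itertools import combinations, product
from math import sqrt


def correlation(x, y):
    """Pearson correlation between two equally long numeric columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy 4-run half fraction of a 2^3 design: C is set equal to A*B.
runs = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]
columns = {name: [run[i] for run in runs] for i, name in enumerate("ABC")}

# Add the two-factor interaction columns.
for f1, f2 in combinations("ABC", 2):
    columns[f1 + f2] = [x * y for x, y in zip(columns[f1], columns[f2])]

# Report every pair of model columns with a non-zero correlation.
for n1, n2 in combinations(list(columns), 2):
    r = correlation(columns[n1], columns[n2])
    if abs(r) > 1e-9:
        print(f"{n1} ~ {n2}: |r| = {abs(r):.2f}")
```

In this toy design each main effect is fully aliased with one interaction, the situation to watch for in the smallest run sizes. In a larger or optimal design the same calculation typically gives partial correlations between 0 and 1, which is what the color map displays.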

Hope this answer will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

CurseOfCluster6 · | Posted in reply to message from Victor_G 11-12-2025

Dear @Victor_G ,

Thank you very much for the detailed response!
As @statman also suggested, I will try to remove one level of the three-level categorical factor based on the theoretical properties of the levels.
If this works, the Mixed-Level Screening Design sounds like a very interesting solution, and I will compare multiple variants based on the design diagnostics you mentioned.

louv · | Posted in reply to message from CurseOfCluster6 11-10-2025

Just an added question to the great input already posted. One of your two-level categorical factors is yes/no. Why not just make it a continuous factor where the low level is zero and the high level is whatever the "yes" level is?

Victor_G · Nov 12, 2025 8:34 AM

I totally endorse this proposal, @louv! The "yes/no" categorical factor could be changed to a continuous factor with levels 0 and (max value). With a Mixed-Level Screening design (or similar design approaches), this change of factor type can improve the information gained: a continuous factor in this type of design enables the estimation of the main effect (as with a categorical factor), but also the detection of interaction and quadratic effects (if these effects are large enough).

Even for categorical factors involving chemistry types or molecules, such as the different chemistries for components A and B, there are many ways to go from a categorical space (different molecules, catalyst types, etc.) to a continuous one (molecular descriptors, physical/chemical properties, ...). This option has been discussed in the thread "Efficient DOE of one multi-level (3+) categorical variable and many continuous variables" and in these talks:

- Coding with Continuous and Mixture Variables to Explore More of the Input Space (2022-US-45MP-1103): https://community.jmp.com/t5/Abstracts/Coding-with-Continuous-and-Mixture-Variables-to-Explore-More-... : This talk shows how switching from categorical to continuous factors in the definition of the design space gives a broader inference space in which solutions can be found.
- Increase Efficiency and Model Applicability Domain When Testing Options That Are at First Glance Multilevel Categorical Factors: https://community.jmp.com/t5/Abstracts/Increase-Efficiency-and-Model-Applicability-Domain-When-Testi... : This talk shows how using many descriptors for chemical structures and reducing the dimensionality through PCA can help select continuous factors for a DoE built on a large design space.

Hope these suggestions may help further,

Victor GUILLER

Improvement of (Custom) Screening Design to lower number of experiments