I am trying to design an experiment where some factors are conditional, and I am not sure whether this can be handled properly using Custom Design or whether the design should be constructed manually.
I have one factor, X1, with a reference condition at 1, which is also its maximum value. X1 can only be varied below this reference value, but runs at the reference value itself are also of interest.
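For the other two factors, X2 and X3, the structure is conditional: X2 is either OFF or ON and X3 is either absent or present, and each has a continuous level/intensity only while it is active.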
So X2 and X3 are not simple continuous factors, because their continuous values only make sense when the corresponding factor is active/present.
My objective is to understand:
- The effect of X2 and/or X3 being present versus not present
- The effect of changing the level/intensity of X2 and/or X3 when they are active
- How these effects behave at different values of X1
- Whether there are interactions between X1, X2, and X3
One idea I am considering is to treat X2 and X3 as discrete numeric factors, where 0 represents OFF/absent and the non-zero values represent the active continuous range. For analysis, I would then avoid automatic coding/centering of polynomial terms so that the numeric levels are interpreted more directly.
However, I understand that this approach has drawbacks. In particular, the jump from 0 to the first non-zero level may combine two effects:
- the effect of switching the factor ON/present
- the effect of moving to the lowest active intensity/dosage
So it may not cleanly separate the activation/presence effect from the intensity/dosage effect. This could also make interaction terms harder to interpret, especially if X2 and X3 behave differently at different values of X1.
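As a rough sketch of this concern, assume a hypothetical mean model for a single factor with numeric level x, where x = 0 means OFF/absent and x_min is the lowest active level. The symbols beta_0, alpha (activation effect), and gamma (slope within the active range) are illustrative assumptions, not part of any fitted model:

```latex
E[Y \mid x] = \beta_0 + \alpha \, \mathbf{1}[x > 0] + \gamma \, x
\quad \Longrightarrow \quad
E[Y \mid x = x_{\min}] - E[Y \mid x = 0] = \alpha + \gamma \, x_{\min}
```

Under this assumed form, the observed jump from 0 to the first active level is the sum of the activation effect and the dosage effect at x_min, so a model with only numeric terms in x has no separate parameter for alpha.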
A second approach I am currently trying is to represent X2 and X3 using both a categorical activation factor and a discrete numeric level factor:
- one categorical OFF/ON factor plus one discrete numeric level factor for X2
- one categorical absent/present factor plus one discrete numeric level factor for X3
DOE(
    Custom Design,
    {
        Add Response( Maximize, "Y", ., ., . ),
        // Factors are indexed in the order they are added (1 to 5).
        Add Factor( Continuous, -1, 1, "X1_Level", 0 ),
        Add Factor( Discrete Numeric, {0, 1, 2, 3}, "X2_Level", 0 ),
        Add Factor( Categorical, {"Off", "On"}, "X2_Status", 0 ),
        Add Factor( Discrete Numeric, {0, 1, 2, 3}, "X3_Level", 0 ),
        Add Factor( Categorical, {"Absent", "Present"}, "X3_Status", 0 ),
        Set Random Seed( 2055292721 ),
        Number of Starts( 4702 ),
        // Terms are given as {factor index, power}; {1, 0} is the intercept.
        Add Term( {1, 0} ),
        Add Term( {1, 1} ),
        Add Term( {2, 1} ),
        Add Potential Term( {2, 2} ),
        Add Term( {3, 1} ),
        Add Term( {4, 1} ),
        Add Potential Term( {4, 2} ),
        Add Term( {5, 1} ),
        // Two-factor interactions among the three numeric level factors only.
        Add Term( {1, 1}, {2, 1} ),
        Add Term( {1, 1}, {4, 1} ),
        Add Term( {2, 1}, {4, 1} ),
        Set Sample Size( 12 ),
        // Keep each status consistent with its level:
        // Off/Absent only at level 0, On/Present only at levels 1 to 3.
        Disallowed Combinations(
            ("X2_Status"n == "Off" & "X2_Level"n > 0) |
            ("X2_Status"n == "On" & "X2_Level"n == 0) |
            ("X3_Status"n == "Absent" & "X3_Level"n > 0) |
            ("X3_Status"n == "Present" & "X3_Level"n == 0)
        ),
        Simulate Responses( 0 ),
        Save X Matrix( 0 ),
        Make Design
    }
);
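To see what the disallowed combinations leave available, here is a small sketch in plain Python (outside JMP) that enumerates the status/level combinations for X2 and X3 and applies the same four conditions as the script. X1 is left out because it is continuous and unconstrained; the function and variable names are only illustrative.

```python
from itertools import product

x2_levels = [0, 1, 2, 3]
x2_status = ["Off", "On"]
x3_levels = [0, 1, 2, 3]
x3_status = ["Absent", "Present"]


def is_disallowed(l2, s2, l3, s3):
    """Mirror of the four conditions in Disallowed Combinations."""
    return (
        (s2 == "Off" and l2 > 0)
        or (s2 == "On" and l2 == 0)
        or (s3 == "Absent" and l3 > 0)
        or (s3 == "Present" and l3 == 0)
    )


all_combos = list(product(x2_levels, x2_status, x3_levels, x3_status))
feasible = [c for c in all_combos if not is_disallowed(*c)]

# In every feasible run, each status is fully determined by its level.
status_implied_by_level = all(
    (s2 == "On") == (l2 > 0) and (s3 == "Present") == (l3 > 0)
    for l2, s2, l3, s3 in feasible
)

print(len(all_combos), len(feasible), status_implied_by_level)
```

Of the 64 raw combinations, 16 remain, and in each of them the status equals the indicator "level > 0", which is the redundancy discussed next.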
My concern is that the categorical status factor may be redundant, because the OFF/ON or absent/present status is already implied by the numeric level. I am therefore not sure whether this setup can truly separate the activation effect from the level/intensity effect, or whether it introduces collinearity/confounding that makes the model difficult to interpret.
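One way to probe this concern numerically is to check the rank of the model matrix for a single conditional factor in isolation. The sketch below is plain Python, not JMP, and rests on simplifying assumptions: status is coded 0/1, X1 and the other factor are ignored, and only the four feasible (status, level) pairs are used. The helper `matrix_rank` is a hypothetical utility written for this example.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Feasible (status, level) pairs: status 0 forces level 0, status 1 allows 1..3.
feasible = [(0, 0), (1, 1), (1, 2), (1, 3)]

# Columns: intercept, status, level.
linear = [[1, s, x] for s, x in feasible]
# Columns: intercept, status, level, level^2.
quadratic = [[1, s, x, x * x] for s, x in feasible]
# Columns: intercept, status, level, level^2, level^3.
cubic = [[1, s, x, x * x, x ** 3] for s, x in feasible]

rank_linear = matrix_rank(linear)
rank_quadratic = matrix_rank(quadratic)
rank_cubic = matrix_rank(cubic)
print(rank_linear, rank_quadratic, rank_cubic)
```

Under these assumptions the linear and quadratic cases have full column rank (3 and 4), while the cubic case has five columns but rank 4. This suggests that any separation of the status effect from the level effect depends on the assumed polynomial order for the level effect, not on the design points alone.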
My questions are:
- Is it statistically sensible to include both the status factor and the discrete numeric level factor?
- Can this setup meaningfully separate the activation effect from the level/intensity effect?
- Would it be better to use only the discrete numeric factors, with 0 = Off/Absent and 1–3 = active levels?
- Or is there a better way to handle this type of conditional factor structure in JMP Custom Design?