Hi @Sloux ,
As far as I understand, the Response Screening platform only compares a large set of Y vs. X combinations; it does not test Y vs. X1*X1 (quadratic) or Y vs. X1*X2 (interaction) kinds of combinations. So it might not be the right platform for what you are doing. It might help to clarify which part of your data set is the Y (response) and which columns exactly are the Xs.
From what you describe, it sounds like you're actually hunting for the best predictors, and combinations of predictors, for your outcome. If that's the case, you might want to consider a few different approaches. Some of the comments below depend on whether or not your data came from a DOE, and some of the options require JMP Pro.
- You might consider using the Analyze > Screening > Predictor Screening platform to check which predictors matter most. The result can depend somewhat on the number of trees, so you may want to play around with that setting. Even then, you'll want to run bootstrap simulations on the Portion column to get a better grip on which factors really drive the outcome. Note that this approach also doesn't allow for crossed terms.
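If you want to see the idea behind Predictor Screening outside of JMP: it is essentially random-forest variable importance, where each factor's "Portion" is its share of the total importance. Here's a minimal Python sketch on made-up synthetic data (column roles and seeds are arbitrary, just for illustration):

```python
# Sketch of the Predictor Screening idea: fit a random forest and rank
# predictors by their share ("Portion") of total variable importance.
# Synthetic data -- only X1 and X3 actually drive y here.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 300
X = rng.normal(size=(n, 5))          # five candidate predictors X1..X5
y = 3.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.3, size=n)

rf = RandomForestRegressor(n_estimators=500, random_state=1).fit(X, y)
portion = rf.feature_importances_    # normalized to sum to 1, like Portion
ranking = np.argsort(portion)[::-1]  # strongest predictor first
```

In JMP you'd then bootstrap the Portion column; here you'd refit on resampled rows and watch how stable the ranking is.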
- Use the SLS platform (Fit Model with Personality Standard Least Squares), then run bootstrapping on the estimates to see which factors appear most frequently. This approach does let you look at crossed terms. However, unless you have some kind of evidence that a crossed term MUST be in there, it's generally recommended not to include it. (Here is where the DOE matters: if the DOE shows there must be crossed terms, then include them.)
- Use the GenReg platform (Fit Model with Personality Generalized Regression), again with bootstrapping on the estimates to see which factors appear most frequently. This also allows crossed terms, and the same caveat applies: only include them if you have evidence, e.g. from the DOE, that they belong.
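The "bootstrap the estimates" idea from the two bullets above can be sketched in plain Python as well: refit a least-squares model on resampled rows and check which coefficients' bootstrap intervals stay away from zero. Everything here is synthetic and the 95% cutoff is just one common choice:

```python
# Bootstrap least-squares estimates: resample rows, refit, and see which
# terms (including a crossed term) are consistently nonzero.
import numpy as np

rng = np.random.default_rng(7)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
# True model: strong x1 effect, no x2 main effect, real x1*x2 interaction
y = 2.0 * x1 + 1.5 * x1 * x2 + rng.normal(scale=0.5, size=n)

# Design matrix: intercept, main effects, crossed term
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])

boot = []
for _ in range(500):
    idx = rng.integers(0, n, n)                        # resample rows
    b, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    boot.append(b)
boot = np.array(boot)

# Keep a term if its 95% bootstrap interval excludes zero
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
keeps_term = (lo > 0) | (hi < 0)
```

In JMP Pro you get the same flavor of output by right-clicking the Parameter Estimates column and choosing Bootstrap.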
- Use PLS (Fit Model with Personality PLS). This will often help you cut back factors whose Variable Importance falls below 0.8 (the default threshold, which you can change).
- You also might try the Analyze > Specialized Modeling > Functional Data Explorer platform. This depends on whether your data is like a spectrum, i.e. some kind of output measurement at different wavelengths, voltages, etc. You can combine this with the Z, Supplementary optional column role to include a "DOE" kind of aspect in the analysis and get prediction profilers out of it. This can sometimes reduce a large set of wavelengths down to just a few quantities that best predict an outcome.
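The core trick FDE uses for that reduction is functional principal components: treat each row as a curve and summarize it with a few component scores instead of hundreds of wavelength columns. A rough NumPy-only sketch with fake spectra (one peak whose height varies by sample):

```python
# Functional-PCA flavor of what FDE does: reduce many wavelength columns
# to a few principal component scores per sample. Synthetic spectra.
import numpy as np

rng = np.random.default_rng(9)
n_samples, n_wavelengths = 80, 200
grid = np.linspace(0, 1, n_wavelengths)

# Each sample is a Gaussian peak at 0.5 with a random height, plus noise
heights = rng.uniform(1, 3, n_samples)
spectra = heights[:, None] * np.exp(-((grid - 0.5) ** 2) / 0.01)
spectra += rng.normal(scale=0.05, size=spectra.shape)

# PCA via SVD on the mean-centered curves
centered = spectra - spectra.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
scores = U[:, :3] * s[:3]                        # 3 scores per sample
explained = (s[:3] ** 2).sum() / (s ** 2).sum()  # variance captured
```

The handful of score columns then become the Xs (or Ys) in a normal model, which is how the "DOE aspect" gets attached via the supplementary variables.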
- DOE Autovalidation: there is an add-in tool you can get to help with this analysis. It is somewhat similar to bootstrapping the estimates in the other platforms, but works slightly differently.
These are just some general comments on how to approach what it sounds like you're trying to do. Depending on how many rows you have, you should also consider how you want to validate things, e.g. leave-one-out, k-fold, etc. I highly recommend using more than one approach and comparing the results. Because the algorithms differ slightly and use different random seeds, the results will often differ slightly, which can help you make the final call on whether or not to include certain factors and crossed terms.
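To make the validation point concrete, here is a small k-fold sketch comparing a main-effects-only model against one with a crossed term, on synthetic data where the crossed term is actually useless. Model and fold choices are arbitrary illustrations:

```python
# k-fold comparison: keep the crossed term only if it improves
# out-of-fold performance, not just the in-sample fit.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(5)
n = 120
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)   # no real interaction

main = np.column_stack([x1, x2])               # main effects only
crossed = np.column_stack([x1, x2, x1 * x2])   # plus crossed term

cv = KFold(n_splits=5, shuffle=True, random_state=5)
r2_main = cross_val_score(LinearRegression(), main, y, cv=cv).mean()
r2_crossed = cross_val_score(LinearRegression(), crossed, y, cv=cv).mean()
# If r2_crossed isn't meaningfully better, leave the crossed term out.
```

That mirrors the advice above: run the comparison a couple of ways (different folds, different seeds) before making the final call.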
Hope this helps!
DS