Hi @Fruit325,
When your factors are correlated, building an optimal design (D-, A- or Alias-optimal) that uses the runs from your DoE as covariate runs will help select the most dissimilar and least correlated runs.
Here is an example with 50 data points and only 2 factors (for visualization and understanding purposes), with a (high) correlation of 0.644 between X1 and X2:
You can see that the red points, selected by an Alias-optimal Custom Design (a D-optimal design gives the same result), are chosen to be the most "spread out". This helps reduce the correlation between X1 and X2 and estimate the main effects of X1 and X2 more precisely, despite the correlation.
In this example, choosing a D-optimal or an Alias-optimal design makes no difference to the selected covariate runs. With more factors (as seems to be the case with your DoE datasets), Alias-optimal designs may help prevent correlations between the effects in your model and potential effects not in your model (such as interactions, quadratic or higher-order effects).
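For readers who want to see the selection idea outside JMP, here is a minimal Python sketch. It is only an illustration under assumptions: the 50 candidate points are synthetic (not the data from the example above), the model contains an intercept and the two main effects only, and the search is a simple point-exchange loop that I wrote for clarity, not the algorithm JMP uses. The function names are hypothetical.

```python
import numpy as np

def model_matrix(points):
    # Main-effects model: intercept, X1, X2
    return np.column_stack([np.ones(len(points)), points])

def log_det_information(points):
    # D-criterion: log-determinant of the information matrix X'X
    X = model_matrix(points)
    sign, logdet = np.linalg.slogdet(X.T @ X)
    return logdet if sign > 0 else -np.inf

def select_d_optimal(candidates, n_runs, seed=0):
    # Start from a random subset, then swap runs while the D-criterion improves
    rng = np.random.default_rng(seed)
    n = len(candidates)
    chosen = [int(i) for i in rng.choice(n, size=n_runs, replace=False)]
    start = list(chosen)
    best = log_det_information(candidates[chosen])
    improved = True
    while improved:
        improved = False
        for pos in range(n_runs):
            for cand in range(n):
                if cand in chosen:
                    continue
                trial = chosen.copy()
                trial[pos] = cand
                value = log_det_information(candidates[trial])
                if value > best + 1e-12:
                    chosen, best = trial, value
                    improved = True
    return sorted(chosen), sorted(start)

# Synthetic candidate set: 50 points, X1 and X2 positively correlated (assumed values)
rng = np.random.default_rng(42)
candidates = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.65], [0.65, 1.0]], size=50)

selected, start = select_d_optimal(candidates, n_runs=12)
full_r = np.corrcoef(candidates.T)[0, 1]
subset_r = np.corrcoef(candidates[selected].T)[0, 1]
print(f"correlation in all 50 points:    {full_r:.3f}")
print(f"correlation in 12 selected runs: {subset_r:.3f}")
```

Comparing the two printed correlations, and plotting the selected rows over the full cloud, shows how the criterion favours runs near the edges of the candidate region.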
I would still like to emphasize the need to separate analysis needs from data collection strategy. If you want to analyze the impact of correlated variables on responses, you can do so without selecting or filtering data points, using appropriate modeling options in JMP (Principal Component Analysis, Partial Least Squares ...) or JMP Pro (penalized regression techniques like Lasso, Ridge and Elastic Net under the Generalized Regression Models platform, or Machine Learning models able to handle correlated variables, like Bootstrap Forest).
However, in some cases it is easier to spot patterns in a small, high-quality dataset than in a larger, medium-quality one.
Trying both options may help determine whether the patterns they reveal are similar and whether the two approaches complement each other.
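To illustrate the modeling route (keeping all data points and letting the model deal with the correlation), here is a small sketch of ridge regression in plain NumPy. It is not JMP Pro's Generalized Regression implementation: the data, the true coefficients and the penalty value of 5.0 are all made-up assumptions, and in practice the penalty would be tuned by validation.

```python
import numpy as np

def standardize(X):
    # Center and scale each predictor so the penalty treats them equally
    return (X - X.mean(axis=0)) / X.std(axis=0)

def ridge_fit(X, y, penalty):
    # Closed-form ridge estimate on standardized predictors and centered response;
    # penalty = 0 gives ordinary least squares
    Z = standardize(X)
    yc = y - y.mean()
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + penalty * np.eye(p), Z.T @ yc)

# Synthetic data: two correlated predictors and a noisy linear response (assumed values)
rng = np.random.default_rng(1)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.65], [0.65, 1.0]], size=50)
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=50)

ols = ridge_fit(X, y, penalty=0.0)
ridge = ridge_fit(X, y, penalty=5.0)
print("OLS coefficients:  ", np.round(ols, 3))
print("Ridge coefficients:", np.round(ridge, 3))
```

The ridge coefficients are shrunk toward zero relative to the least-squares ones, which trades a little bias for more stable estimates when predictors are correlated.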
I hope this answer helps,
Victor GUILLER
"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)