Hi all!
I've run a PLS-DA analysis on a data set of FTIR spectra from 5 different systems under 5 different treatments. I used the Fit Model platform so that I could use my categorical variables directly as Y responses. According to the PRESS statistic, the optimum number of factors for this model is 7, and it shows good discriminating power between my samples.
Then I noticed that the statistics the platform reports are all cross-validation based (PRESS, van der Voet, etc.). For internal purposes, I need to run a permutation test to check the statistical significance of the model, i.e. to rule out that its discriminating power is just due to random chance. Ideally, I'd like to construct a graph like the one attached (image credit: Daniloski et al., 2022).
I've tried to do this via Multivariate Methods > Partial Least Squares, filling the validation box with a validation column I created using the "Make K fold validation column" option under Predictive Modeling (in this case, I replaced my categorical variables with continuous dummy variables so that the platform could handle them). Two things came up. First, the number of statistically significant factors dropped from 7 to 5, and the discriminating capacity decreased substantially. Second, I still don't know how to make the graph from Daniloski et al., 2022.
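To be concrete about the graph I'm after: it is a histogram of the test statistic from the permuted models with the real model's value marked on the same axis. A minimal sketch with matplotlib, using made-up numbers in place of real permutation results (the Q² values and file name below are placeholders):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
null_q2 = rng.normal(loc=-0.1, scale=0.15, size=200)  # stand-in for permuted Q2 values
observed_q2 = 0.85                                    # stand-in for the real model's Q2

fig, ax = plt.subplots()
ax.hist(null_q2, bins=20, color="grey", edgecolor="white", label="permuted models")
ax.axvline(observed_q2, color="red", linestyle="--", label="observed model")
ax.set_xlabel("Q\u00b2")
ax.set_ylabel("Frequency")
ax.legend()
fig.savefig("permutation_plot.png", dpi=150)
```

If the observed line sits well outside the permuted histogram, the model's discrimination is unlikely to be due to chance.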
I'd greatly appreciate it if someone could shed some light on how to do this, ideally while keeping the 7 factors (i.e. building the model via the Fit Model platform with categorical variables), since that model discriminated better between my samples.
Thanks in advance.
Mariana