I am exploring the best ways to present multivariate data for further exploration. Experience has shown it can be difficult for people to grasp what scores and loadings plots in PCA are telling them and how similar different clusters are to each other. I can get T2 values from the multivariate control chart platform (after specifying a training group), and the critical value using the formula in the documentation.
I would like to colour the points or change the markers to highlight different groups. It is straightforward to do this for the training set and for points that fall above or below the critical value.
I would like to use the clustering platform to group the points, with the requirement that all the points in the training set are grouped into a single cluster. Is there a smart way to do this, other than running the clustering platform repeatedly with an increasing number of clusters until fewer than 95 % of the points in the training group fall in the same cluster?
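To make the brute-force approach concrete, here is a rough sketch of what I mean, written in Python rather than JSL just to show the logic. Everything in it is an assumption for illustration: the function names are made up, the toy data is invented, and plain single-linkage clustering stands in for whatever method the clustering platform would actually use. Because a hierarchical tree is nested, the share of training points in the largest cluster can only go down as the number of clusters goes up, so the loop can stop at the first failure.

```python
import math
from collections import Counter
from itertools import combinations


def single_linkage_labels(points, k):
    """Merge the two closest clusters until k remain; return one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: distance between clusters is the closest pair of members.
            d = min(math.dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]  # safe: combinations guarantees a < b
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def training_share(labels, training_idx):
    """Fraction of training points that sit in the most common training cluster."""
    counts = Counter(labels[i] for i in training_idx)
    return counts.most_common(1)[0][1] / len(training_idx)


def max_clusters_keeping_training(points, training_idx, min_share=0.95):
    """Largest k for which at least min_share of the training points share a cluster."""
    best_k = 1
    for k in range(2, len(points) + 1):
        labels = single_linkage_labels(points, k)
        if training_share(labels, training_idx) < min_share:
            break  # the share cannot recover at larger k in a nested tree
        best_k = k
    return best_k


# Invented toy data: four tight training points, two looser points, one far point.
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (5, 5), (5.5, 5), (10, 0)]
training_idx = [0, 1, 2, 3]
best_k = max_clusters_keeping_training(points, training_idx)
print(best_k, single_linkage_labels(points, best_k))
```

This is exactly the repeated-run approach I would like to avoid doing by hand, so a way to read the same answer directly off the clustering results would be ideal.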
What's the best way to store the different colour layouts (pass/fail, clusters, known groups, etc.) so that I can switch between them using some JSL?
Does anyone have any other ideas on how to explore or present the data (besides parallel plots, star plots, scatter plots or control charts of the principal components)?