I recently started reading about principal component analysis (PCA). I have a large dataset (millions of rows) that I am trying to explain with 73 different variables, and I would like to use PCA to reduce the number of variables.
I ran the analysis and checked the scree plot, from which I concluded that I can use 8 components. I then used "save principal components" from the red triangle menu and entered this value, so the results are now saved in my dataset.
What I could not understand (because different sources say different things) is where I can see which variables make up each component. On the principal components page I can view the "formatted loading matrix". Is this the output I should look at? (I believe this is the one suggested in the SAS report.)
On the other hand, a colleague advised me to run factor analysis afterwards and choose principal components again there. But then I would have to decide the number of components again. If that is the case, how can I view these results and adapt them to my study? Also, some of my variables are correlated with each other. Which oblique rotation should I go with, and how can I decide?
The formatted loading matrix helps you understand the correlation patterns between the original variables and the PCs. You can use the sliders to dim or suppress the loading values.
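To make the idea of a loading concrete outside of JMP, here is a minimal NumPy sketch. It is not JMP's implementation: the toy data, the variable names, and the 0.4 cutoff are all assumptions made up for illustration. It runs PCA on the correlation matrix, computes the loadings as eigenvectors scaled by the square root of their eigenvalues, checks that each loading equals the correlation between a variable and a component score, and then blanks out small loadings, roughly what the suppress slider does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 500 rows, 6 variables driven by 2 hidden signals.
n, p = 500, 6
hidden = rng.normal(size=(n, 2))
X = np.column_stack(
    [hidden[:, j // 3] + 0.4 * rng.normal(size=n) for j in range(p)]
)

# Standardize, then run PCA on the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component scores and loadings (rows = variables, columns = components).
scores = Z @ eigvecs
loadings = eigvecs * np.sqrt(eigvals)

# Each loading is the correlation between one variable and one component.
corr_check = np.corrcoef(Z, scores, rowvar=False)[:p, p:]
assert np.allclose(loadings, corr_check)

# Blank out small loadings to make the pattern easier to read.
cutoff = 0.4                               # assumed threshold, adjust freely
formatted = np.where(np.abs(loadings) >= cutoff, np.round(loadings, 2), np.nan)
print(formatted)
```

Reading down a column of the formatted matrix shows which variables load strongly on that component, which is the grouping the question asks about.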
(a) The number of factors recommended by JMP, shown in the Factor Analysis launch dialog window, is based on the eigenvalue > 1 rule. Of course, you can increase or decrease it as needed. (b) People typically use Varimax for orthogonal rotation, and Quartimin (or Obquartimax) for oblique rotation. Do you want the rotated factors to remain uncorrelated, or would an oblique rotation lead to more interpretable factors?
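For readers who want to see both points in code, here is a rough NumPy sketch under stated assumptions. The toy data and the helper names `varimax` and `varimax_criterion` are hypothetical, and this is a common SVD-based textbook version of Varimax, not necessarily what JMP runs internally. It counts the eigenvalues of the correlation matrix that exceed 1, keeps that many components, and applies an orthogonal Varimax rotation to their loadings. Oblique rotations such as Quartimin are not shown.

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a loading matrix L (variables x factors)."""
    p, k = L.shape
    T = np.eye(k)                      # rotation matrix, starts as identity
    d = 0.0
    for _ in range(max_iter):
        LR = L @ T
        grad = L.T @ (LR ** 3 - LR @ np.diag((LR ** 2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt                     # nearest orthogonal matrix to the gradient
        d_new = s.sum()
        if d_new < d * (1 + tol):      # stop when improvement is negligible
            break
        d = d_new
    return L @ T, T

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return np.sum(np.var(L ** 2, axis=0))

rng = np.random.default_rng(1)

# Hypothetical toy data: 6 variables, 3 per hidden signal.
n = 500
hidden = rng.normal(size=(n, 2))
X = np.column_stack(
    [hidden[:, j // 3] + 0.4 * rng.normal(size=n) for j in range(6)]
)

R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Eigenvalue > 1 rule: keep components that explain more than one variable.
n_factors = int(np.sum(eigvals > 1.0))

unrotated = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
rotated, T = varimax(unrotated)

print("factors kept:", n_factors)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each variable's communality (its row sum of squared loadings) is unchanged. Only the way that variance is split across factors changes, which is what usually makes the rotated loadings easier to interpret.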