
Factor Analysis - Which combination to use?

anoopchengara

Community Trekker

Joined:

Aug 9, 2013

Hello,

 

When running Factor Analysis on the Principal Components, which combination should I use for the Factoring Method (Principal Components or Maximum Likelihood) and the Prior Communality (Principal Components or Common Factor Analysis)? The default is Maximum Likelihood with Common Factor Analysis, but I would like to know how to interpret the factors derived from the other combinations.

 

In my data set, I found no difference in the results of the factors between the choice of Maximum Likelihood/Principal Components and Maximum Likelihood/Common Factor Analysis, but the other two combinations give results different from each other and from this combination.

 

I would like to have an explanation of the significance (or lack of) of the factors obtained from the other choices of the Factoring Method and Prior Communality.

 

Thank you

Anoop

1 ACCEPTED SOLUTION

Accepted Solutions
di_michelson

Staff

Joined:

Sep 10, 2014

Solution

Factor analysis tries to fit the model Y = XB + E where you only know Y. X, B, and E are all unknown. Therefore, many key conditions have to be imposed on the parameters. Factor analysis involves the decomposition of the R–U matrix, where R is the correlation matrix of the manifest variables (Y) and U is the diagonal matrix of unique variances, that is, the covariance matrix of the unique factors (E).
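
To make the role of R–U concrete, here is a short sketch of where it comes from. It assumes the usual textbook conditions, which are not spelled out above: the manifest variables are standardized, the common factors (the columns of X) are uncorrelated with unit variance, and the unique factors E are uncorrelated with X and with each other, so that U is diagonal.

```latex
\begin{aligned}
Y &= XB + E \\
R = \operatorname{Corr}(Y)
  &= B^{\top}\operatorname{Cov}(X)\,B + \operatorname{Cov}(E) \\
  &= B^{\top}B + U \\
\Longrightarrow\quad R - U &= B^{\top}B
\end{aligned}
```

Under those assumptions, decomposing R–U is a way of recovering the loadings B, and the diagonal of R–U holds the communalities.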

 

Principal factor analysis performs an eigenvalue decomposition of the R–U matrix. It is not PCA! Maximum likelihood is an iterative method that maximizes the likelihood function given the factors.

 

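SMC (squared multiple correlation, the Common Factor Analysis option) uses the R-square of each variable regressed on all the others as an estimate of the corresponding diagonal element of R–U. PC (the Principal Components option) uses 1 on the diagonal of R–U.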
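
The two prior communality choices can be illustrated with a small NumPy sketch. This is not JMP's implementation, only a minimal non-iterated principal factoring under the assumption that the SMC of variable i equals 1 − 1/(R⁻¹)ᵢᵢ. The function names and the toy correlation matrix are made up for illustration.

```python
import numpy as np

def prior_communalities(R, method="smc"):
    """Return the values placed on the diagonal of the reduced matrix R - U."""
    if method == "smc":
        # Squared multiple correlation of each variable with all the others
        return 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    if method == "pc":
        # Principal components prior: keep ones on the diagonal
        return np.ones(R.shape[0])
    raise ValueError("method must be 'smc' or 'pc'")

def principal_factor_loadings(R, n_factors, method="smc"):
    """Eigen-decompose the reduced correlation matrix and scale to loadings."""
    reduced = np.array(R, dtype=float)
    np.fill_diagonal(reduced, prior_communalities(R, method))
    eigvals, eigvecs = np.linalg.eigh(reduced)        # ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]     # largest first
    vals = np.clip(eigvals[order], 0.0, None)         # guard against negatives
    return eigvecs[:, order] * np.sqrt(vals)

# Toy correlation matrix (hypothetical values, for illustration only)
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

smc = prior_communalities(R, "smc")
L_smc = principal_factor_loadings(R, 1, "smc")
L_pc = principal_factor_loadings(R, 1, "pc")
print("SMC priors:", smc)
print("Loadings with SMC prior:", L_smc.ravel())
print("Loadings with PC prior: ", L_pc.ravel())
```

With the PC prior the reduced matrix is R itself, so the result matches the scaled eigenvectors of R. With the SMC prior the diagonal is smaller than 1, which is why the two sets of loadings differ.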

 

This point is key: since each of these methods is valid mathematically, they are all correct. The researcher needs to use subject matter knowledge in addition to the graphs and statistics from the analysis to determine the factors and their interpretability. The factor analysis model assumes that there are unmeasurable factors causing the manifest variables, and the model which gives the best interpretation is the most useful model, the winner.

 

Please let me know if I can provide more detail to help your understanding of factor analysis.

 

Small advertisement: for more information, JMP offers a one-day course in unsupervised learning, which covers PCA, factor analysis, and clustering. It will be taught in two half-days via Live Web in June. See jmp.com/training for more info.

4 REPLIES
markbailey

Staff

Joined:

Jun 23, 2011

Have you read the explanations of the methods in Help > Books > Multivariate guide book already?

Learn it once, use it forever!

anoopchengara

Community Trekker

Joined:

Aug 9, 2013

Hello Ms. Michelson

 

Thank you very much for your reply. I understand it better now. Just as a clarification, is it the eigenvalue decomposition of the R minus U matrix?

 

As a follow up question, in performing this operation on another data set, ML failed because the correlation matrix was singular and JMP recommended using PC as the factoring method, which succeeded. I then relaunched the operation with some other variables and the correlation matrix became non-singular. Unfortunately, I do not recall the combination of variables that made the correlation matrix singular. Are there any mathematical rules for choosing the correct variables to find the Principal Components of? 

 

I will also sign up for the course in June.

 

Thank you

Anoop

di_michelson

Staff

Joined:

Sep 10, 2014

I'm glad the explanation made sense, and I look forward to seeing you in the course. Yes, the loadings are found from the eigenvalue decomposition of the R–U (R minus U) matrix. The maximum likelihood procedure with SMC starts with those Rsquares on the diagonal of that matrix, then it iterates to maximize the likelihood function. In your example, the iteration did not converge. The principal factoring method starts with the full R matrix, with ones on the diagonal.

 

It is important in factor analysis to choose the manifest variables carefully. You need the factors to be interpretable, so you want to choose manifest variables that will support interpretation of the factors. There is not a mathematical rule to do this, only subject matter knowledge. However, you can always use the Multivariate platform to find the correlation matrix. Perfectly correlated variables will have a correlation of 1 (or −1), and one of the pair can be removed.
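
Outside JMP, the same check can be sketched in a few lines of NumPy. The helper name, tolerance, and simulated data below are assumptions for illustration only. The sketch also looks at the smallest eigenvalue, because a correlation matrix is also singular when one variable is an exact linear combination of several others, even though no single pair has a correlation of ±1.

```python
import numpy as np

def diagnose_correlation(R, names, tol=1e-8):
    """Return (pairs with |r| close to 1, whether R is numerically singular)."""
    R = np.asarray(R, dtype=float)
    p = len(names)
    pairs = [(names[i], names[j])
             for i in range(p) for j in range(i + 1, p)
             if abs(R[i, j]) > 1.0 - tol]
    # eigvalsh returns eigenvalues in ascending order; a value near 0 means singular
    smallest_eigenvalue = np.linalg.eigvalsh(R)[0]
    return pairs, bool(smallest_eigenvalue < tol)

rng = np.random.default_rng(0)
a, b, z = rng.normal(size=(3, 200))   # simulated data, illustration only

# Case 1: y is an exact linear function of a, so the pair (a, y) is flagged
R1 = np.corrcoef(np.column_stack([a, 2 * a + 1, z]), rowvar=False)
pairs1, singular1 = diagnose_correlation(R1, ["a", "y", "z"])

# Case 2: w = a + b, so no pair is perfectly correlated but R is still singular
R2 = np.corrcoef(np.column_stack([a, b, a + b]), rowvar=False)
pairs2, singular2 = diagnose_correlation(R2, ["a", "b", "w"])

print(pairs1, singular1)
print(pairs2, singular2)
```

In the second case, dropping any one of the three linearly dependent variables makes the matrix non-singular again. Which one to drop is still a subject matter decision.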