Factor Analysis - Which combination to use?

Feb 22, 2017 9:16 AM

Hello,

In doing Factor Analysis on the Principal Components, which combination should I use for the **Factoring Method** (*Principal Components* or *Maximum Likelihood*) and **Prior Communality** (*Principal Components* or *Common Factor Analysis*)? The default is *Maximum Likelihood* with *Common Factor Analysis*, but I would like to know how to interpret the factors derived from the other combinations.

In my data set, I found no difference in the factors between *Maximum Likelihood/Principal Components* and *Maximum Likelihood/Common Factor Analysis*, but the other two combinations give results that differ from each other and from this combination.

I would like an explanation of the significance (or lack thereof) of the factors obtained from the other choices of **Factoring Method** and **Prior Communality**.

Thank you

Anoop

1 ACCEPTED SOLUTION

Feb 24, 2017 5:34 AM

Factor analysis tries to fit the model Y = XB + E, where only Y is known; X (the factors), B (the loadings), and E (the unique factors) are all unknown. Therefore, several key conditions have to be imposed on the parameters. Factor analysis involves the decomposition of the R–U matrix, where R is the correlation matrix of the manifest variables (Y) and U is the diagonal matrix holding the variances of the unique factors (E).
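To make the role of R–U concrete, here is one way to write out the usual conditions. This is a sketch under standard textbook assumptions that the post does not spell out: the manifest variables are standardized, the factors are uncorrelated with unit variance, and the unique factors are uncorrelated with the factors and with each other.

```latex
\[
Y = XB + E
\quad\Longrightarrow\quad
R = B^{\top}B + U
\quad\Longrightarrow\quad
R - U = B^{\top}B ,
\qquad U = \operatorname{diag}(u_1,\dots,u_p).
\]
```

Because U is diagonal under these assumptions, R–U differs from R only on its diagonal, and the diagonal entries 1 − u_j are the communalities. The prior communality choice is a starting guess for those diagonal entries.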

Principal factor analysis does eigenvalue decomposition of the R–U matrix. It is not PCA! Maximum likelihood is an iterative method that maximizes the likelihood function given the factors.

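SMC (squared multiple correlation, the *Common Factor Analysis* choice for **Prior Communality**) uses the Rsquare from regressing each variable on all the others as the estimate of the corresponding diagonal element of R–U. PC (the *Principal Components* choice) puts 1 on the diagonal of R–U.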
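The two diagonal choices, followed by a principal-factor extraction, can be sketched in NumPy as below. This is an illustration only, not JMP's implementation: the function names and the small correlation matrix are made up for the example, and the SMC shortcut `1 - 1/diag(inv(R))` assumes R is nonsingular.

```python
import numpy as np

def prior_communalities(R, method="smc"):
    """Diagonal of the reduced matrix R - U for two prior choices (hypothetical helper)."""
    R = np.asarray(R, dtype=float)
    if method == "smc":
        # Rsquare of each variable regressed on all the others:
        # SMC_j = 1 - 1 / (R^{-1})_jj, valid only when R is invertible.
        return 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    if method == "pc":
        # Ones on the diagonal, i.e. the full correlation matrix is decomposed.
        return np.ones(R.shape[0])
    raise ValueError(f"unknown method: {method!r}")

def principal_factor_loadings(R, n_factors, method="smc"):
    """Loadings from the eigenvalue decomposition of the reduced matrix."""
    reduced = np.array(R, dtype=float)  # copy so the caller's R is untouched
    np.fill_diagonal(reduced, prior_communalities(R, method))
    eigvals, eigvecs = np.linalg.eigh(reduced)
    top = np.argsort(eigvals)[::-1][:n_factors]
    # Negative eigenvalues can occur with SMC priors; clip them to zero.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

# Made-up correlation matrix for three manifest variables.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

smc = prior_communalities(R, "smc")
loadings_smc = principal_factor_loadings(R, 1, "smc")
loadings_pc = principal_factor_loadings(R, 1, "pc")
```

Comparing `loadings_smc` with `loadings_pc` shows how the same factoring method gives different loadings once the diagonal changes, which is the kind of difference described in the original question.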

This point is key: since each of these methods is valid mathematically, they are all correct. The researcher needs to use subject matter knowledge, in addition to the graphs and statistics from the analysis, to determine the factors and their interpretability. The factor analysis model assumes that there are unmeasurable factors causing the manifest variables, and the model that gives the best interpretation is the most useful model, the winner.

Please let me know if I can provide more detail to help your understanding of factor analysis.

Small advertisement: for more information, JMP offers a one-day course in unsupervised learning, which covers PCA, factor analysis, and clustering. It will be taught in two half-days via Live Web in June. See jmp.com/training for more info.

4 REPLIES

Feb 22, 2017 9:37 AM

Have you read the explanations of the methods in **Help** > **Books** > **Multivariate** guide book already?

Learn it once, use it forever!

Feb 24, 2017 8:31 AM

Hello Ms. Michelson

Thank you very much for your reply. I understand it better now. Just to clarify: is it the eigenvalue decomposition of the R minus U matrix?

As a follow-up question: when I performed this operation on another data set, ML failed because the correlation matrix was singular, and JMP recommended using PC as the factoring method, which succeeded. I then relaunched the operation with some other variables and the correlation matrix became non-singular. Unfortunately, I do not recall the combination of variables that made the correlation matrix singular. Are there any mathematical rules for choosing the correct variables on which to run Principal Components?

I will also sign up for the course in June.

Thank you

Anoop

Feb 24, 2017 9:00 AM

I'm glad the explanation made sense, and I look forward to seeing you in the course. Yes, the loadings are found from the eigenvalue decomposition of the R–U (R minus U) matrix. The maximum likelihood procedure with SMC starts with those Rsquares on the diagonal of that matrix, then it iterates to maximize the likelihood function. In your example, the iteration did not converge. The principal factoring method starts with the full R matrix, with ones on the diagonal.

It is important in factor analysis to choose the manifest variables carefully. You need the factors to be interpretable, so you want to choose manifest variables that will support interpretation of the factors. There is no mathematical rule for doing this, only subject matter knowledge. However, you can always use the Multivariate platform to find the correlation matrix. Perfectly correlated variables will have a correlation of 1 (or −1), and one of each such pair can be removed.
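Outside JMP, the same check can be sketched in NumPy. This is a hedged illustration: the function name, the tolerance, and the simulated data are assumptions made for the example, not part of any JMP workflow. It lists the perfectly correlated pairs and also reports the smallest eigenvalue of the correlation matrix, since a value near zero flags singularity even when the dependency involves three or more variables and no single pair has a correlation of ±1.

```python
import numpy as np

def find_collinearity(data, names, tol=1e-8):
    """Return perfectly correlated column pairs and the smallest eigenvalue of R.

    data: 2-D array with observations in rows and variables in columns.
    names: labels for the columns (hypothetical, for reporting only).
    """
    R = np.corrcoef(data, rowvar=False)
    p = R.shape[0]
    pairs = [(names[i], names[j])
             for i in range(p) for j in range(i + 1, p)
             if abs(R[i, j]) > 1.0 - tol]
    # eigvalsh returns eigenvalues in ascending order; near zero means singular R.
    smallest = np.linalg.eigvalsh(R)[0]
    return pairs, smallest

# Simulated example: column "d" is an exact linear function of column "a".
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
data = np.column_stack([x, 2.0 * x[:, 0] + 1.0])
names = ["a", "b", "c", "d"]

pairs, smallest = find_collinearity(data, names)
print(pairs, smallest)
```

Dropping one variable from each reported pair and rerunning the check until the smallest eigenvalue is clearly positive gives a correlation matrix that an iterative method can work with.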