ezorlo
Level IV

PCA common scale

Hello JMP community

Is there a way to perform PCA on a common scale, the way it is possible with K-means clustering?

I ask because common-scale K-means clustering works much better for my data, but I want to drop the clustering and just show the PCA biplot with axes.

Best,

Accepted Solution
ezorlo
Level IV

Re: PCA common scale

From the JMP user manual: apparently, PCA on the covariance matrix is equivalent to the common scale option in K-means clustering.

7 Replies

Re: PCA common scale

I may not be understanding the question, but PCA automatically centers and scales all variables when it is performed on the correlations. Thus, all variables are already on the same scale.

Dan Obermiller
ezorlo
Level IV

Re: PCA common scale

Hi, the scaling in PCA is the same as in K-means when the "Columns Scaled Individually" option is selected.

But I can't find a way to perform common scaling with PCA the way K-means does when the "Columns Scaled Individually" option is not selected.

I need this because the K-means biplot does not show the variance explained by each axis. If anyone knows how to get the K-means plot to display that value, I would appreciate it!

Re: PCA common scale

From the JMP Help on K-Means:

Columns Scaled Individually

Scales each column independently of the other columns. Use when variables do not share a common measurement scale, and you do not want one variable to dominate the clustering process. For example, one variable might have values that are between 0 and 1000, and another variable might have values between 0 and 10. In this situation, you can use the option so that the clustering process is not dominated by the first variable.

 

If you do NOT scale them individually, then you are not scaling the variables.

 

If you want to reproduce that in Principal Components, choose principal components on Unscaled (available from the red popup menu in Principal Components). However, keep in mind that all of your variables should be on a similar scale for these results to make sense.

Dan Obermiller
ih
Super User (Alumni)

Re: PCA common scale

To add to @Dan_Obermiller's comment: I believe K Means still centers the columns when "Columns Scaled Individually" is turned off; it just does not scale them. So to replicate the analysis with PCA, you would center the columns first and then run an unscaled PCA. If you run the script below, the bivariate score plots should match.

 

Names default to here(1);

dt = Open( "$Sample_data/Lipid Data.jmp" );

//Center columns
dt << New Column("Age Centered", Numeric, "Continuous", Format("Best", 10), Formula(:Age - Col Mean(:Age)));
dt << New Column("Weight Centered", Numeric, "Continuous", Format("Best", 10), Formula(:Weight - Col Mean(:Weight)));
dt << New Column("Cholesterol Centered", Numeric, "Continuous", Format("Best", 10), Formula(:Cholesterol - Col Mean(:Cholesterol)));
dt << New Column("Triglycerides Centered", Numeric, "Continuous", Format("Best", 10), Formula(:Triglycerides - Col Mean(:Triglycerides)));

//Window with both analyses:
New Window( "Compare",
	H List Box(
		//K Means common scale:
		dt << K Means Cluster(
			Y( :Age, :Weight, :Cholesterol, :Triglycerides ),
			Columns Scaled Individually( 0 ),
			{Single Step( 0 ), Number of Clusters( 3 ), K Means Cluster,
			Go( Show Biplot Rays( [0, 0, 1] ), Biplot( 1 ) )}
		),
		//PCA unscaled using centered columns
		dt << Principal Components(
			Y( :Age Centered, :Weight Centered, :Cholesterol Centered, :Triglycerides Centered ),
			Estimation Method( "Default" ),
			"on Unscaled"
		)
	)
);
ezorlo
Level IV

Re: PCA common scale

Thanks for this. I aspire to the day when scripting feels more natural than the GUI, but here I am using JMP more than R...

ezorlo
Level IV

Re: PCA common scale

PCA starts from the correlation matrix. If the covariance matrix is chosen instead, the PCA biplot matches the common-scale K-means cluster biplot, and the variance explained by each axis can be determined.
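
For anyone who prefers a script, the covariance-based PCA described here could be run along these lines. This is a minimal sketch rather than something posted in the thread: it reuses the Lipid Data sample table and the column names from the earlier script, and it assumes the option is written "on Covariances", by analogy with the "on Unscaled" argument used there. Check the Scripting Index if your JMP version spells it differently.

```jsl
Names Default To Here( 1 );

dt = Open( "$SAMPLE_DATA/Lipid Data.jmp" );

// PCA on the covariance matrix: columns are centered but not rescaled,
// which is the setting reported above to match K Means with
// Columns Scaled Individually( 0 ).
// Assumption: the "on Covariances" spelling follows the menu item name,
// mirroring "on Unscaled" in the earlier script.
dt << Principal Components(
	Y( :Age, :Weight, :Cholesterol, :Triglycerides ),
	Estimation Method( "Default" ),
	"on Covariances"
);
```

Because the covariance option handles the centering, no manually centered columns should be needed with this approach.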