Solved: Re: PCA common scale

Jun 11, 2023 4:17 AM

Hello JMP community

Is there a way to perform PCA using a common scale, the way it is possible with K-means clustering?

I ask because common-scale K-means clustering works a lot better for my data, but I want to drop the clustering information and just show the PCA biplot with axes.

Best,

Dan_Obermiller · Sep 20, 2021 10:39 AM

I may not be understanding the question, but PCA will automatically center and scale all variables when you are performing the PCA on the correlations. Thus, all variables are on the same scale.

Dan Obermiller
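As a rough illustration of the point about PCA on correlations, the sketch below checks that standardizing two columns (centering, then dividing by the sample standard deviation) turns their covariance into their correlation. The data and the helper names (`covariance`, `standardize`) are made up for illustration; this is plain Python, not JMP's implementation.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def standardize(xs):
    """Center a column and scale it to unit sample standard deviation."""
    m, s = mean(xs), sqrt(covariance(xs, xs))
    return [(x - m) / s for x in xs]

# Hypothetical toy columns on different scales (not from the sample data).
weight = [150.0, 180.0, 165.0, 200.0, 172.0]
age = [25.0, 40.0, 31.0, 52.0, 38.0]

# Pearson correlation computed from the raw columns.
corr = covariance(weight, age) / sqrt(
    covariance(weight, weight) * covariance(age, age)
)

z_weight, z_age = standardize(weight), standardize(age)

# After standardizing, each column has variance 1 and the covariance
# between the two columns equals the correlation of the raw columns.
print(covariance(z_weight, z_weight))
print(covariance(z_weight, z_age), corr)
```

So a PCA on the correlation matrix is the same decomposition as a covariance PCA on standardized columns, which is why every variable ends up on the same scale.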

ezorlo · Oct 25, 2021 05:00 AM

Hi, the scaling in PCA is the same as in K-means when the "Columns Scaled Individually" option is selected.

But I can't find a way to perform common scaling with PCA the way K-means does when the "Columns Scaled Individually" option is not selected.

The reason I need this is that the K-means biplot does not show the variance explained by each axis. If anyone knows how to get the K-means plot to display that value, I would appreciate it!

Dan_Obermiller · Oct 25, 2021 5:41 AM

From the JMP Help on K-Means:

Columns Scaled Individually

Scales each column independently of the other columns. Use when variables do not share a common measurement scale, and you do not want one variable to dominate the clustering process. For example, one variable might have values that are between 0 and 1000, and another variable might have values between 0 and 10. In this situation, you can use the option so that the clustering process is not dominated by the first variable.

If you do NOT scale them individually, then you are not scaling the variables.

If you want to reproduce that in Principal Components, run the principal components On Unscaled (available from the red popup menu for Principal Components). However, keep in mind that all of your variables should be on a similar scale for these results to make sense.

Dan Obermiller
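To make the "one variable dominates" point from the Help text concrete, here is a small sketch that measures how much of the squared Euclidean distance between two rows comes from the first column, before and after dividing each column by its own standard deviation. The data and the helper `column_share` are hypothetical, and the scaling shown is a common choice rather than a statement of exactly what JMP does internally.

```python
def sample_std(xs):
    """Sample standard deviation with the n - 1 divisor."""
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

def column_share(a, b, j):
    """Fraction of the squared distance between rows a and b due to column j."""
    squared_diffs = [(x - y) ** 2 for x, y in zip(a, b)]
    return squared_diffs[j] / sum(squared_diffs)

# Hypothetical rows: column 0 spans roughly 0-1000, column 1 roughly 0-10.
rows = [(100.0, 1.0), (900.0, 2.0), (120.0, 9.0), (880.0, 8.5)]

# Scale every column by its own standard deviation.
stds = [sample_std(col) for col in zip(*rows)]
scaled = [tuple(v / s for v, s in zip(row, stds)) for row in rows]

# Rows 0 and 3 differ across most of the range in both columns.
raw_share = column_share(rows[0], rows[3], 0)
scaled_share = column_share(scaled[0], scaled[3], 0)

print(raw_share)     # column 0 accounts for nearly all of the raw distance
print(scaled_share)  # after scaling, both columns contribute comparably
```

Without individual scaling, the distances that K-means works with are driven almost entirely by the wide-range column, which is only reasonable when the variables already share a measurement scale.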

ih · Oct 25, 2021 09:37 AM

To add to @Dan_Obermiller's comment: I believe K Means does center the columns and just does not scale them when using that option. So to replicate the analysis with PCA, you would want to center the columns first and then run an unscaled PCA. If you run the script below, the bivariate score plots should match.

Names default to here(1);

dt = Open( "$Sample_data/Lipid Data.jmp" );

//Center columns (subtract each column's mean) so the unscaled PCA below is run on centered data
dt << New Column("Age Centered", Numeric, "Continuous", Format("Best", 10), Formula(:Age - Col Mean(:Age)));
dt << New Column("Weight Centered", Numeric, "Continuous", Format("Best", 10), Formula(:Weight - Col Mean(:Weight)));
dt << New Column("Cholesterol Centered", Numeric, "Continuous", Format("Best", 10), Formula(:Cholesterol - Col Mean(:Cholesterol)));
dt << New Column("Triglycerides Centered", Numeric, "Continuous", Format("Best", 10), Formula(:Triglycerides - Col Mean(:Triglycerides)));

//Window with both analyses:
New Window( "Compare",
	H List Box(
		//K Means common scale:
		dt << K Means Cluster(
			Y( :Age, :Weight, :Cholesterol, :Triglycerides ),
			Columns Scaled Individually( 0 ),
			{Single Step( 0 ), Number of Clusters( 3 ), K Means Cluster,
			Go( Show Biplot Rays( [0, 0, 1] ), Biplot( 1 ) )}
		),
		//PCA unscaled using centered columns
		dt << Principal Components(
			Y( :Age Centered, :Weight Centered, :Cholesterol Centered, :Triglycerides Centered ),
			Estimation Method( "Default" ),
			"on Unscaled"
		)
	)
);

ezorlo · Nov 22, 2021 04:26 AM

Thanks for this. I aspire to the day when scripting feels more natural than the GUI, but here I am using JMP more than R...

ezorlo · Nov 22, 2021 04:21 AM

From the JMP user manual:

Apparently, PCA on the covariance matrix is the same as the common scale in K-means clustering.

ezorlo · Nov 22, 2021 04:23 AM

PCA starts with the correlation matrix by default. If the covariance matrix is chosen instead, the PCA biplot matches the K-means cluster biplot (common scale), and the variance explained by each axis can be determined.
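A small numeric sketch of why the covariance option agrees with the "center first, then unscaled" approach, and of where the per-axis variance comes from. It assumes an unscaled PCA decomposes the uncentered cross-product matrix (up to a constant divisor), and it uses two hypothetical columns so the eigenvalues have a closed form. The helper names are illustrative, not JMP functions.

```python
from math import sqrt

def center(xs):
    """Subtract the column mean from every value."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def cross_product(xs, ys, divisor):
    """Sum of products divided by `divisor`; no centering happens here."""
    return sum(x * y for x, y in zip(xs, ys)) / divisor

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def eigenvalues_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], largest first."""
    half_trace = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + b ** 2)
    return half_trace + radius, half_trace - radius

# Hypothetical values, not taken from the Lipid Data sample table.
chol = [180.0, 220.0, 195.0, 260.0, 210.0]
trig = [90.0, 150.0, 110.0, 240.0, 130.0]
n = len(chol)

# Covariance matrix entries (var, cov, var).
cov = (sample_cov(chol, chol), sample_cov(chol, trig), sample_cov(trig, trig))

# Uncentered cross-products computed on pre-centered columns.
c, t = center(chol), center(trig)
unscaled = (
    cross_product(c, c, n - 1),
    cross_product(c, t, n - 1),
    cross_product(t, t, n - 1),
)

# Variance explained by each axis: eigenvalue divided by total variance.
eigs = eigenvalues_2x2(*cov)
explained = [e / sum(eigs) for e in eigs]

print(cov, unscaled)  # the two matrices agree entry by entry
print(explained)      # proportions of variance for the two components
```

Because the two matrices are identical, their eigenvectors (the biplot axes) and eigenvalues are too, and the eigenvalue proportions are the variance-explained figures that the K-means biplot does not report.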
