PCA Model Validation Techniques

stephen_pearson

Community Trekker

Joined:

Oct 6, 2014

PCA can be used to build a model that describes a set of observations. Using the Multivariate platform, it is possible to perform jackknife and T² outlier analysis to check whether any of the observations are extreme relative to the multivariate mean.

In what situation might a weights column be used to build the model?

If you use a PCA model, how do you maintain it?

 

If I have one or more new observations, I can apply the saved principal components and T² to assess whether they are consistent with my model.
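To make the idea of scoring a new observation concrete, here is a minimal sketch in Python/NumPy rather than JSL. It is not JMP's implementation: it assumes PCA on the covariance matrix (JMP can also work on correlations), and the data and the helper names `fit_pca` and `t_square` are made up for illustration.

```python
import numpy as np

def fit_pca(X):
    """Return column means, eigenvalues (descending) and eigenvectors of the covariance matrix."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return mean, eigvals[order], eigvecs[:, order]

def t_square(x_new, mean, eigvals, eigvecs, n_components):
    """Hotelling-style T² of new rows, using the first n_components saved components."""
    scores = (np.atleast_2d(x_new) - mean) @ eigvecs[:, :n_components]
    return np.sum(scores**2 / eigvals[:n_components], axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))            # made-up reference data
mean, eigvals, eigvecs = fit_pca(X)      # the "saved" model

t2_center = t_square(mean, mean, eigvals, eigvecs, 2)          # a point at the mean
t2_shifted = t_square(mean + 5.0, mean, eigvals, eigvecs, 2)   # a point far from it
print(t2_center, t2_shifted)
```

A new observation whose T² is large compared with the values seen in the reference data would be flagged as inconsistent with the model. With all components retained, this T² equals the squared Mahalanobis distance from the mean.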

 

 

Are there any methods in JMP to assess how robust the PCA model is? I was thinking, for example, of creating a script that deletes 10% of the cell values at random from each column and then uses Impute Missing to recover them from the covariance matrix. Repeating this many times and comparing the eigenvalue tables, and the imputed versus actual values, could give useful information about the model's robustness.
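The delete-and-impute loop described above can be sketched outside JMP as follows. This is only an illustration in Python/NumPy under stated assumptions: the data are simulated, the mean and covariance are estimated from the complete rows only, and missing cells are filled with their conditional mean given the observed cells in the same row. JMP's own imputation may differ in detail.

```python
import numpy as np

def delete_cells(X, frac, rng):
    """Set a random fraction of the cells in each column to NaN."""
    X_missing = X.copy()
    n_rows, n_cols = X.shape
    n_delete = int(round(frac * n_rows))
    for j in range(n_cols):
        rows = rng.choice(n_rows, size=n_delete, replace=False)
        X_missing[rows, j] = np.nan
    return X_missing

def impute_conditional_mean(X_missing):
    """Fill NaNs with the conditional mean implied by the mean and covariance (assumed approach)."""
    complete = X_missing[~np.isnan(X_missing).any(axis=1)]
    mu = complete.mean(axis=0)
    cov = np.cov(complete, rowvar=False)
    X_filled = X_missing.copy()
    for i, row in enumerate(X_missing):
        miss = np.isnan(row)
        if not miss.any():
            continue
        obs = ~miss
        if not obs.any():            # nothing observed in this row: fall back to the mean
            X_filled[i] = mu
            continue
        coef = np.linalg.solve(cov[np.ix_(obs, obs)], row[obs] - mu[obs])
        X_filled[i, miss] = mu[miss] + cov[np.ix_(miss, obs)] @ coef
    return X_filled

def eigenvalues(X):
    """Eigenvalues of the covariance matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]

rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 1))       # made-up correlated data
X = latent @ np.array([[1.0, 0.8, 0.6, 0.4]]) + 0.3 * rng.normal(size=(300, 4))

reference = eigenvalues(X)
eig_runs, rmse_runs = [], []
for _ in range(50):
    X_missing = delete_cells(X, 0.10, rng)
    X_filled = impute_conditional_mean(X_missing)
    deleted = np.isnan(X_missing)
    eig_runs.append(eigenvalues(X_filled))
    rmse_runs.append(np.sqrt(np.mean((X_filled[deleted] - X[deleted]) ** 2)))

eig_runs = np.array(eig_runs)
print("reference eigenvalues:", reference)
print("mean over runs:       ", eig_runs.mean(axis=0))
print("std over runs:        ", eig_runs.std(axis=0))
print("mean imputation RMSE: ", np.mean(rmse_runs))
```

A small spread in the eigenvalues across runs, together with a low imputed-versus-actual error, would suggest that the model does not hinge on a few individual cells.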

3 REPLIES
Byron_JMP

Staff

Joined:

Apr 26, 2012

I like where you're going with this idea.

 

First, Weight and Frequency: I'm pretty sure they do the same thing in Principal Components. Weight can use fractions, whereas Frequency uses integers.
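As a rough illustration of why the two roles can behave alike, here is a small Python/NumPy sketch. It shows NumPy's conventions, not JMP's internals, and the data and frequencies are made up. Integer frequencies give the same covariance matrix as physically repeating the rows, and fractional weights in the same proportions give the same correlation matrix, so a PCA on correlations would be unchanged.

```python
import numpy as np

def to_corr(cov):
    """Convert a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 3))              # made-up data
freq = np.array([1, 2, 1, 3, 1, 2])      # integer frequencies

# Frequencies behave like repeated rows.
cov_freq = np.cov(X, rowvar=False, fweights=freq)
cov_repeated = np.cov(np.repeat(X, freq, axis=0), rowvar=False)

# Fractional weights in the same proportions (here, half of each frequency).
cov_weight = np.cov(X, rowvar=False, aweights=freq / 2.0)

same_cov = np.allclose(cov_freq, cov_repeated)
same_corr = np.allclose(to_corr(cov_freq), to_corr(cov_weight))
print(same_cov, same_corr)
```

In NumPy the weighted covariance differs from the frequency version only by a scale factor, which is why the comparison for weights is made on correlations.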

 

It looks like a lot of the functionality you're thinking of is pretty easy to script.

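Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Tiretread.jmp" );

// Fit the Multivariate platform with robust estimation
obj = dt << Multivariate(
	Y( :ABRASION, :MODULUS, :ELONG, :HARDNESS ),
	Estimation Method( "Robust" )
);
// Compute T Square and save it to the data table, then close the report
obj << T Square( 1, "Save T Square" );
obj << Close Window;

// Select the rows in the top 10% of T Square and exclude them
r = dt << Select Where( :Name( "T Square" ) >= Col Quantile( :Name( "T Square" ), 0.9 ) );
r << Exclude;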

 

billw_jmp

Staff

Joined:

Jul 2, 2014

DModX might be another option for you.

 

From the Multivariate Methods book in Help: save the DModX formula to your data table as you would other formula columns.

Peter_Bartell

Joined:

Jun 5, 2014

Another thought with respect to PCA validation is to create a validation column à la JMP Pro's "Make Validation Column" capability. Here's the link to the JMP online documentation describing this utility:

 

http://www.jmp.com/support/help/13-2/Make_Validation_Column_Utility.shtml#

 

If you're not running JMP Pro, you can create one by using the workflow shown by my colleague @julian in this video:

 

https://www.youtube.com/watch?v=M5_mECc4NAg

 

Then use the validation column as a By variable and compare the resulting PCA visualizations and statistics across the groups for reasonableness.
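Outside JMP, the same By-group comparison can be sketched in Python/NumPy. This is an illustration only: the data are simulated, the `validation` array stands in for a hypothetical validation column (0 = training, 1 = validation), and the PCA is done on correlations.

```python
import numpy as np

def pca(X):
    """Eigenvalues (descending) and eigenvectors of the correlation matrix."""
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(3)
latent = rng.normal(size=(400, 1))       # made-up correlated data
X = latent @ np.array([[1.0, 0.8, 0.6, 0.4]]) + 0.3 * rng.normal(size=(400, 4))

# Hypothetical validation column: 300 training rows, 100 validation rows.
validation = rng.permutation(np.repeat([0, 1], [300, 100]))

vals_train, vecs_train = pca(X[validation == 0])
vals_valid, vecs_valid = pca(X[validation == 1])

# Absolute cosine between matching components (the sign of an eigenvector is arbitrary).
alignment = np.abs(np.sum(vecs_train * vecs_valid, axis=0))

print("training eigenvalues:  ", vals_train)
print("validation eigenvalues:", vals_valid)
print("component alignment:   ", alignment)
```

Similar eigenvalues and an alignment close to 1 for the leading components would suggest the two subsets tell the same story. Components with nearly equal eigenvalues can swap order or rotate between subsets, so low alignment there is not necessarily a problem.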