PCA Predicteds Method

May 4, 2017 12:46 PM

I am trying to understand the technique behind using a PCA model to return predicted values of the original input variables when fewer PCs are retained than there are predictors. Ultimately, I'll be looking at the residuals (original values - predicteds) for varying numbers of retained PCs.

I've read that, to return from PCA space back to X space, you'll need to calculate:

predicteds = scores*loadings'

When I look at the formula behind the 'Save Predicteds' output, does that linear equation represent the concept above?

Thanks

P.S. I know PCA is not meant for predictive modeling. I'm ultimately using it for fault detection with Hotelling's T² and SPE (squared prediction error) output, and my SPE would hinge on this predicteds output.
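
For reference, the relationship in the question can be written out with the centering and scaling made explicit. This is only a sketch: it assumes the PCA is run on correlations (standardized data) and that the columns of $P_k$ are unit-length eigenvectors. The symbols $X$, $Z$, $P_k$, $T_k$, $\mu$, and $D$ are notation introduced here, not names taken from the JMP output.

```latex
% X: n x p data matrix, \mu: vector of column means,
% D: diagonal matrix of column standard deviations,
% P_k: p x k matrix holding the first k eigenvectors of the correlation matrix.
\begin{aligned}
Z &= (X - \mathbf{1}\mu^{\top})\, D^{-1} && \text{(standardize)} \\
T_k &= Z P_k && \text{(scores)} \\
\hat{Z}_k &= T_k P_k^{\top} && \text{(scores} \times \text{loadings}') \\
\hat{X}_k &= \hat{Z}_k D + \mathbf{1}\mu^{\top} && \text{(back to original units)} \\
\mathrm{SPE}_i &= \lVert z_i - \hat{z}_{k,i} \rVert^{2} && \text{(row } i \text{ of } Z - \hat{Z}_k\text{)}
\end{aligned}
```

With $k = p$, $P_k P_k^{\top} = I$ and so $\hat{X}_k = X$. With $k < p$, the residual is the part of each row that lies outside the retained subspace. If the loadings you are working with are scaled by the square roots of the eigenvalues rather than unit length, the scores need the matching scaling for the product to hold.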

1 REPLY

May 12, 2017 7:37 AM

It looks like the PCA platform does not give you direct access to the score matrix, but the code below may help give some confidence that the prediction formulas are as they should be. It generates random multivariate Gaussian data for three variables. With three principal components (that is, with no dimensionality reduction), the predictions should match the original data to within rounding error.

```
Names Default To Here( 1 );
// Function to generate n independent samples from a multivariate Gaussian
// m is the column vector of means (in JSL, [0, 0, 0] is 3 x 1)
// v is the (square) variance-covariance matrix
// (Dimensions of m and v have to conform)
randomMultiGaussian = Function( {m, v, n},
	{Default Local},
	nDim = N Row( m );
	If( (N Row( v ) != nDim) | (N Col( v ) != nDim),
		Beep();
		Print( "ERROR in randomMultiGaussian" );
		Throw();
	);
	// Simulate random values . . .
	t = Cholesky( v );
	d = J( n, nDim, Random Normal() ) * t`;
	// Build mean values columnwise . . .
	mu = [];
	For( i = 1, i <= nDim, i++,
		mu ||= J( n, 1, m[i] )
	);
	// . . . and add the mean to get the final result
	d = mu + d;
);
// Make some data
nPts = 100;
m = [0, 0, 0];
v = [1.0 0.2 0.8, 0.2 1.0 0.5, 0.8 0.5 1.0];
dMat = randomMultiGaussian( m, v, nPts );
dt = As Table( dMat, << Column Names( {"x1", "x2", "x3"} ) );
dt << Set Name( "Sample from Multivariate Gaussian" );
// Use PCA
pca = dt << Principal Components( Y( Eval( dt << Get Column Names ) ) );
// Save predictions
pca << Save Predicteds( 1 ); // Predictions with 1 PC
pca << Save Predicteds( 2 ); // Predictions with 2 PCs
pca << Save Predicteds( 3 ); // Predictions with 3 PCs
// See how the predicted x1 value varies with the number of PCs
dt << Graph Builder(
	Show Control Panel( 0 ),
	Variables(
		X( :x1 ),
		Y( :Predicted x1 3 ),
		Y( :Predicted x1 2 ),
		Y( :Predicted x1 )
	)
);
// For 3 PCs compare the predicted values with the original data values.
// Use the absolute difference so that large negative deviations are not missed.
Print(
	Maximum(
		Maximum( Abs( Column( dt, "x1" )[1 :: nPts] - Column( dt, "Predicted x1 3" )[1 :: nPts] ) ),
		Maximum( Abs( Column( dt, "x2" )[1 :: nPts] - Column( dt, "Predicted x2 3" )[1 :: nPts] ) ),
		Maximum( Abs( Column( dt, "x3" )[1 :: nPts] - Column( dt, "Predicted x3 3" )[1 :: nPts] ) )
	)
);
```
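
If you want the scores and the SPE explicitly, one option is to redo the reconstruction with matrix operations and compare it with the saved columns. The following is a minimal, untested sketch rather than a description of what the platform does internally. It assumes the table `dt` from the script above is still open, that the PCA was run on correlations, and that `Eigen()` returns eigenvalues in descending order. The names `k`, `P`, `scores`, `Zhat`, `Xhat`, and `spe` are hypothetical and chosen only for illustration.

```
Names Default To Here( 1 );
// Assumption: dt is the data table created by the script above.
k = 2; // hypothetical number of retained PCs

// Assemble the data matrix from the original columns
X = (Column( dt, "x1" ) << Get Values) ||
	(Column( dt, "x2" ) << Get Values) ||
	(Column( dt, "x3" ) << Get Values);
n = N Row( X );
ones = J( n, 1, 1 );

// Standardize: subtract column means, divide by column standard deviations
mu = V Mean( X ); // 1 x p row vector
sd = V Std( X );  // 1 x p row vector
Z = (X - ones * mu) :/ (ones * sd);

// Eigen-decomposition of the correlation matrix
// (assumed: eigenvalues, and hence eigenvector columns, in descending order)
{evals, evecs} = Eigen( Correlation( X ) );
P = evecs[0, 1 :: k]; // p x k matrix of unit-length eigenvectors

// predicteds = scores * loadings', then back to the original units
scores = Z * P;
Zhat = scores * P`;
Xhat = (Zhat :* (ones * sd)) + ones * mu;

// SPE per row: sum of squared residuals on the standardized scale
resid = Z - Zhat;
spe = (resid :* resid) * J( N Col( X ), 1, 1 );

// With k = 2, compare against the column saved by Save Predicteds( 2 )
Show( Maximum( Abs( Xhat[0, 1] - (Column( dt, "Predicted x1 2" ) << Get Values) ) ) );
```

The sign of each eigenvector is arbitrary, but it cancels in the product `P * P``, so the reconstruction does not depend on it. Whether the SPE should be computed on the standardized or the original scale is a modelling choice; the standardized scale is used here only as an example.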