I was surprised to see a constant term in the saved formula when performing PCA. For example, I saved the first two principal components for the following standardized (centered and scaled) data set:
Standardized (Centered + Scaled) Data |
SLength | SWidth | PLength | PWidth |
0.27 | 0.19 | -0.36 | -0.44 |
-0.30 | -1.14 | -0.36 | -0.44 |
-0.88 | -0.61 | -0.94 | -0.44 |
-1.16 | -0.87 | 0.22 | -0.44 |
-0.02 | 0.46 | -0.36 | -0.44 |
1.13 | 1.26 | 1.38 | 1.48 |
-1.16 | -0.07 | -0.36 | 0.52 |
-0.02 | -0.07 | 0.22 | -0.44 |
-1.74 | -1.41 | -0.36 | -0.44 |
-0.30 | -0.87 | 0.22 | -1.40 |
1.13 | 0.72 | 0.22 | -0.44 |
-0.59 | -0.07 | 0.80 | -0.44 |
-0.59 | -1.14 | -0.36 | -1.40 |
-2.02 | -1.14 | -2.11 | -1.40 |
2.28 | 1.52 | -1.52 | -0.44 |
1.99 | 2.59 | 0.22 | 1.48 |
1.13 | 1.26 | -0.94 | 1.48 |
0.27 | 0.19 | -0.36 | 0.52 |
1.99 | 0.99 | 1.38 | 0.52 |
0.27 | 0.99 | 0.22 | 0.52 |
1.13 | -0.07 | 1.38 | -0.44 |
0.27 | 0.72 | 0.22 | 1.48 |
-1.16 | 0.46 | -2.69 | -0.44 |
0.27 | -0.34 | 1.38 | 2.43 |
-0.59 | -0.07 | 2.55 | -0.44 |
-0.02 | -1.14 | 0.80 | -0.44 |
-0.02 | -0.07 | 0.80 | 1.48 |
0.56 | 0.19 | 0.22 | -0.44 |
0.56 | -0.07 | -0.36 | -0.44 |
-0.88 | -0.61 | 0.80 | -0.44 |
-0.59 | -0.87 | 0.80 | -0.44 |
1.13 | -0.07 | 0.22 | 1.48 |
0.56 | 1.79 | 0.22 | -1.40 |
1.42 | 2.06 | -0.36 | -0.44 |
-0.30 | -0.87 | 0.22 | -0.44 |
-0.02 | -0.61 | -1.52 | -0.44 |
1.42 | 0.19 | -0.94 | -0.44 |
-0.30 | 0.46 | -0.36 | -1.40 |
-1.74 | -1.14 | -0.94 | -0.44 |
0.27 | -0.07 | 0.22 | -0.44 |
-0.02 | 0.19 | -0.94 | 0.52 |
-1.45 | -3.01 | -0.94 | 0.52 |
-1.74 | -0.61 | -0.94 | -0.44 |
-0.02 | 0.19 | 0.80 | 3.39 |
0.27 | 0.99 | 2.55 | 1.48 |
-0.59 | -1.14 | -0.36 | 0.52 |
0.27 | 0.99 | 0.80 | -0.44 |
-1.16 | -0.61 | -0.36 | -0.44 |
0.84 | 0.72 | 0.22 | -0.44 |
-0.02 | -0.34 | -0.36 | -0.44 |
The formulas are as follows:
Prin1: 0.59834170442161 * :SLength + 0.569834108206745 * :SWidth + 0.371661472844918 * :PLength + 0.39892861952586 * :PWidth + 2.15154543958667e-16
Prin2: -0.331623960696996 * :SLength + -0.436415344018397 * :SWidth + 0.620670317712319 * :PLength + 0.54252700661609 * :PWidth + (-1.4778627619204e-16)
Even though the two constants are very close to 0 (incidentally, I have seen constants much larger than 0 in some other cases), I don't understand why they are part of the formula in the first place, since Prin1 and Prin2 should simply be the products of the data with the first and second eigenvectors.
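To spell out what I expected versus what was saved, here is the comparison in notation of my own (not from the software): x_j stands for the j-th standardized column, u_{jk} for the j-th entry of the k-th eigenvector, and c_k for the constant that appears in the saved formula.

```latex
\[
\text{expected: } \mathrm{Prin}_k = \sum_{j=1}^{4} u_{jk}\, x_j ,
\qquad
\text{saved: } \mathrm{Prin}_k = \sum_{j=1}^{4} u_{jk}\, x_j + c_k ,
\]
\[
c_1 \approx 2.15 \times 10^{-16}, \qquad c_2 \approx -1.48 \times 10^{-16}.
\]
```

The coefficients u_{jk} match what I expect; only the extra c_k is unexplained.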
In MATLAB, the detailed calculation would be as follows:
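% X is the standardized data matrix above (one row per observation)
[U, S, V] = svd(cov(X));   % columns of U are the eigenvectors of cov(X)
Z2 = X * U(:, 1:2);        % project the data onto the first two eigenvectors
Prin1 = Z2(:, 1);
Prin2 = Z2(:, 2);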
I look forward to your explanation, and thanks very much in advance!