I am Yusuke Ono, Senior Tester at JMP Japan. I apologize in advance for my poor English.
If you have JMP Pro, the Structural Equation Models (SEM) platform supports covariance (or correlation) matrices and mean vectors as input, so you can fit confirmatory factor analysis (CFA) models and other SEMs from summarized data.
The Eigen function in the JMP Scripting Language (JSL) performs eigenvalue decomposition of symmetric matrices. Principal Component Analysis (PCA) can be carried out via eigenvalue decomposition, so you can use the Eigen function for PCA.
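For readers who want to check the linear algebra outside JSL, here is a NumPy sketch of PCA via eigenvalue decomposition of a hypothetical 4×4 correlation matrix (the numbers are illustrative only, not from real data):

```python
import numpy as np

# Hypothetical correlation matrix for four variables.
r = np.array([[1.0, 0.8, 0.6, 0.4],
              [0.8, 1.0, 0.5, 0.2],
              [0.6, 0.5, 1.0, 0.2],
              [0.4, 0.2, 0.2, 1.0]])

# Eigenvalue decomposition of a symmetric matrix (what Eigen does in JSL).
eigvals, eigvecs = np.linalg.eigh(r)
order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Each eigenvalue is the variance of one principal component; dividing by
# their sum (= number of variables for a correlation matrix) gives the
# proportion of variance explained by each component.
explained = eigvals / eigvals.sum()
print(explained)
```

The eigenvectors are the component loadings (up to sign), which is exactly what PCA on a correlation matrix reports.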
Factor analysis by maximum likelihood estimation is more complicated; programming the maximum likelihood algorithm yourself is hard. So I would like to recommend the following method, although it is still somewhat involved.
If you use the Cholesky function, you can simulate dummy data whose covariance matrix and mean vector are exactly the same as yours (provided your covariance matrix is positive definite, so that its Cholesky factor exists). If your covariance matrix is cov and your mean vector is mean, you can generate the data as below.
z = J(n, NCol(cov), Random Normal());
For(i = 1, i <= NCol(z), i++,
	z[0, i] = z[0, i] - Mean(z[0, i]) /* center each column */
);
c = Covariance(z); /* sample covariance of the raw draws */
invrc = Inv(Cholesky(c))`;
rcov = Cholesky(cov);
z = J(n, 1, 1) * mean` + z * invrc * rcov`;
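The JSL snippet above can be checked with a NumPy transcription (variable names mirror the JSL; the target means, standard deviations, and correlations are hypothetical). The key fact is that whitening the centered draws with Inv(Cholesky(c))` gives an identity sample covariance, and re-coloring with Cholesky(cov)` then makes the sample covariance exactly cov:

```python
import numpy as np

rng = np.random.default_rng(11111111)

# Hypothetical target mean vector and covariance matrix.
mean = np.array([100.0, 50.0, 30.0, 20.0])
sd = np.array([3.0, 4.0, 2.5, 1.8])
r = np.array([[1.0, 0.8, 0.6, 0.4],
              [0.8, 1.0, 0.5, 0.2],
              [0.6, 0.5, 1.0, 0.2],
              [0.4, 0.2, 0.2, 1.0]])
cov = np.diag(sd) @ r @ np.diag(sd)

n = 100
z = rng.standard_normal((n, cov.shape[1]))
z -= z.mean(axis=0)                            # center each column

c = np.cov(z, rowvar=False)                    # sample covariance of the draws
invrc = np.linalg.inv(np.linalg.cholesky(c)).T # Inv(Cholesky(c))`
rcov = np.linalg.cholesky(cov)                 # Cholesky(cov)

# z @ invrc has identity sample covariance; @ rcov.T re-colors it to cov.
z = mean + z @ invrc @ rcov.T

print(np.allclose(np.cov(z, rowvar=False), cov))   # True
print(np.allclose(z.mean(axis=0), mean))           # True
```

Because the sample covariance of z @ A is A` c A, choosing A = Inv(Cholesky(c))` · Cholesky(cov)` gives Cholesky(cov) · Inv(Cholesky(c)) · c · Inv(Cholesky(c))` · Cholesky(cov)` = cov exactly, for any seed.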
Note that some results for this dummy data are meaningless. For example, score plots in PCA depend on the random seed. Only results that are functions of the covariance matrix and the mean vector are meaningful; maximums, minimums, and Spearman/Kendall correlations, for example, are meaningless because they depend on information other than the means and covariances.
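To make that caveat concrete, here is a small NumPy sketch (with a hypothetical two-variable target) showing that the generated data reproduce the target covariance for every seed, while a rank statistic such as the Spearman correlation changes from seed to seed:

```python
import numpy as np

def dummy_data(cov, mean, n, seed):
    """Generate n rows whose sample covariance and means match the target exactly."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, len(mean)))
    z -= z.mean(axis=0)
    invrc = np.linalg.inv(np.linalg.cholesky(np.cov(z, rowvar=False))).T
    return mean + z @ invrc @ np.linalg.cholesky(cov).T

def spearman(x):
    """Spearman correlation matrix = Pearson correlation of the column ranks."""
    ranks = x.argsort(axis=0).argsort(axis=0)
    return np.corrcoef(ranks, rowvar=False)

cov = np.array([[9.0, 9.6],
                [9.6, 16.0]])        # hypothetical target (correlation 0.8)
mean = np.array([100.0, 50.0])

a = dummy_data(cov, mean, n=100, seed=1)
b = dummy_data(cov, mean, n=100, seed=2)

# The sample covariance equals the target for both seeds.
print(np.allclose(np.cov(a, rowvar=False), cov))   # True
print(np.allclose(np.cov(b, rowvar=False), cov))   # True

# The Spearman correlation is whatever the random draws happened to give.
print(spearman(a)[0, 1], spearman(b)[0, 1])
```

Anything you read off the dummy data beyond means, variances, and Pearson covariances is an artifact of the particular random draws.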
The following code is a full example. This "dummy data method" may be difficult to understand and program if you are not familiar with JSL, and I am afraid my English explanation is not easy to follow either. I also skip the mathematical explanation of why this calculation generates dummy data with exactly the mean vector and covariance matrix you want. If you have any questions, please reply.
Names Default To Here(1);
/////////// This is a sample data //////////////////////////////////////
New Table( "Test",
New Column( "Mean", Values( [100, 50, 30, 20] )),
New Column( "Stddev", Values( [3, 4, 2.5, 1.8] )),
New Column( "X1", Values( [1, 0.8, 0.6, 0.4])),
New Column( "X2", Values( [0.8, 1, 0.5, 0.2] )),
New Column( "X3", Values( [0.6, 0.5, 1, 0.2] )),
New Column( "X4", Values( [0.4, 0.2, 0.2, 1] ))
);
///////////////////////////// From Here ///////////////////////////////
mean colname = "Mean"; /* column name for means */
stddev colname = "Stddev"; /* column name for standard deviations */
corr colnames = {"X1", "X2", "X3", "X4"};
n = 100; /* Specify the sample size */
dt = Current Data Table();
mean = Column(dt, mean colname) << Get As Matrix; /* Read your mean vector */
sd = Column(dt, stddev colname) << Get As Matrix; /* Read your standard deviations */
r = dt << Get As Matrix(corr colnames); /* Read your correlation matrix */
/* Eigenvalue decomposition (PCA for correlation matrix) */
{M,E} = Eigen(r);
Print Matrix(M||(M/Sum(M)));
Show(E);
/* Your covariance matrix */
cov = Diag(sd)*r*Diag(sd);
/* Generate dummy data whose covariance matrix and mean vector are exactly the same as yours */
Random Reset(11111111);
z = J(n, NCol(cov), Random Normal());
For(i = 1, i <= NCol(z), i++,
z[0,i] = (z[0,i] - Mean(z[0,i]))
);
c = Covariance(z);
invrc = Inv(Cholesky(c))`;
rcov = Cholesky(cov);
z = J(n,1,1)*mean` + z * invrc * rcov`;
/* Make the dummy data */
dt = As Table(z, <<Column Names(corr colnames));
dt << Set Name("Simulated Data");
/////////////////////////////////////////////////////////////////////////////////////
// Now dummy data with the same covariance matrix and means have been generated.
// You can use this dummy data for your analyses.
dt << Principal Components(
Y( :X1, :X2, :X3, :X4 ),
Estimation Method( "Row-wise" ),
Standardize( "Standardized" )
);
dt << Multivariate(
Y( :X1, :X2, :X3, :X4 ),
Variance Estimation( "Row-wise" ),
Scatterplot Matrix( 1 ),
Univariate Simple Statistics( 1 )
);
dt << Factor Analysis(
Y( :X1, :X2, :X3, :X4 ),
Variance Estimation( "Row-wise" ),
Variance Scaling( "Correlations" ),
Fit( "ML", "SMC", 1, "Varimax" )
);
One of JMP's unique strengths is its interactivity. For example, you can find outliers in the score plot on the PCA platform. If you start from summarized data (like means, standard deviations, and correlations), you lose some information about your data. I myself feel some inconvenience when I try analyses from textbooks, but recently more textbooks seem to offer raw sample data sets rather than summarized data. Anyway, if you would like us to support summarized-data input (as in JMP Pro's SEM platform) for the PCA and Factor Analysis platforms as well, please enter the request in the Wish List (as
@Ben_BarrIngh said).
Yusuke Ono (Senior Tester at JMP Japan)