In the Multivariate Analysis toolkit, Latent Class Analysis (LCA) is used to predict clusters on categorical data (for example, the Health Risk Survey in the JMP library). How is that principle applied to text data? Text Explorer gives us a Document Term Matrix (DTM) that is binary (numeric), frequency, or TF-IDF. We do not get a DTM of categorical data. So how does LCA work in this case? Is the binary DTM converted internally to a categorical DTM?
Thank you very much. I tried to reproduce the results on the Multivariate LCA platform after converting the DTM to a categorical matrix. Because the output was very different, I could not draw any conclusion. But your feedback is valuable.
The LCA is an unsupervised learning method in Text Explorer. It discovers clusters of documents. It is not a classifier.
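For intuition on the binary DTM question, here is a minimal Python sketch. It is not JMP's implementation, and this thread does not say how Text Explorer handles the DTM internally. The sketch assumes each 0/1 term column is treated as a two-level categorical indicator, in which case LCA amounts to a mixture of independent Bernoulli variables fitted by EM. The function names, the toy matrix, the smoothing constants, and the number of random starts are all illustrative assumptions.

```python
import numpy as np


def em_bernoulli_mixture(X, n_classes, n_iter=200, seed=0):
    """One EM run for a mixture of independent Bernoullis (LCA on 0/1 data)."""
    rng = np.random.default_rng(seed)
    n_docs, n_terms = X.shape
    weights = np.full(n_classes, 1.0 / n_classes)               # class sizes
    probs = rng.uniform(0.25, 0.75, size=(n_classes, n_terms))  # P(term present | class)
    for _ in range(n_iter):
        # E-step: log P(document, class) for every document and class.
        log_joint = (X @ np.log(probs).T
                     + (1.0 - X) @ np.log(1.0 - probs).T
                     + np.log(weights))
        peak = log_joint.max(axis=1, keepdims=True)  # subtract the row max for stability
        resp = np.exp(log_joint - peak)
        norm = resp.sum(axis=1, keepdims=True)
        resp /= norm                                 # P(class | document)
        loglik = float((peak + np.log(norm)).sum())
        # M-step: re-estimate the parameters from the soft assignments.
        class_mass = resp.sum(axis=0)
        weights = (class_mass + 1e-9) / (n_docs + n_classes * 1e-9)
        # Light smoothing (an assumption) keeps probabilities strictly inside (0, 1).
        probs = (resp.T @ X + 1.0) / (class_mass[:, None] + 2.0)
    return loglik, weights, probs, resp


def fit_lca(X, n_classes, n_starts=5):
    """Run EM from several random starts and keep the best log-likelihood."""
    X = np.asarray(X, dtype=float)
    runs = [em_bernoulli_mixture(X, n_classes, seed=s) for s in range(n_starts)]
    return max(runs, key=lambda run: run[0])


# Toy binary DTM (made-up data): rows are documents, columns are terms.
dtm = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
])

loglik, weights, probs, resp = fit_lca(dtm, n_classes=2)
labels = resp.argmax(axis=1)  # most probable class per document
print(labels)
```

Nothing in the sketch uses a response column, which is the sense in which the method is unsupervised. EM can also stop at different local optima depending on the starting values, so two fits of the same model need not agree exactly.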
Thanks, Mark Bailey. I understand that it is an unsupervised learning algorithm. I was trying to see whether I could get the same result on the Multivariate platform and the Text Explorer platform. It appears the mechanism is different. Thanks for responding to my question.
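One caveat when comparing cluster output from two platforms or two runs: cluster numbers are arbitrary, so "cluster 1" in one result may correspond to "cluster 2" in the other. A label-independent check is to compare which pairs of documents end up together. The Python sketch below computes the plain Rand index for that purpose. The function name and the example label lists are hypothetical, not output from JMP.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of document pairs on which two clusterings agree.

    A pair counts as agreement when both clusterings put the two documents
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must cover the same documents")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two documents to compare")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical cluster assignments for six documents.
run_one = [0, 0, 0, 1, 1, 1]
run_two = [1, 1, 1, 0, 0, 0]    # same grouping, cluster numbers swapped
run_three = [0, 0, 1, 1, 1, 1]  # one document moved to the other cluster

same_grouping = rand_index(run_one, run_two)
shifted_grouping = rand_index(run_one, run_three)
print(same_grouping, shifted_grouping)
```

A value of 1 means the two groupings are identical up to renumbering. Lower values mean the groupings really differ, which would be consistent with the two platforms fitting the model differently.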