Level: Intermediate
Job Function: Analyst / Scientist / Engineer
Laura Higgins, JMP Global Technical Enablement Engineer, SAS
The world is full of unstructured text, and most of it goes unexplored. Compounding the problem, extracting meaning and value from text required specialized tools until recently. Text Explorer in JMP Pro offers many powerful options for gaining insight from text data and for putting the curated term list to use. Let’s fly beyond the word cloud and explore the modeling tools of Text Explorer.
I will show several options for what to do once the term list is curated. When transforming curated text into model factors, I will show different applications of latent class analysis (LCA) and latent semantic analysis (LSA). I will also discuss the meaning of topic scores, provide guidance for choosing the number of topics and show how to use topics as part of modeling.
Once you have curated text, don’t stop at the word cloud – try one of these other techniques to uncover meaning hidden in text data.
• Are there underlying themes in the data? Find which documents (rows) cluster together.
• Which conceptual topics and themes, defined by terms that occur together, appear across documents?
• Create stable variables from text analysis to use in modeling. Understand whether topics contribute positively or negatively to an outcome variable.
• Create output for market basket/association analysis.
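To make the idea of topic scores concrete, here is a minimal sketch of latent semantic analysis outside JMP, written in plain NumPy. It is not Text Explorer's implementation: the four toy documents, the TF-IDF weighting choice and the function names (document_term_matrix, tfidf, lsa_scores) are all assumptions made for illustration. It builds a document-term matrix, weights it, and takes a truncated singular value decomposition so that each document (row) gets one score per topic.

```python
import numpy as np

# Toy documents, invented for illustration only.
docs = [
    "battery drains fast",
    "battery life short",
    "screen cracked easily",
    "screen scratches easily",
]

def document_term_matrix(docs):
    """Count how often each term occurs in each document (rows = documents)."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: j for j, word in enumerate(vocab)}
    dtm = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for word in doc.split():
            dtm[i, index[word]] += 1
    return dtm, vocab

def tfidf(dtm):
    """Down-weight terms that appear in many documents (one common weighting)."""
    doc_freq = (dtm > 0).sum(axis=0)
    idf = np.log(dtm.shape[0] / doc_freq)
    return dtm * idf

def lsa_scores(weighted, n_topics):
    """Truncated SVD: document scores and term loadings for n_topics topics."""
    U, s, Vt = np.linalg.svd(weighted, full_matrices=False)
    doc_scores = U[:, :n_topics] * s[:n_topics]   # one row per document
    term_loadings = Vt[:n_topics].T               # one row per term
    return doc_scores, term_loadings, s

dtm, vocab = document_term_matrix(docs)
weighted = tfidf(dtm)
doc_scores, term_loadings, singular_values = lsa_scores(weighted, n_topics=2)

print(np.round(doc_scores, 2))
```

Each column of doc_scores is a numeric variable that could be saved alongside the original rows and used as a model factor. The sign of a score is arbitrary up to a flip of the whole topic, so interpretation usually rests on which terms load most heavily on that topic. The decay of singular_values is one rough guide when deciding how many topics to keep.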