Summary: This course teaches you how to find and analyze patterns in textual data using Text Explorer in JMP Pro. A workflow is defined for effective and efficient exploration. Multivariate methods are applied to term frequencies, after weighting and dimension reduction, to capture relationships among important terms and phrases.
Duration: 14 hours of content.
Prerequisites: Before attending this course, it is recommended that you complete the courses JMP® Software: A Case Study Approach to Data Exploration and JMP®: Statistical Decisions Using ANOVA and Regression, or have equivalent experience.
Learning Objectives:
- use relevant concepts and terminology in text analytics
- prepare textual data for exploration
- use a workflow to find important terms and phrases
- customize various rules that determine the important terms and phrases
- use regular expressions to retrieve text that matches a pattern
- explore important terms and phrases with the word cloud and other JMP platforms
- work with the document term matrix (DTM)
- cluster related documents
- reduce dimensionality with the singular value decomposition (SVD) of the DTM
- associate meaning with related terms
- identify attraction and repulsion in the meaning of terms
- score new documents after supervised learning with a training corpus
- find patterns and analyze a collection of text data files
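As a rough illustration of two of the objectives above (retrieving text that matches a pattern, and working with a document term matrix), the following Python sketch shows the general idea outside JMP. The sample documents, the stop-word list, and the function names are all assumptions invented for this example; the course itself works in the Text Explorer platform rather than in code.

```python
import re
from collections import Counter

# Hypothetical three-document corpus, invented for illustration only.
docs = [
    "The pump failed after 3 hours.",
    "Pump seal leaked; the seal was replaced.",
    "Motor failed, pump OK.",
]

# Assumed stop-word list; a real analysis would curate its own.
STOP_WORDS = {"the", "was", "after"}

def tokenize(text):
    """Lowercase the text, keep runs of letters as tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def build_dtm(documents):
    """Return the sorted term list and a documents-by-terms count matrix."""
    counts = [Counter(tokenize(d)) for d in documents]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for t in terms] for c in counts]
    return terms, matrix

# A regular expression that retrieves durations such as "3 hours".
durations = re.findall(r"\d+\s+hours?", docs[0])

terms, dtm = build_dtm(docs)
for doc_row in dtm:
    print(dict(zip(terms, doc_row)))
```

Each row of the matrix is one document and each column is one term, so a cell holds the number of times that term occurs in that document.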
Course Outline:
Getting Started
- nature of text as data
- Text Explorer platform
Curating the List
- importing text data
- tokenizing stage
- terming stage
- phrasing stage
Exploring Patterns
- exploring word clouds
- beyond Text Explorer
Sentiment Analysis
- determining positive and negative sentiment
Term Selection
- selecting terms that are highly predictive of an outcome
Discriminant Analysis
- scoring new documents
- interpretation of discriminant analysis
Latent Semantic Analysis
- dimension reduction
- topic analysis
- interpretation of latent semantic analysis
Latent Class Analysis
- clustering documents
- interpretation of latent class analysis
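The dimension reduction step listed under Latent Semantic Analysis can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not JMP's implementation: the count matrix is made up, the weighting is one common TF-IDF variant (count times log of N over document frequency), and only the leading singular triple is computed, using plain power iteration in place of a full SVD routine.

```python
import math

# Hypothetical document-term counts (rows: documents, columns: terms).
terms = ["failed", "leaked", "pump", "seal"]
counts = [
    [1, 0, 1, 0],
    [0, 1, 1, 2],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

def tfidf(matrix):
    """Weight each count by log(N / document frequency) of its term."""
    n_docs, n_terms = len(matrix), len(matrix[0])
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    return [[row[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
            for row in matrix]

def leading_singular_triple(a, iterations=200):
    """Approximate the largest singular value and its vectors by power iteration."""
    n_rows, n_cols = len(a), len(a[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iterations):
        # One step of power iteration on A^T A: v <- normalize(A^T (A v)).
        u = [sum(a[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        w = [sum(a[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(a[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    u = [x / sigma for x in u]
    return u, sigma, v

weighted = tfidf(counts)
u, sigma, v = leading_singular_triple(weighted)

# Document scores on the first latent dimension, and term loadings.
doc_scores = [sigma * x for x in u]
print("document scores:", [round(s, 3) for s in doc_scores])
print("term loadings:", dict(zip(terms, (round(x, 3) for x in v))))
```

Documents with similar weighted term profiles receive similar scores on the latent dimension, which is what makes the reduced space useful for clustering and topic interpretation. A full analysis would keep several dimensions rather than one.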