Andrew T. Karl, Senior Management Consultant, Adsurgo LLC
Heath Rushing, Principal Consultant and Co-Founder, Adsurgo LLC
Some estimates suggest that unstructured text accounts for roughly 80 percent of the information stored by most organizations. This presentation gives an overview of methods, easily implemented through the R interface to JMP, for finding previously unknown relationships in a collection of unstructured data. By calling R packages for text mining and sparse matrix algebra, JMP can be equipped to extract information from text without requiring the end user to know R. The text – which may come from emails, survey comments, social media, incident reports, insurance claim reports, etc. – may be used for several purposes. Vectors from a singular value decomposition of the document-term matrix produced in R may be added to the original data table in JMP and included in predictive models (e.g., via the Fit Model or Neural platforms) or clustering algorithms (via the Cluster platform). Another goal may be to explore the underlying themes of the text through word counts or latent semantic indexing. We will demonstrate a JSL/R script that provides such functionality.
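As a rough illustration of the singular value decomposition step described above, and not the JSL/R script from the presentation, the following Python sketch builds a small document-term matrix of raw word counts and extracts per-document score vectors that could be appended as new columns to a data table. The sample documents, the function names (`document_term_matrix`, `truncated_svd`), and the use of power iteration with deflation in place of a sparse SVD routine are all assumptions made for this example.

```python
import math
import random
import re
from collections import Counter

def document_term_matrix(docs):
    """Return (sorted vocabulary, one row of raw term counts per document)."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    counts = [Counter(toks) for toks in tokenized]
    return vocab, [[c[t] for t in vocab] for c in counts]

def truncated_svd(A, k, iters=500, seed=0):
    """Top-k singular triplets via power iteration with deflation.

    A stand-in (assumption) for a sparse SVD routine; fine for tiny
    matrices, not meant for production-sized text collections.
    """
    rng = random.Random(seed)
    R = [[float(x) for x in row] for row in A]  # residual matrix
    n_rows, n_cols = len(R), len(R[0])
    sigmas, U, V = [], [], []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n_cols)]
        for _ in range(iters):
            # One step of power iteration on R^T R.
            Rv = [sum(r * x for r, x in zip(row, v)) for row in R]
            w = [sum(R[i][j] * Rv[i] for i in range(n_rows))
                 for j in range(n_cols)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        Rv = [sum(r * x for r, x in zip(row, v)) for row in R]
        sigma = math.sqrt(sum(x * x for x in Rv))
        if sigma == 0.0:
            break  # nothing left to extract
        u = [x / sigma for x in Rv]
        sigmas.append(sigma)
        U.append(u)
        V.append(v)
        # Deflate: remove the component just found.
        for i in range(n_rows):
            for j in range(n_cols):
                R[i][j] -= sigma * u[i] * v[j]
    return sigmas, U, V

# Hypothetical free-text comments, invented for this sketch.
docs = [
    "engine failure reported during takeoff",
    "engine fire reported after takeoff",
    "customer praised the friendly service",
    "friendly staff and quick service",
]

vocab, dtm = document_term_matrix(docs)
sigmas, U, V = truncated_svd(dtm, k=2)

# One row per document, one column per SVD component; these are the
# vectors that would be added to the original table as predictors.
scores = [[sigmas[c] * U[c][i] for c in range(len(sigmas))]
          for i in range(len(docs))]

for doc, row in zip(docs, scores):
    print([round(x, 3) for x in row], doc)
```

In this toy collection the two incident reports share no words with the two service comments, so each component loads on only one pair of documents. Columns like these could then feed a predictive model or a clustering step in the same way as any other numeric variable.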