May 28, 2014

JMP 13 Preview: New text analytics in JMP Pro

Building on the new features in JMP 13 for exploring unstructured text data, JMP Pro 13 enables you to do more with text data, such as clustering terms and phrases and using text in predictive models. You’ll be able to answer more questions, scale to larger data and stay in flow. Because organizations collect so much text data, JMP now provides visual, interactive and easy-to-use capabilities to analyze that text and make valuable use of it.

As Chris Gotwalt, Director of Statistical R&D at JMP, explains, some of the capabilities required for text analysis are analogous to those required for tabular data, so adding text analytics made sense for the product. “Text analytics is like general multivariate analysis. Topic analysis is like factor analysis. Singular value decomposition (SVD) is like principal component analysis, but these algorithms need to be fast enough to be useful on text data,” Chris says.

While SVD is the standard means in most text mining software of dealing with the high dimensionality typical of text data, the challenge is to do it quickly so the user’s analysis “flow” is not disrupted. Chris has not only implemented a super-fast sparse Lanczos SVD but also designed it to handle messy data well and yield more meaningful factors. This SVD implementation also supports topic analysis.
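To make the idea concrete outside of JMP, here is a minimal Python sketch of a truncated SVD on a sparse document-term matrix. It is not JMP's implementation: the toy corpus is invented, and it relies on SciPy's `svds`, which by default uses the ARPACK library (a Lanczos-type iterative solver). The point is only that the top few singular vectors can be computed from sparse matrix-vector products, without ever forming a dense matrix.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Hypothetical toy corpus (not from the post): three maintenance notes, two reviews.
docs = [
    "pump failed seal leak",
    "seal leak replaced pump",
    "pump leak",
    "great service friendly staff",
    "friendly staff great food",
]

# Build a sparse document-term matrix of raw counts (rows = documents).
vocab = sorted({w for d in docs for w in d.split()})
col = {w: j for j, w in enumerate(vocab)}
rows, cols = [], []
for i, d in enumerate(docs):
    for w in d.split():
        rows.append(i)
        cols.append(col[w])
dtm = csr_matrix((np.ones(len(rows)), (rows, cols)),
                 shape=(len(docs), len(vocab)))

# Truncated SVD: only the k largest singular triplets are computed.
k = 2
u, s, vt = svds(dtm, k=k)

# Sort explicitly so the largest singular value comes first.
order = np.argsort(s)[::-1]
u, s, vt = u[:, order], s[order], vt[order]

# One k-dimensional point per document, as in an SVD scatterplot.
doc_scores = u * s


def cosine(a, b):
    """Cosine similarity between two score vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


print("singular values:", np.round(s, 3))
print("note 0 vs note 1:  ", round(cosine(doc_scores[0], doc_scores[1]), 3))
print("note 0 vs review 3:", round(cosine(doc_scores[0], doc_scores[3]), 3))
```

In this toy case the two maintenance notes land in the same direction in the reduced space, while the review is orthogonal to them, which is the kind of structure an SVD plot makes visible.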

The "Show Text" option allows you to see text associated with a single data point or the text in common for several selected data points. Here it's surfaced in the SVD plot.

In addition, JMP Pro 13 includes latent class analysis (LCA), which is useful for another kind of topic analysis (as multinomial mixtures) as well as for clustering text data. This LCA clustering approach, customized for applications within Text Explorer, allows for overlapping cluster membership probabilities for each document. It also takes advantage of sparse data to calculate fast summaries that show where the high factor loadings are, which is important when dealing with the ultra-wide data so typical of text.
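For readers who want to see what "multinomial mixtures" with membership probabilities can look like, here is a small sketch of a generic mixture-of-multinomials model fitted by EM. It is an illustration only and makes no claim about how JMP's LCA is implemented. The corpus, the function name `multinomial_mixture_em`, the smoothing value `alpha` and the choice to initialize each cluster from a seed document are all assumptions made for the example.

```python
import numpy as np


def multinomial_mixture_em(X, seed_docs, alpha=0.1, n_iter=25):
    """Fit a mixture of multinomials to term counts with EM.

    X         : (n_docs, n_terms) array of term counts
    seed_docs : one document index per cluster, used to initialize that
                cluster's term distribution (an illustrative choice)
    alpha     : additive smoothing so no term gets zero probability
    Returns (resp, theta): per-document membership probabilities and
    per-cluster term probabilities.
    """
    X = np.asarray(X, dtype=float)
    k = len(seed_docs)
    theta = X[list(seed_docs)] + alpha
    theta /= theta.sum(axis=1, keepdims=True)
    pi = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each document,
        # computed in log space for numerical stability.
        log_r = np.log(pi) + X @ np.log(theta).T
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate mixing weights and smoothed term probabilities.
        pi = resp.mean(axis=0)
        theta = resp.T @ X + alpha
        theta /= theta.sum(axis=1, keepdims=True)
    return resp, theta


# Hypothetical toy corpus: two maintenance notes, two restaurant reviews.
docs = [
    "pump failed seal leak",
    "seal leak replaced pump",
    "great service friendly staff",
    "friendly staff great food",
]
vocab = sorted({w for d in docs for w in d.split()})
X = np.array([[d.split().count(w) for w in vocab] for d in docs])

resp, theta = multinomial_mixture_em(X, seed_docs=(0, 2))
print(np.round(resp, 3))  # one row per document, one column per cluster
```

Each row of `resp` is a probability distribution over clusters, so a document is never forced into a single group. In this tiny, cleanly separated corpus the probabilities end up close to 0 or 1; real text with mixed vocabulary would typically give softer memberships.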

Latent Class Analysis clustering allows for overlapping cluster membership probabilities for each document.

JMP Pro not only provides text scoring with SVD scores but also saves the formula used to calculate those scores for all analyses (any variable, scoring matrices, parsing of all tokens). You can also save the document-term matrix and the SVD and LCA scores as inputs to other analyses, such as predictive models.
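As a rough illustration of what a saved scoring formula amounts to, the sketch below projects new text onto the right singular vectors learned from training documents. This is a generic linear-algebra view, not the formula JMP writes out. The corpus, the helper names `to_counts` and `svd_score`, and the decision to leave scores scaled by the singular values are assumptions for the example.

```python
import numpy as np

# Hypothetical training corpus (not from the post).
train_docs = [
    "pump failed seal leak",
    "seal leak replaced pump",
    "pump leak",
    "great service friendly staff",
    "friendly staff great food",
]
vocab = sorted({w for d in train_docs for w in d.split()})


def to_counts(text):
    """Term counts over the training vocabulary; unseen words are dropped."""
    words = text.split()
    return np.array([words.count(w) for w in vocab], dtype=float)


dtm = np.vstack([to_counts(d) for d in train_docs])

# A dense SVD is fine at this size; keep the top k right singular vectors.
k = 2
u, s, vt = np.linalg.svd(dtm, full_matrices=False)
loadings = vt[:k].T  # (n_terms, k): this matrix is the reusable "formula"


def svd_score(text):
    """Score any text by projecting its term counts onto the loadings."""
    return to_counts(text) @ loadings


def cosine(a, b):
    """Cosine similarity between two score vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Since dtm = U S V^T, projecting training rows onto V reproduces U * S.
train_scores = dtm @ loadings

# "overnight" is not in the training vocabulary and is ignored.
new_score = svd_score("seal leak overnight")
print(np.round(new_score, 3))
print(round(cosine(new_score, train_scores[0]), 3))
```

Because the formula is just a matrix product, the resulting score columns can be fed into any downstream predictive model as ordinary numeric inputs.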

And of course, these implementations are integrated and interactive as you would expect them to be in JMP, with new graphics to visualize and further explore findings.

Heath Rushing, co-founder of Adsurgo, is a fan. "I have used many text mining tools. In terms of ease of use, Text Explorer is the best of the breed. You can efficiently clean unstructured data, visualize relationships, find major themes and group documents. Brilliant!" Heath says.

Whenever Chris has shown these text analytics additions, he runs out of time because the audience asks so many questions like, “Does it do this or that?” or “Can I use this to analyze my survey, maintenance log, web data, etc.?”

“There is a lot of excitement when people see the platform. Upon first sight, they start thinking of all the new things they could do with the text data that they have always had lying around but could never take advantage of before,” Chris says.

John Sall, chief architect of JMP, also worked on the new text analytics features and enjoyed it. “It’s been fun to work on a new area, supporting one more form of data from which users can derive value,” John says.

To learn more about the new Text Explorer platform, watch the Analytically Speaking interview with Adsurgo co-founder, Heath Rushing. Heath was very influential in the development of Text Explorer, including naming the new platform.

Heath also prepared two great demos for a Technically Speaking webcast. You can also check out the talk he and co-presenter James Wisnowski are giving at Discovery Summit, “Mind the Gap: JMP on the Text Explorer Express.” Others will be presenting more about text exploration at Discovery Summit, including a tutorial by Chris Gotwalt, “The U-to-the-V: A Hitchhiker’s Guide to JMP 13 Text Explorer.”

Michael Anderson wrote:

Wow ... another awesome feature! Really looking forward to the JMP Discovery Conference in <2 weeks to hear & see even more!

Anne Milley wrote:

Thanks for your comment, Michael! Looking forward to seeing you and learning more at Discovery Summit!

MG Crissey wrote:


So delighted to see Text Analytics now bundled in with the newest JMP release!

Can you clarify whether it comes ONLY for those with the PRO license - or will those of us with the SMALL plain old JMP version get it when we renew? I am on JMP 11 now but it expires next month - and I am looking forward to trying the new features of JMP 13 when I get it!

Anne Milley wrote:

Hi, Mary, and thanks for your comment! You may have missed the previous post that talks about the basic text exploration available in JMP. That post highlights what's new in the Text Explorer platform in JMP 13, while this post talks about some basic text analytics available in JMP Pro 13. Hope that answers your question and thanks for using JMP!