Heath Rushing, Co-Founder and Principal, Adsurgo
James Wisnowski, Co-Founder and Principal, Adsurgo
Most organisations analyse structured numerical data from areas such as product trials, R&D, process development, process monitoring, sales and marketing, and commercial manufacturing. Structured numerical data is, well, numerous and used throughout most organisations. However, the majority of stored data is not numerical; it is unstructured text in reports and documents, such as survey results and nonconformance reports. While these companies spend resources collecting this unstructured text data, they are not doing anything with it. Although the concepts presented are applicable to any industry, the speaker will demonstrate the analysis of unstructured text data in regulatory compliance documents. Using a JMP script that calls R, the speaker will demonstrate end-to-end examples: assembling disparate text sources (both a folder of text files and documents taken directly from the FDA website) into a structured data set, constructing a document-term matrix, and then applying data mining methods such as cluster analysis and decision trees to discover previously unknown relationships. While relevant theory will be discussed, the talk will focus on giving the audience an appreciation of how useful and actionable compliance and business insights can be discovered from regulatory compliance documents using JMP.
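
To make the document-term matrix step concrete, here is a minimal sketch in plain Python rather than the JMP and R scripts used in the talk, which are not reproduced here. The three sample sentences, the stop-word list, and the function names `tokenize` and `document_term_matrix` are all hypothetical and chosen only for illustration. Each row of the matrix is one document and each column counts how often one term occurs in it.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents, stop_words=frozenset()):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [
        Counter(tok for tok in tokenize(doc) if tok not in stop_words)
        for doc in documents
    ]
    # The vocabulary is every distinct term seen, in alphabetical order.
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical observations, invented for this example only.
documents = [
    "Failure to validate the cleaning process.",
    "Cleaning records were incomplete.",
    "Failure to investigate complaint records.",
]
stop_words = frozenset({"the", "to", "were"})

vocabulary, matrix = document_term_matrix(documents, stop_words)
print(vocabulary)
for row in matrix:
    print(row)
```

Once text is in this form, each row is an ordinary numerical observation, so methods such as cluster analysis or decision trees can be applied to it as they would be to any structured data set.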