Michele Boulanger, Professor, International Business Department, Rollins College, Orlando
Mia Stephens, JMP Academic Ambassador, SAS (Co-Author)
Our ability to capture transactional data in most fields has led us into the era of “big data.” What is big data? Does “big” necessarily mean dirty, messy, inconsistent, or unwieldy? How much toning and conditioning do we need to do in the big data world? What differentiates data preparation in the big data world from the traditional data cleaning phase? In this talk we discuss the challenges encountered in what is potentially the most time-consuming phase of big data analytics: data preparation. We present two case studies with very different goals, each requiring a different approach to shaping up the data for modeling. Along the way, we highlight JMP techniques and platforms, including query, recode, standardization, transformation, imputation, and text mining, that support a traceable and reproducible methodology for preparing big data for the modeling phase. All demonstrations will be done live with JMP Pro 13.