Scott Nicholson, formerly of LinkedIn and now with Accretive Health, gave an interesting keynote at the first Predictive Analytics World (PAW) in Boston. It inspired some later discussions on the term "data science." Some of the reactions to the term seem to have a generational element. Judging from conversations I had with others before and during this conference, if you identify yourself as a statistician, a data miner or something else, it may be hard to warm up to the newer term "data scientist." We have for some time been seeing a decline in the use of "data mining," as evidenced in Google Trends, which Gregory Piatetsky-Shapiro also noted in his talk. All of these terms either come with or acquire some baggage and may or may not stick over time. Fashion applies in the world of data analysis, too.
Having recently attended the JMP Discovery Summit, I couldn't help but think of the well-thought-out use of the term "statistical engineering" that Roger Hoerl and Ron Snee advocate (I LOVE their second edition of Statistical Thinking!). Their use of the term is analogous to what Scott Nicholson was conveying: the analyst taking a more strategic, end-to-end view. SAS presented a keynote at PAW 2009 in which we also advocated this whole-process view, which is greatly facilitated by placing analysts higher in the organizational hierarchy and by establishing an analytic center of excellence.
Michael Berry's keynote on day one of PAW about his work at TripAdvisor was entertaining and succinct, driving quite a bit of traffic to the JMP booth. His case study on day two provided solid reasoning on why people should make greater use of matched pairs to answer questions like the one he tackled: "Is the new product stealing money from the old one?" Getting at the truth is not as straightforward as one may initially think. He laid out the challenges and made the case for matched pairs:
Clinical studies love twins.
Control for many factors all at once.
When everything else is the same, the observed difference is attributed to the treatment.
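To make the matched-pairs idea concrete, here is a minimal Python sketch. It is my own illustration and not Michael's actual analysis: the function name `paired_difference`, the variable names and the spending figures are all made up for the example. It assumes each customer who adopted the new product has already been matched to one similar customer who did not, and it simply compares spending on the old product within each pair.

```python
import math
import statistics


def paired_difference(treated, control):
    """Return (mean difference, standard error) across matched pairs.

    treated[i] and control[i] must belong to the same matched pair.
    """
    if len(treated) != len(control):
        raise ValueError("each treated unit needs exactly one matched control")
    if len(treated) < 2:
        raise ValueError("need at least two pairs to estimate variability")
    # Work with within-pair differences, so everything the pair shares cancels out.
    diffs = [t - c for t, c in zip(treated, control)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, std_err


# Hypothetical spend on the OLD product (illustrative numbers only).
adopters = [40.0, 55.0, 30.0, 62.0, 48.0]       # customers who took up the new product
matched_peers = [45.0, 60.0, 33.0, 70.0, 50.0]  # their matched non-adopters

mean_diff, std_err = paired_difference(adopters, matched_peers)
t_stat = mean_diff / std_err  # paired t statistic, len(adopters) - 1 degrees of freedom
print(f"mean difference: {mean_diff:.2f}, standard error: {std_err:.2f}, t: {t_stat:.2f}")
```

A negative mean difference here would suggest that adopters spend less on the old product than their matched peers do. The comparison is only as good as the matching, of course, which is where the real work lies.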
Michael's talk drew many questions as it closed out the conference. The talk before his was also of note: Jane Zheng of Focus Optimal, formerly at Fidelity, presented on the subject of uplift modeling or, as she and Victor Lo of Fidelity refer to it, True lift modeling.
Apart from the talks, there were several interesting conversations with attendees — many of them already JMP customers who were happy to learn things they didn't know JMP could do and many others who were pleased to see JMP's interactive graphical data discovery capabilities.
Looking forward to PAW Dusseldorf next month, or perhaps I should say, ich freue mich sehr darauf!