jmarquardt

Dean Abbott on what makes a good data miner

“Some of the best data miners have a Freakonomics mindset,” says Dean Abbott, President of Abbott Analytics and thought leader in the areas of modeling, data mining and data visualization. These people are curious data sleuths who ask the interesting questions.

Abbott discussed this and much more with Anne Milley, Senior Director of Analytics Strategy at JMP, during an Analytically Speaking webcast.

I watched along with more than 800 viewers. Here are some of his key points:

  1. “The most important part of any data mining project is defining the problem clearly.” And it’s not easy “to hook up the analytics to the front end of the problems you really want to solve” because business objectives almost never match data mining techniques. The first step? Identify the target variables that encapsulate the problem.
  2. “I’m not trying to replace the bench!” The fear that an analyst can take the place of a researcher “on the bench” is unfounded. You have to know something about the data in order to interpret it – and that’s the value that domain experts bring to the table. As a data miner, you aren’t handing over solutions; rather, you’re breaking the information into smaller chunks that experts can use to make better decisions.
  3. “No single model tells the complete story.” That’s why ensembles (a combination of models) work. Each model provides a different perspective, giving you a broader sense of what’s going on with your data.
  4. “Decision trees are greedy.” And they can fool you because you “can’t go back to beginning.” Random forests incorporate more diversity and “force trees out of the greedy path they go down.”
  5. “Summaries don’t do the data justice.” They are helpful, but can be deceiving. So always visualize the data.
  6. “There’s nothing that beats clock time.” Managers ought to give their analysts time with the data. When you least expect it – say when you’re driving home or having a conversation at work – something will trigger your thought process for solving a problem.
  7. “There is no rule that says when you’ve exhausted [the data].” Abbott says that if his models are behaving consistently, then he knows he’s gotten a lot from the data. And sometimes you must judge your analysis based on how much time you have left to complete your project. There are diminishing returns, so ask, “How much value or more money can I bring to the company if I continue?”

This just scratches the surface of the conversation. The on-demand webcast will be available soon, so check the Analytically Speaking website to view it, as well as upcoming live webcasts with additional thought leaders.
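
To make the point about ensembles a bit more concrete, here is a minimal sketch in Python. It is not something Abbott showed in the webcast: the six data rows, the three one-feature threshold rules (standing in for trained models), and the names `make_stump` and `majority_vote` are all invented for illustration. Each rule looks at a different feature and gets a different pair of rows wrong, so a simple majority vote recovers every label.

```python
from collections import Counter

# Toy data, invented for this sketch: (features, label).
# Each of the three features is informative but misleading on two rows.
ROWS = [
    ((0.9, 0.2, 0.8), 1),
    ((0.7, 0.9, 0.1), 1),
    ((0.3, 0.8, 0.7), 1),
    ((0.6, 0.1, 0.2), 0),
    ((0.2, 0.7, 0.3), 0),
    ((0.1, 0.4, 0.9), 0),
]


def make_stump(feature_index, threshold=0.5):
    """Return a hypothetical one-feature model: predict 1 above the threshold."""
    def predict(features):
        return 1 if features[feature_index] > threshold else 0
    return predict


def majority_vote(models, features):
    """Combine the models by taking the most common prediction."""
    votes = Counter(model(features) for model in models)
    return votes.most_common(1)[0][0]


def accuracy(predict, rows):
    """Fraction of rows whose label the predictor gets right."""
    correct = sum(1 for features, label in rows if predict(features) == label)
    return correct / len(rows)


# One model per feature, so each sees the data from a different perspective.
models = [make_stump(i) for i in range(3)]
single_scores = [accuracy(model, ROWS) for model in models]
ensemble_score = accuracy(lambda f: majority_vote(models, f), ROWS)

print("single models:", [round(score, 2) for score in single_scores])
print("majority vote:", ensemble_score)
```

On this hand-built data each rule alone is right on four of six rows, while the vote is right on all six. Real models rarely make such neatly non-overlapping mistakes, so treat this as an illustration of why diverse perspectives help, not as a measure of how much they help.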

    1 Comment
    Community Member

    4 Interview Questions for Data Miners - Dice News wrote:

    […] sumo-wrestling competitions to crime rates. When he mentions that text, Abbott is referring to a data miner’s habit of asking questions and insatiable desire to understand a problem before they build a […]