Recently, Georges Grinstein, head of the Bioinformatics Program and Co-Director of the Institute for Visualization and Perception Research at the University of Massachusetts Lowell, was in our studios hosting a webcast and promoting his upcoming seminar Exploring Data Visualization in Life Sciences Research. I had a chance to sit down with Dr. Grinstein and ask him about the changing nature of life sciences data, analysis and the new technologies that allow us to visualize new insights into what our data is telling us.
Why do you believe that it’s important to visualize life sciences data and results of analysis?
Without visualization, we depend on numbers, which most often represent summaries or aggregations, statistics, or other computed results. These can hint at structure in the data but are often too precise to show related structures. I often think of these numbers as representing not just the data but aliases of the data as well.
When should you visualize life sciences data?
It depends on the task, but my gut feeling says almost always.
Although visualization of this kind of data can bring new insights, are there any dangers to relying on visuals?
Yes, there's danger if one jumps to conclusions without validating the "insights," either analytically or visually. I think of these insights as hypotheses, and so they must be validated. But visualization really does bring new insights (and there are many different visualizations).
Data sets are extremely large. Do you need to do anything with your data before you begin creating visual representations of it?
This depends on the task as well. Some algorithms and some visualizations cannot deal with large data, or they take far too long to generate results. In such cases, data reduction has to take place through techniques such as subsetting, dimensionality reduction and sampling, to name a few. And if one wants to interact with a large data set, this can be tricky.
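As a rough illustration of the reduction techniques mentioned above, here is a minimal Python sketch using only the standard library. It is not Dr. Grinstein's method: the data is synthetic, the function names (subset_rows, sample_rows, top_variance_columns) are hypothetical, and keeping the highest-variance columns is just one crude stand-in for dimensionality reduction (methods such as PCA are more common in practice).

```python
import random
import statistics


def subset_rows(rows, predicate):
    """Subsetting: keep only the rows that satisfy a condition."""
    return [row for row in rows if predicate(row)]


def sample_rows(rows, k, seed=0):
    """Sampling: draw up to k rows uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


def top_variance_columns(rows, n_keep):
    """Crude dimensionality reduction: keep the n_keep highest-variance columns."""
    n_cols = len(rows[0])
    variances = [
        statistics.pvariance([row[j] for row in rows]) for j in range(n_cols)
    ]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:n_keep])
    return keep, [[row[j] for j in keep] for row in rows]


# Synthetic stand-in data (an assumption for illustration only):
# 1,000 samples x 6 measurements, where column 0 is constant on purpose.
rng = random.Random(42)
data = [
    [1.0] + [rng.gauss(0, scale) for scale in (0.1, 1, 5, 0.5, 10)]
    for _ in range(1000)
]

subset = subset_rows(data, lambda row: row[3] > 0)  # keep rows of interest
sample = sample_rows(subset, 100)                   # thin them out
kept, reduced = top_variance_columns(sample, 2)     # drop low-information columns
print(len(subset), len(sample), kept)
```

The order of the steps is a design choice, not a rule: here the data is subset first so that the sample is drawn only from rows of interest, and the column selection runs last on the smaller sample to keep it cheap.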
What’s next in visualization formats and trends?
Here are four trends that you should be aware of:
1. Much tighter coupling between analysis and visualization.
2. More interaction with visualizations of larger data.
3. Parallelization of many algorithms and visualizations.
4. Precomputation of large data sets to save time in the later discovery steps.
Dr. Grinstein will be presenting at the Exploring Data Visualization in Life Sciences Research seminars on April 24 at the Broad Institute in Cambridge, MA, and on April 25 at the Chauncey Conference Center in Princeton, NJ. Visit our website for more information on the Cambridge or Princeton seminars, including how to sign up.