JMP Genomics vs. Open Source Software for Teaching
Apr 18, 2008 9:30 AM
I had an interesting conversation with a longtime JMP Genomics user the other day about open source vs. commercial software after (yet another) query from a prospect about what makes JMP Genomics different from free genomics software tools. This may be a controversial topic, and there are plenty who won’t agree with what I write. However, this user’s opinion on the subject resonated with my own and with those of many here at JMP and SAS who work hard to make our commercial software products worth the money that customers pay for them. This user is a longtime SAS programmer who is under considerable pressure from many sides to switch to open source R for her data analysis. Despite a limited budget for software purchases, she spends much of it on SAS software.
This particular user loves the straightforward visualizations of large data sets that JMP Genomics offers as standard outputs. I liked how she called this part of array analysis the Sesame Street analysis – as in, "Which of these things is not like the other?" The easy-to-use, interconnected 2-D and 3-D PCA plots, scatterplots, boxplots and kernel density estimate plots make it simple for biologists to select potential outliers directly in one plot and see where that group of data lies in the other plots. This allows direct visualization of data and informs decisions about data quality prior to proceeding to comprehensive downstream data analysis. This particular strength of JMP Genomics comes from the amazing visualization capabilities in JMP, in particular the linking among different types of visualizations to a common data table, and greatly complements the raw processing power of SAS running behind the scenes.
In downstream analysis, this joint visualization and selection allows users to directly select subsets in the cluster heat map, and then visualize those points in the volcano plots, or look at particular genes in a separate data table. Coloring among cluster diagrams and volcano plots is also consistent, allowing users to make visual inferences about mean relationships when looking at sets of significant differences. JMP and SAS already make a great combination, and the integration is only going to get better, judging from what I have already seen of the impressive new feature sets of prerelease JMP 8 and just-released SAS 9.2.
On using R for her own research data analysis, this user commented that one of her main reservations is that R packages can be written by any programmer, leaving doubts in her mind about the assumptions of the programmer writing the code, the numeric accuracy of the calculations, and how well the code has been tested. In some cases, the code may be very good and well tested, but in other cases it may be unstable and contain errors. This user’s opinion was that if she switched to R, she would feel obligated to check and test the code herself to make sure that it was in fact giving the accurate results she was depending on. She commented that she loves the fact that PROC MIXED is running beneath the JMP Genomics ANOVA and Mixed Model processes because she can be sure about the accuracy of the numeric calculations without needing to check and recheck the results.
Although this user is a dedicated SAS programmer, she uses JMP Genomics as a teaching tool. She feels that teaching biologists who are not familiar with programming to write code for analysis of genomic data with command-line tools consumes too much valuable time in a semester-long course. For example, the exercise of filtering a data table for missing values took students 25 minutes to work through in R, but required just a couple of minutes using the Select Where function in JMP Genomics. This is only one of many examples of how much easier it is to guide such students through data manipulation with JMP Genomics. However, if students are interested in seeing the open SAS macro code that lies behind a given JMP Genomics process, she is more than happy to walk them through the code and provide an additional learning experience.
I have more notes from this particular conversation. Anyone who knows me knows that I fill notebooks on a regular basis – and have been awaiting the release of a new text capture pen, which captures handwriting on regular paper, for over a year. I am glad to report that it’s finally available and I’ve ordered one! It’s nice to know that this should speed up the process of capturing my copious handwritten notes on topics such as these into blog-ready text.