JMP Genomics vs. Open Source Software for Teaching

I had an interesting conversation with a longtime JMP Genomics user the other day about open source vs. commercial software after (yet another) query from a prospect about what makes JMP Genomics different from free genomics software tools. This may be a controversial topic, and there are plenty who won’t agree with what I write. However, this user’s opinion on the subject resonated with my own opinions and those of many here at JMP and SAS who work hard to make our commercial software products worth the money that customers pay for them. This user is a longtime SAS programmer who is under considerable pressure from many sides to convert over to using open source R to do her data analysis. Despite a limited budget for software purchase, she spends much of it on SAS software.

This particular user loves the straightforward visualizations of large data sets that JMP Genomics offers as standard outputs. I liked how she called this part of array analysis the Sesame Street analysis – as in, "Which of these things is not like the other?" The easy-to-use, interconnected 2-D and 3-D PCA plots, scatterplots, boxplots and kernel density estimate plots make it simple for biologists to select potential outliers directly in one plot and see where that group of data lies in the other plots. This allows direct visualization of data and informs decisions about data quality prior to proceeding to comprehensive downstream data analysis. This particular strength of JMP Genomics comes from the amazing visualization capabilities in JMP, in particular the linking among different types of visualizations to a common data table, and greatly complements the raw processing power of SAS running behind the scenes.

In downstream analysis, this joint visualization and selection allows users to directly select subsets in the cluster heat map, and then visualize those points in the volcano plots, or look at particular genes in a separate data table. Coloring among cluster diagrams and volcano plots is also consistent, allowing users to make visual inferences about mean relationships when looking at sets of significant differences. JMP and SAS already make a great combination, and the integration is only going to get better, judging from what I have already seen of the impressive new feature sets of prerelease JMP 8 and just-released SAS 9.2.

On using R for her own research data analysis, this user commented that one of her main reservations is that R modules can be written by any programmer, leaving doubts in her mind about the assumptions of the programmer writing the code, the numeric accuracy of the calculations, and how well the code has been tested. In some cases, the code may be very good and well tested, but in other cases it may be unstable and contain errors. This user’s opinion was that if she switched over to using R, she would feel obligated to check and test the code herself to make sure that it was in fact giving the accurate results she was depending on. She commented that she loves the fact that PROC MIXED is running beneath the JMP Genomics ANOVA and Mixed Model processes because she can be sure about the accuracy of the numeric calculations without needing to check and recheck the results.

Although this user is a dedicated SAS programmer, she uses JMP Genomics as a teaching tool. She feels that teaching biologists who are not familiar with programming to write code for analysis of genomic data with command-line tools consumes too much valuable time in a semester-long course. For example, the exercise of filtering a data table for missing values took students 25 minutes to navigate through using R, but required just a couple of minutes using the Select Where function in JMP Genomics. This is only one of many examples of the relative ease of directing such students to manipulate data with JMP Genomics. However, if students are interested in seeing the open SAS macro code that lies behind a given JMP Genomics process, she is more than happy to walk students through the code and provide an additional learning experience.
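
For readers who have not done this step in code, here is a minimal sketch of what filtering a data table for missing values involves. It is written in plain Python purely for illustration, not in R or JSL, and it is not the exercise from the course. The table contents, the column names, and the helpers `is_missing` and `drop_rows_with_missing` are all hypothetical.

```python
import math

# Hypothetical expression table: one dict per row.
# None or NaN stands in for a missing measurement.
rows = [
    {"gene": "geneA", "sample1": 7.2, "sample2": 6.9},
    {"gene": "geneB", "sample1": None, "sample2": 5.1},
    {"gene": "geneC", "sample1": 8.4, "sample2": float("nan")},
    {"gene": "geneD", "sample1": 3.3, "sample2": 3.8},
]


def is_missing(value):
    """Treat None and floating-point NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_rows_with_missing(table):
    """Return only the rows in which no column is missing."""
    return [row for row in table
            if not any(is_missing(v) for v in row.values())]


complete = drop_rows_with_missing(rows)
print([row["gene"] for row in complete])  # prints ['geneA', 'geneD']
```

Even in this small form, a student has to decide what counts as "missing" and how to test for it before any filtering happens, which is part of why the scripted route can take longer in a classroom than a point-and-click selection.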

I have more notes from this particular conversation. Anyone who knows me knows that I fill notebooks on a regular basis – and have been awaiting the release of a new text capture pen, which captures handwriting on regular paper, for over a year. I am glad to report that it’s finally available and I’ve ordered one! It’s nice to know that this should speed up the process of capturing my copious handwritten notes on topics such as these into blog-ready text.


Jan Sutherland wrote:

Hi Jeffrey, did you ever get a response to these observations? Care to pass it along? I have worked with both SAS and SPSS, though that was before JMP, and I am really interested in open source, so I am doing a lot of reading, paying particular attention to the objection that open source software is less accurate.



Jeffrey Lehman wrote:

I am in a slightly different situation. I am a SAS programmer who joined a group within a Federal Agency about a year ago. The Agency has traditionally used SPSS, because most of its statistical applications are in the area of human factors. The group I joined converted to JMP a few years ago when a co-op student became a Fed and then a manager. JMP is deified here, whereas I am a diehard SAS proponent.

I have worked on parts of projects here that require importing multiple data files or tables, data manipulation, use of statistical models, and generation of graphics and reports. I found the generation and management of multiple tables, the dialog boxes, and the column-structured format less flexible in JMP than in SAS's free-format editor. Further, with the SAS Macro Language, I was able to write macro loops to import and either stack or merge 20 files (either flat files with the DATA step or Oracle tables using PROC SQL, in both cases within a macro loop). In addition, I find the programmer's control over outputting graphics, results, tables, and reports much better than attempting to cut and paste from overly formatted JMP output.

I conveyed my experiences to the manager, who feels that JMP can do anything SAS can do. I have reviewed the JSL Guide and, although JSL is obviously extensible, it seems like a glorified and complicated macro recorder language. To give it a fair comparison, I have looked for users' postings comparing the two tools, with little or no success.

I guess my question would be how to convince a skeptic that SAS has marginal benefits above and beyond JMP plus JSL. I did make reference to a page of the JMP JSL Manual, which provides a disclaimer about what JMP should be used for, even with a deferential comment toward SAS.

Any thoughts?