When should programming come into play in statistics courses?
May 20, 2016 10:21 AM | Last Modified: Jul 12, 2018 11:18 AM
Both academically and professionally, more courses are being offered and developed to make more people comfortable with data, analysis and risk assessment. This necessitates some use of statistics, and software is pretty much a tool of the trade. Software — some new, some enhanced, some commercial and some open source — is increasingly available to broader audiences and is ever-changing.
For the quantitative courses I took in college, I had to learn some coding languages to use SAS, SPSS and SHAZAM. I was not a fan of learning JCL and other programming languages initially and found learning the syntax of the languages an impediment to understanding statistical concepts.
On the positive side, even my limited coding skills later proved useful for my career, but for many of my classmates, exposure to coding dampened their enthusiasm for both programming and statistics. Once I was exposed to the highly visual and interactive experience that JMP provides in data exploration and analysis, I wondered whether I would have understood statistical concepts more quickly and whether fellow classmates would have had greater enthusiasm for statistics had we used JMP.
More intro stats courses are being offered as MOOCs. Many universities are evolving their curricula to include business analytics and other courses that appeal more broadly and engage more people in statistical thinking. Professionally, more basic data analysis courses are being offered as well. In light of all this, it’s interesting to see which software is used: spreadsheets, interactive visual software like JMP, some SAS interfaces, interfaces to R, Minitab, etc., as well as language-based approaches like R, SAS, Python and others.
What factors affect which software is used in courses?
Having written a blog post about teaching statistics with JMP and continuing to engage with academics on how they teach statistical concepts, I’m curious about the motivating factors in choosing software for use by students with such varied levels of numeracy. Often, cost is the driving factor. Open source software is freely available. Excel is so ubiquitous that it is essentially perceived as free (but many recognize the limitations of spreadsheets).
Another motivating factor of some intro-level courses may be to leave the students with more marketable skills, and knowing a popular programming language is certainly such a skill (in addition to knowing about data analysis, of course).
Yet another consideration could be inertia: the software that is already installed, that has always been used and that the instructor already knows.
Teaching how to think statistically
But beyond these factors, many instructors truly want more students to see and feel the power of data, to experience what it is to “think statistically.” They recognize that many people will appreciate and benefit from understanding statistical concepts, but may never go on to learn any programming languages. Such people may be capable of statistical thinking without knowing how to program. Obvious examples would be doctors and judges, whose recommendations and decisions can powerfully affect people's lives.
I recently finished reading Risk Savvy: How to Make Good Decisions by Gerd Gigerenzer. For many important decisions regarding our health, finances and more, he shares well-founded research on how we can better assess risk to make better decisions. For example, he has done a lot of work with doctors to better communicate probabilities to their patients (in short, he advises translating probabilities into natural frequencies). For more along these lines, see David Spiegelhalter, who has done a great deal to educate the masses about understanding uncertainty and the many things to consider in presenting risk to decision-makers. He has written a great blog post with interactive graphics on 2845 ways to spin the Risk.
Understanding risk is part of thinking statistically, an important skill in this data-rich era. To attract the broadest audience and give more people a foundational understanding of important statistical concepts, there is considerable evidence that interactive data visualization plays an important role. Through dynamic and interactive graphs, learning becomes play.
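As a rough illustration of what translating probabilities into natural frequencies can look like, here is a minimal Python sketch. The function name and all of the input numbers are hypothetical, chosen only to give round arithmetic; they are not taken from the book or from any real screening test.

```python
def natural_frequencies(prevalence, sensitivity, false_positive_rate, population=1000):
    """Restate a screening test's probabilities as whole-number counts of people."""
    sick = round(population * prevalence)
    healthy = population - sick
    true_positives = round(sick * sensitivity)
    false_positives = round(healthy * false_positive_rate)
    return {
        "population": population,
        "sick": sick,
        "true_positives": true_positives,
        "false_positives": false_positives,
    }


# Hypothetical inputs for illustration only (not real clinical figures).
counts = natural_frequencies(prevalence=0.01, sensitivity=0.90, false_positive_rate=0.09)
positives = counts["true_positives"] + counts["false_positives"]

print(f"Out of {counts['population']} people, {counts['sick']} have the condition.")
print(f"{counts['true_positives']} of those {counts['sick']} test positive.")
print(f"{counts['false_positives']} of the remaining people also test positive.")
print(f"So of {positives} positive tests, only {counts['true_positives']} are true positives.")
```

With these made-up inputs, 98 of 1,000 people test positive and only 9 of them have the condition. Many readers find that statement easier to grasp than "the probability of having the condition given a positive test is about 9%."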
Observations from statistics professors
Many professors and instructors offer compelling reasons for taking a visual path (and choosing JMP) as a means to introduce more people to statistical thinking. For example, here are a few excerpts from an interview last year with Christian Hildebrand, Assistant Professor of Marketing Analytics at the Geneva School of Economics and Management:
“[Students] said ‘Wow, I never knew that statistics could even be fun!’ That’s when I realized that the statistical software is not just a medium, it is an environment that can actually help in understanding statistical concepts better. JMP was a big amplifier for that.”
"With the software focusing so heavily on visualization, it’s much easier for you to really understand what is the issue in the data. It's critical for students to understand their data better by interacting with the data in a software environment like JMP."
"What students really loved about the software was that they had a very intuitive way of learning. This intuition is very important because statistics is very much cognitive, and you have to learn the basics. At the same time, it is very important to still be creative and to think about new hypotheses, and very often you learn that out of the data. The capabilities you have with JMP — with the rich visualization capabilities — those are key to understand statistical concepts better.”
Peter Goos, Full Professor at the University of Antwerp in the Department of Environment, Technology and Management, and David Meintrup, Professor of Mathematics and Statistics at the Ingolstadt University of Applied Sciences, co-authored Statistics with JMP: Graphs, Descriptive Statistics and Probability. In their preface, they say:
"We chose JMP as supporting software because it is powerful yet easy to use…. We believe that introductory courses in statistics and probability should use such software so that the enthusiasm of students is not nipped in the bud. Indeed, we find that, because of the way students can easily interact with JMP, it can actually spark enthusiasm for statistics and probability in class."
David Meintrup also recently shared this story: "I always end the first session on JMP with Graph Builder. The first time my students see how to interactively create a map of the unemployment rate in Europe over the years 2000-2015, they are blown away. I can see how their facial expression changes, and from that point on I don't need to worry about motivation anymore."
Iddo Gal, Senior Lecturer and past Chair, Department of Human Services at the University of Haifa, and past President of the International Association for Statistical Education, shared this observation:
"In 2015, I attended the JMP workshop (three hours) in our IASE Satellite in Rio, and remember being particularly impressed with these tools, which far exceed options in other packages, and for me can help our participants see what is unique about it and also does not require strong formal/procedural skills. I also recall how the local (Brazilian) statisticians were taken by surprise — they said they work so hard to impart the technical [formulaic, statistical] underpinnings of multivariate stuff and running traditional analyses, and their students struggle with traditional outputs — yet within 15 minutes into the visualization portion of the JMP workshop, all of a sudden, they realized how their students can view things so much easier and understand and see what is coming out.”
In an interview earlier this year, Jason Brinkley, biostatistician and senior research methodologist at American Institutes for Research, discussed some of his experiences teaching with JMP, drawing on his 2014 Discovery Summit paper, Using JMP as a Catalyst for Teaching Data-Driven Decision Making to High School Students. Though the course targeted high school students who were gifted in math and science, Jason explained that this hands-on approach was well received, especially by the students who had not yet taken Advanced Placement Statistics. They could see and feel the power of data, and this piqued their interest. Jason said, “You could see the passion start to come up from the students, not necessarily about the research but about the data.”
What about you?
For those of you in the noble profession of teaching, how do you teach statistical concepts to a broad audience? Is some level of programming involved from the beginning, do you take a more visual approach, or do you give the students options to choose the tools they use?
For those of you who were/are students, how were you introduced to statistics? Did you have to learn a programming language first or did you learn via an interactive tool like JMP? If the former, do you think you would’ve understood the concepts more quickly if you’d had a more visual introduction? If the latter, did you later invest in learning a language (perhaps JSL?) anyway because it helped you do more with your data?
Thanks for your interest and I look forward to hearing from you!