anne_milley (Staff; joined May 28, 2014)

When should programming come into play in statistics courses?

Does exposure to coding in statistics courses dampen students' enthusiasm — for both programming and statistics?

Both academically and professionally, more courses are being offered and developed to make more people comfortable with data, analysis and risk assessment. This necessitates some use of statistics, and software is pretty much a tool of the trade. Software — some new, some enhanced, some commercial and some open source — is increasingly available to broader audiences and is ever-changing.

For the quantitative courses I took in college, I had to learn some coding languages to use SAS, SPSS and SHAZAM. I was not a fan of learning JCL and other programming languages initially and found learning the syntax of the languages an impediment to understanding statistical concepts.

On the positive side, even my limited coding skills later proved useful in my career, but for many of my classmates, exposure to coding dampened their enthusiasm for both programming and statistics. Once I was exposed to the highly visual and interactive experience that JMP provides in data exploration and analysis, I wondered whether I would have understood statistical concepts more quickly and whether fellow classmates would have had greater enthusiasm for statistics had we used JMP.

More intro stats courses are being offered as MOOCs. Many universities are evolving their curricula to include business analytics and other courses that appeal more broadly and engage more people in statistical thinking. Professionally, more basic data analysis courses are being offered as well. In light of all this, it’s interesting to see which software is used: spreadsheets, interactive visual software like JMP, some SAS interfaces, interfaces to R, Minitab, etc., as well as language-based approaches like R, SAS, Python and others.

What factors affect which software is used in courses?

I wonder if I would have understood statistical concepts more quickly if I had had access to JMP in college.

Having written a blog post about teaching statistics with JMP and continuing to engage with academics on how they teach statistical concepts, I’m curious about the motivating factors in choosing software for use by students with such varied levels of numeracy. Often, cost is the driving factor. Open source software is freely available. Excel is so ubiquitous that it is essentially perceived as free (but many recognize the limitations of spreadsheets).

Another motivating factor of some intro-level courses may be to leave the students with more marketable skills, and knowing a popular programming language is certainly such a skill (in addition to knowing about data analysis, of course).

Yet another consideration could be inertia: the software is already installed, it is what has always been used and it is what the instructor already knows.

Teaching how to think statistically

But beyond these factors, many instructors truly want to engage more students to see and feel the power of data, to experience what it is to “think statistically.” They recognize that many people will appreciate and benefit from understanding statistical concepts, but may never go on to learn any programming languages. They may be capable of statistical thinking without knowing how to program. Obvious examples would be doctors and judges, whose recommendations and decisions can powerfully affect people's lives.

I recently finished reading Risk Savvy: How to Make Good Decisions by Gerd Gigerenzer. For many important decisions regarding our health, finances and more, he shares well-founded research on how we can better assess risk to make better decisions. For example, he has done a lot of work with doctors to better communicate probabilities to their patients (in short, he advises translating probabilities into natural frequencies). For more along these lines, David Spiegelhalter, who has done a great deal to educate the masses about understanding uncertainty and the many things to consider in presenting risk to decision-makers, has written a great blog post with interactive graphics on 2845 ways to spin the Risk.
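
As a rough illustration of what "translating probabilities into natural frequencies" can look like, here is a minimal Python sketch. The screening numbers (1% prevalence, 90% sensitivity, 9% false-positive rate) are made-up assumptions for illustration, not figures from the book, and the function names are hypothetical.

```python
def natural_frequencies(prevalence, sensitivity, false_positive_rate, population=1000):
    """Restate test probabilities as head counts in a reference population.

    All inputs are illustrative assumptions; counts are rounded to whole people.
    """
    sick = round(population * prevalence)
    healthy = population - sick
    true_positives = round(sick * sensitivity)
    false_positives = round(healthy * false_positive_rate)
    return {
        "population": population,
        "sick": sick,
        "healthy": healthy,
        "true_positives": true_positives,
        "false_positives": false_positives,
    }


def describe(freq):
    """Turn the counts into the kind of sentence a doctor might say to a patient."""
    positives = freq["true_positives"] + freq["false_positives"]
    return (
        f"Of {freq['population']} people, {freq['sick']} have the condition "
        f"and {freq['true_positives']} of them test positive. "
        f"Of the {freq['healthy']} who do not have it, "
        f"{freq['false_positives']} also test positive. "
        f"So {freq['true_positives']} of the {positives} people with a positive "
        f"test actually have the condition."
    )


if __name__ == "__main__":
    # Hypothetical screening test: 1% prevalence, 90% sensitivity, 9% false positives.
    print(describe(natural_frequencies(0.01, 0.90, 0.09)))
```

Stating the result as counts of people, rather than as a conditional probability, is the point of the exercise: most readers find "9 of 98" easier to reason about than the equivalent percentage derived from Bayes' rule.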

Understanding risk is part of thinking statistically, an important skill in this data-rich era. To attract the broadest audience and give more people a foundational understanding of important statistical concepts, there is considerable evidence that interactive data visualization plays an important role. Through dynamic and interactive graphs, learning becomes play.

Observations from statistics professors

Many professors/instructors offer compelling reasons for taking a visual path (and choosing JMP) as a means to introduce more people to statistical thinking. For example, here are a few excerpts from an interview last year with Christian Hildebrand, Assistant Professor of Marketing Analytics at the Geneva School of Economics and Management:

  • “[Students] said ‘Wow, I never knew that statistics could even be fun!’ That’s when I realized that the statistical software is not just a medium, it is an environment that can actually help in understanding statistical concepts better. JMP was a big amplifier for that.”
  • "With the software focusing so heavily on visualization, it’s much easier for you to really understand what is the issue in the data. It's critical for students to understand their data better by interacting with the data in a software environment like JMP."
  • "What students really loved about the software was that they had a very intuitive way of learning. This intuition is very important because statistics is very much cognitive, and you have to learn the basics. At the same time, it is very important to still be creative and to think about new hypotheses, and very often you learn that out of the data. The capabilities you have with JMP — with the rich visualization capabilities — those are key to understand statistical concepts better.”
    Peter Goos, Full Professor at the University of Antwerp in the Department of Environment, Technology and Management, and David Meintrup, Professor of Mathematics and Statistics at the Ingolstadt University of Applied Sciences, co-authored Statistics with JMP: Graphs, Descriptive Statistics and Probability. In their preface, they say:

    "We chose JMP as supporting software because it is powerful yet easy to use…. We believe that introductory courses in statistics and probability should use such software so that the enthusiasm of students is not nipped in the bud. Indeed, we find that, because of the way students can easily interact with JMP, it can actually spark enthusiasm for statistics and probability in class."

    David Meintrup also recently shared this story: "I always end the first session on JMP with Graph Builder. The first time my students see how to interactively create a map of the unemployment rate in Europe over the years 2000-2015, they are blown away. I can see how their facial expression changes, and from that point on I don't need to worry about motivation anymore."

    Iddo Gal, Senior Lecturer and past Chair, Department of Human Services at the University of Haifa, and past President of the International Association for Statistical Education:

    "In 2015, I attended the JMP workshop (three hours) in our IASE Satellite in Rio, and remember being particularly impressed with these tools, which far exceed options in other packages, and for me can help our participants see what is unique about it and also does not require strong formal/procedural skills. I also recall how the local (Brazilian) statisticians were taken by surprise — they said they work so hard to impart the technical [formulaic, statistical] underpinnings of multivariate stuff and running traditional analyses, and their students struggle with traditional outputs — yet within 15 minutes into the visualization portion of the JMP workshop, all of a sudden, they realized how their students can view things so much easier and understand and see what is coming out.”

    In an interview earlier this year, Jason Brinkley, biostatistician and senior research methodologist at American Institutes for Research, discussed some of his experiences teaching with JMP, drawing on his 2014 Discovery Summit paper, Using JMP as a Catalyst for Teaching Data-Driven Decision Making to High School Students. Though the course targeted high school students who were gifted in math and science, Jason explained that this hands-on approach was well received, especially by the students who had not yet taken Advanced Placement Statistics. They could see and feel the power of data, and this piqued their interest. Jason said, “You could see the passion start to come up from the students, not necessarily about the research but about the data.”

    What about you?

    For those of you in the noble profession of teaching, how do you teach statistical concepts to a broad audience? Is some level of programming involved from the beginning, do you take a more visual approach, or do you give the students options to choose the tools they use?

    For those of you who were/are students, how were you introduced to statistics? Did you have to learn a programming language first or did you learn via an interactive tool like JMP? If the former, do you think you would’ve understood the concepts more quickly if you’d had a more visual introduction? If the latter, did you later invest in learning a language (perhaps JSL?) anyway because it helped you do more with your data?

    Thanks for your interest and I look forward to hearing from you!

    14 Comments
    Community Member

    rob reul wrote:

    I learned programming first - developing linear programming methods with linear/matrix algebra... My stats education followed - and it remained a deep mystery until my early career made use of SPC - that's where it all started to click - that's where it became visual - that made all the difference. Had I had the right visuals it would have made much more sense much sooner.

    Anne Milley wrote:

    Thanks for sharing that, Rob! Visual makes such a difference for some of us.

    Community Member

    Joshua Jendza wrote:

    My university was (and probably still is) standardized on SAS for all statistics courses. I struggled with the programming requirements through the first course, and had to drop and retake my second stats course at least in part due to the programming requirements. In between attempts 1 and 2 at my second course I discovered JMP and it turned me from someone taking the courses because I had to, to someone who really likes the data analysis steps. I would never pass for a professional statistician, but I truly believe JMP has made me a better researcher.

    Anne Milley wrote:

    Thank you so much for sharing that, Joshua! We're glad you discovered JMP!

    Community Member

    Michael Clayton wrote:

    At age 80 you can be sure that programming was a nightmare for us in college in the 50's and even 60's. And engineering calculations were done with a slide rule, with a focus on getting the units right and estimating the answer, then getting out the LOG and TRIG TABLES to get the precision needed for some tests, especially science tests. But at that time, the early statistical methods, especially any kind of hypothesis testing, were based on small sample sizes. Thus the infamous p-values and eventually the 18 or more unique hypothesis tests demanded in stats classes based on data types, dependence, ad nauseam. So YES...when spreadsheets appeared (and before that the affordable calculators that came about as the IC was developed), and finally JMP appeared, those of us that hated statistics suddenly loved statistical visualization, but never really liked complex hypothesis testing as we worked in industries that were now "data rich."

    Those born a few years later came into college when the DEC VAX and HP programmable calculators made learning to program a pleasure, so those of us from the old school joined teams that included those young computer-literate folks (not the ones in MIS...or IT as it was later called, but the CIM or "pocket programmer" types that understood the engineering context of the problems). JMP has changed everything, as John Sall envisioned: the elimination of command line programming for the new GUI world captured the engineers of the world who were too busy to write code just to study a problem, and were "enlightened" most of the time by the Variability Plot or Distribution Plot or SPC plots and went off to fix the problems without even running a statistical test until prodded to do so, and of course that was made easier, too. THANKS. JMP extended my useful work life by at least 20 years and made "big data" fun even if we didn't call it that.

    Community Member

    Michael Clayton wrote:

    BUT to answer your "programming in class" question:

    Many of my clients have found the open source R language, fed with data structured by Python, to be a major enabling toolset when teaching the more complex statistical courses and engineering simulation courses involving background skills in Linear Algebra (not taught in 1950's engineering except to EE's, as I remember). Edx.org and Coursera.org have R- and Python-based analytical classes that assume the students can load R and Python onto their PCs and slowly become productive while learning analysis at the same time. A decade earlier, JSL was taught to about 1 of 10 JMP users to support the other 9 on each team, mostly for data mining the many MES and Eng DB's. And at sites with a large JMP user group, we find a few R-augmentors and Python-SQL data mining and structuring folks. So YES, those teams benefit by getting an intro to JSL which they can then recommend to their "data mining" teammates and get them to join the JMP user group.

    Anne Milley wrote:

    Thanks for sharing, Michael. Love that you say JMP has changed everything, as John Sall envisioned. I feel the same! And agree that it makes data fun--big or otherwise!

    Anne Milley wrote:

    Using other languages (R, Python...) for more complex courses makes sense--they're free and they have the most bleeding-edge implementations available. Still, JMP (including JSL) offers time savings, especially in the discovery phase. Thanks again for sharing your thoughts!

    Anne Milley wrote:

    Since I mentioned Peter Goos and David Meintrup's first book, it's worth sharing that their second book has just been published: Statistics with JMP: Hypothesis Tests, ANOVA and Regression. You can download the first chapter at http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119097150.html. It looks quite good!

    Community Member

    Steve Figard wrote:

    I personally learned statistics primarily by learning how to use JMP. I am now using JMP to teach an introductory biostatistics course and find the visual orientation of the software to be a great benefit for the students as they struggle to learn how to think like a statistician. I would consider programming to be an additional step for those who want to become data analysts, but unnecessary for those whose field requires just the analytics provided by the software (unless, of course, there are other extenuating reasons for doing so).

    Anne Milley wrote:

    Thank you so much for sharing that, Steve! We are so glad you credit JMP with playing a role in your learning statistics and that you see the visual aspects of JMP as beneficial to students. We have heard users talk about the visual interactivity of JMP's data exploration as the gateway to doing more with data--even writing some code in more than one language!

    Anne Milley wrote:

    Thanks for your comment. It is an interesting relationship statistics has with programming languages, especially the lower-level languages used to create statistical software. In the early days of statistical software development, the logic in the math chips for many operating systems had to be extended to support some of the statistical computations software developers were trying to attain. But for purposes of exposing a broader audience to the power of statistics, perhaps programming isn't the best starting point for many.

    Community Member

    Carl wrote:

    A student's perspective:

    As an adult learner who is enrolled in and about to graduate from a Data Analytics programme, I can say that JMP has been an invaluable tool. The software helps to reinforce concepts, and because of the wide range of statistical output, it drives curiosity, so I find myself learning about new distributions and more that might not be seen in class until much later.

    Anne Milley wrote:

    Thanks for your comment! Great to read that you feel JMP helps reinforce concepts and drives your curiosity. Best wishes to you as you graduate and congratulations!