Note: John Sall, co-founder and Executive VP of SAS, discussed why statistics matters on the occasion of the International Year of Statistics in 2013. SAS was a participant in Statistics2013. Sall is the chief architect of JMP.
How is statistics essential to everyday life?
Most people don't realize how essential statistics is. Daily life is surrounded by the products of statistics.
You brush your teeth. The fluoride in the toothpaste was studied by scientists using statistical methods to carefully assure the safety and effectiveness of the ingredient and the proper concentration. The toothpaste was formulated through a series of designed experiments that determined the optimal formulation through statistical modeling. The toothpaste production was monitored by statistical process control to ensure quality and consistency, and to reduce variability.
The attributes of the product were studied in consumer trials using statistical methods. The pricing, packaging and marketing were determined through studies that used statistical methods to determine the best marketing decisions. Even the location of the toothpaste on the supermarket shelf was the result of statistically based studies. The advertising was monitored using statistical methods. Your purchase transaction became data that was analyzed statistically. The credit card used for the purchase was scrutinized by a statistical model to make sure that it wasn't fraudulent.
So statistics is important to the whole process behind not just toothpaste, but every product we consume, every service we use, every activity we choose. Yet we don't need to be aware of it, since it is just an embedded part of the process. Statistics is useful everywhere you look.
Why now? What's new about statistics?
Statistics is an emergent discipline that has rapidly adapted to current challenges.
In today's era of big data -- where the computer and network are everywhere and everything can be measured -- you need statistics to make that data useful.
To learn how a process behaves, you need to conduct experiments. And now we have ways of designing those experiments so that you learn the most from a limited number of experimental runs.
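As a rough illustration of learning a lot from few runs, here is a minimal sketch of a two-level half-fraction factorial design for three factors. The factor names A, B and C, the generator C = A*B, and the function names are assumptions made for this example, not anything taken from JMP or from the interview.

```python
from itertools import product


def half_fraction_design():
    """Return the 4 runs of a 2^(3-1) design with the assumed generator C = A*B."""
    runs = []
    for a, b in product((-1, 1), repeat=2):
        # A and B take every low/high combination; C is fixed by the generator.
        runs.append((a, b, a * b))
    return runs


def columns_are_balanced_and_orthogonal(runs):
    """Check that each factor is at low/high equally often and columns are uncorrelated."""
    cols = list(zip(*runs))
    balanced = all(sum(col) == 0 for col in cols)
    orthogonal = all(
        sum(x * y for x, y in zip(cols[i], cols[j])) == 0
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )
    return balanced and orthogonal


design = half_fraction_design()
for run in design:
    print(run)
print("balanced and orthogonal:", columns_are_balanced_and_orthogonal(design))
```

The sketch uses four runs where a full factorial would need eight. The price is that the main effect of C cannot be separated from the A-by-B interaction, so a design like this is only reasonable when that interaction is believed to be small.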
Once you have estimated models, you need tools to refine and validate them, and the state of the art here has improved considerably in the last few years.
We also have much better tools for using those models to optimize processes and make them robust with respect to variation.
Statistical software -- like JMP -- becomes the interface to harness statistical methods. Statistics is not easy, and using the computer can be a challenge. So we view our job as making the experience, the workflow, of analyzing data as easy, as comfortable and as powerful as possible.
Do we really trust statistics? Different statistics say different things.
Statistics must be used responsibly. You need good data, and enough of it. Controlled experimental data is best. Selection and confounding biases can easily corrupt the analysis if you don't pay attention to the process that created the data. When you go about doing statistics yourself, you need to learn the problems that can happen. You need to think through the analysis carefully.
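To make the confounding point concrete, here is a small simulation sketch. Everything in it is hypothetical: a hidden factor `z` makes "treatment" `x` more likely and also raises the outcome `y`, while the true effect of `x` on `y` is set to zero. The probabilities, effect size and function names are assumptions chosen only for illustration.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Generate (z, x, y) rows where z drives both x and y, and x has no effect on y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                 # hidden confounder
        x = rng.random() < (0.8 if z else 0.2)  # treatment is more likely when z is true
        y = (2.0 if z else 0.0) + rng.gauss(0, 1)  # outcome depends on z only
        rows.append((z, x, y))
    return rows


def naive_difference(rows):
    """Treated mean minus control mean, ignoring the confounder."""
    treated = [y for z, x, y in rows if x]
    control = [y for z, x, y in rows if not x]
    return mean(treated) - mean(control)


def stratified_difference(rows):
    """Average of the treated-minus-control differences within each level of z."""
    diffs = []
    for level in (False, True):
        treated = [y for z, x, y in rows if z == level and x]
        control = [y for z, x, y in rows if z == level and not x]
        diffs.append(mean(treated) - mean(control))
    return mean(diffs)


rows = simulate()
print("naive difference:     ", round(naive_difference(rows), 2))
print("stratified difference:", round(stratified_difference(rows), 2))
```

Under these assumptions the naive comparison shows a sizable apparent effect, while the comparison within levels of `z` stays close to zero. In a controlled experiment, random assignment of `x` would break the link between `z` and `x`, which is one reason controlled experimental data is preferred.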
Some of us learn statistics as part of our education. Some of us learn on the job. Some of us still have a lot to learn.
The International Year of Statistics 2013 is an occasion to remind us of the value of:
- Statistical methods.
- Learning how to use them responsibly.
- Statistical software as the tools of analysis.
- Using statistical professionals to help us out when needed.
What is the purpose of the International Year of Statistics?
The purpose of the International Year of Statistics is to raise awareness of the importance of statistical methods in our world. Statistics has played an increasingly important role in science, in industry, in health and in business.
For the past few decades, people have increasingly learned to collect and use data, to perform designed experiments, and to use evidence instead of intuition to make decisions.
With the computer and network revolutions, data is now easy to collect and store. And with modern software, it is comparatively easy to analyze, even when you have a lot of data.