Introduction
I still think one of JMP's most powerful features is its ability to quickly create different visualizations. This post is a reminder, a discussion starter, and a wish list item all in one: visualizing your data is extremely important. It includes two "simple" yet very powerful examples of why you should always visualize your data and not just trust summary/descriptive statistics.
Reminder part
Anscombe's quartet
Anscombe's quartet (wikipedia) is a fairly popular data set for demonstrating why data should be visualized. It consists of four different data sets that have almost identical descriptive statistics. The quartet was constructed by Francis Anscombe in 1973. More information can be found in the article where it was introduced (Anscombe, F.J., 1973. Graphs in statistical analysis. The American Statistician, 27(1), pp.17-21.).
JMP also has Anscombe.jmp as one of the sample data tables. The data table includes a table script, "The Quartet", which uses Fit Y by X to demonstrate the data set.
![jthi_0-1666104553193.png jthi_0-1666104553193.png](https://community.jmp.com/t5/image/serverpage/image-id/46375i1A4D3D9C4A93A320/image-size/medium?v=v2&px=400)
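To make the "almost identical descriptive statistics" claim concrete outside of JMP, here is a minimal Python sketch that computes the means, variances, and correlation for each of the four sets. The values are typed in as they are commonly published for the quartet (for example on the Wikipedia page mentioned above); the names `QUARTET`, `pearson_r`, and `summarize` are my own and purely illustrative.

```python
import math
import statistics

# Anscombe's quartet as commonly published; sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed directly from the definition."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def summarize(xs, ys):
    """Descriptive statistics that look (almost) the same for all four sets."""
    return {
        "mean_x": statistics.fmean(xs),
        "mean_y": statistics.fmean(ys),
        "var_x": statistics.variance(xs),  # sample variance (n - 1)
        "var_y": statistics.variance(ys),
        "r": pearson_r(xs, ys),
    }


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        stats = summarize(xs, ys)
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

The printed rows agree very closely with each other even though the four scatterplots look nothing alike, which is exactly why the plot is needed.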
Datasaurus Dozen
I see this as a somewhat newer take on Anscombe's Quartet. You can find more information in Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics th... . The point of the article isn't exactly to demonstrate that visualization is important but rather to show how to create such datasets. The article uses the Datasaurus dataset created by Alberto Cairo as the baseline for creating the different plots.
![jthi_0-1666174470461.png jthi_0-1666174470461.png](https://community.jmp.com/t5/image/serverpage/image-id/46409iCC1CF1414F78FE2A/image-size/medium?v=v2&px=400)
Datasaurus dozen from Same Stats, Different Graphs (autodesk.com)
Wish part
I hope that JMP will either add the Datasaurus Dozen as a sample dataset or, even better, implement the algorithm from the paper so that users can create this type of dataset themselves (Matejka, J. and Fitzmaurice, G., 2017, May. Same stats, different graphs: generating datasets with varied appearance and identical statistics through simulated annealing. In Proceedings of the 2017 CHI conference on human factors in computing systems (pp. 1290-1294).).
JMP already has the JMP Man Dozen.jmp sample data, which has been built from JMP Man using the methods demonstrated in Matejka and Fitzmaurice (2017), cited above.
![jthi_1-1666204175385.png jthi_1-1666204175385.png](https://community.jmp.com/t5/image/serverpage/image-id/46433i99C84005218CF5C4/image-size/medium?v=v2&px=400)
JMP Man Dozen dataset visualized in Graph Builder
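For anyone curious what such an algorithm could look like, below is a heavily simplified Python sketch of the perturb-and-check idea behind "same stats, different graphs": nudge one random point, keep the move if it brings the point closer to a target shape (or occasionally even if it doesn't, depending on a "temperature"), and undo it if the summary statistics no longer match to two decimals. This is not the authors' code and not how JMP built JMP Man Dozen.jmp; the function names, the linear cooling schedule, the step size, and the default temperatures are all assumptions made for illustration.

```python
import math
import random


def summary(points, digits=2):
    """Means, standard deviations and Pearson r, rounded to `digits` decimals."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sd_x = math.sqrt(sxx / (n - 1))
    sd_y = math.sqrt(syy / (n - 1))
    r = sxy / math.sqrt(sxx * syy)
    return tuple(round(v, digits) for v in (mx, my, sd_x, sd_y, r))


def morph(points, target_distance, iterations=20000, step=0.1,
          start_temp=0.4, end_temp=0.01, seed=0):
    """Move points toward a target shape while keeping summary() unchanged.

    `target_distance(point)` is a caller-supplied (hypothetical) function that
    returns how far a point is from the desired shape; smaller is better.
    """
    rng = random.Random(seed)
    current = list(points)
    wanted = summary(current)
    for i in range(iterations):
        # Assumed linear cooling; other schedules are possible.
        temp = start_temp + (end_temp - start_temp) * i / iterations
        idx = rng.randrange(len(current))
        old = current[idx]
        new = (old[0] + rng.gauss(0, step), old[1] + rng.gauss(0, step))
        closer = target_distance(new) < target_distance(old)
        if not (closer or rng.random() < temp):
            continue  # move rejected by the annealing criterion
        current[idx] = new
        if summary(current) != wanted:
            current[idx] = old  # statistics drifted, so undo the move
    return current


if __name__ == "__main__":
    rng = random.Random(1)
    cloud = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
    mx, my, sd_x, sd_y, _ = summary(cloud, digits=6)
    radius = math.sqrt(sd_x ** 2 + sd_y ** 2)  # a circle compatible with the spread

    def to_circle(p):
        return abs(math.hypot(p[0] - mx, p[1] - my) - radius)

    ring = morph(cloud, to_circle)
    print("before:", summary(cloud), "after:", summary(ring))
```

Because every move that changes the rounded statistics is undone, the output always has the same two-decimal summary as the input; how closely it reaches the target shape depends on the number of iterations and on whether the shape is compatible with those statistics.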
Demo
Note: The demo requires JMP 16 or newer to run.
I have also committed this to my GitHub page (jthi0/visualize_your_data (github.com)). The script and the text there differ slightly from this post.
I "quickly" wrote a demo script which can be used to show Anscombe's Quartet and the Datasaurus Dozen. It is attached to this post as a zip file. Unzip visualize_your_data.zip into a single folder and run demo.jsl; it should open a window with both data sets visualized.
The demo window includes a slider which can be used to make the markers more visible in all of the Graph Builders. The images on the left side have markers hidden and the ones on the right side have transparency set to 1 (100%).
![jthi_1-1666175059410.png jthi_1-1666175059410.png](https://community.jmp.com/t5/image/serverpage/image-id/46411i0711AD3358BFFC7E/image-size/large?v=v2&px=999)
I have also posted this to my GitHub (github.com/jthi0), but the GitHub version doesn't include the Datasaurus set, as I didn't check whether I could freely share it. You can freely download it from https://www.autodesk.com/content/dam/autodesk/www/autodesk-reasearch/Publications/pdf/SameStatsDataA... ; I used DataSaurusDoze.tsv as the baseline for the demo script. Convert it to a .jmp file and save it to the same folder as demo.jsl and the support scripts.
This script could fairly easily be converted to an add-in, and there are quite a few improvements which could be made (such as exporting data from the user interface).
Discussion
Have you faced similar situations where visualization has saved you or your organization from a lot of headaches?
-Jarmo