This script tests multiple data columns for normality using the Anderson-Darling method. Each column can be split into groups by a categorical column and each group tested individually if required.
The output is an interactive report ranked by the A-squared statistic expressed as a ratio of the critical value at the 5% significance level (so data with different sample sizes can be compared directly). A ratio above 1 means normality is rejected at that level.
The calculation is done using the SciPy Python library, and the script requires JMP 18+.
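For anyone curious about the ranking metric, here is a minimal standalone sketch of the idea in plain Python, outside JMP. It is not the code from the script itself: the function names ad_ratio and rank_groups and the synthetic data are made up for illustration, and it assumes scipy.stats.anderson returns a result with statistic, critical_values and significance_level (in percent), as it does in the SciPy versions I'm aware of.

```python
import numpy as np
from scipy import stats


def ad_ratio(values):
    """A-squared divided by the 5% critical value (hypothetical helper)."""
    result = stats.anderson(np.asarray(values, dtype=float), dist="norm")
    # significance_level is given in percent, e.g. [15, 10, 5, 2.5, 1];
    # pick the entry closest to 5 rather than hard-coding its position.
    levels = np.asarray(result.significance_level, dtype=float)
    idx = int(np.argmin(np.abs(levels - 5.0)))
    return float(result.statistic / result.critical_values[idx])


def rank_groups(values, groups):
    """Split values by group label, then sort from most to least non-normal."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    ratios = {str(g): ad_ratio(values[groups == g]) for g in np.unique(groups)}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Synthetic demo data (an assumption, not the Semiconductor Capability table):
# one roughly normal group and one strongly skewed group.
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(10, 1, 200), rng.exponential(1.0, 200)])
groups = np.array(["normal"] * 200 + ["skewed"] * 200)

ranking = rank_groups(values, groups)
for name, ratio in ranking:
    print(f"{name}: {ratio:.2f}")
```

The skewed group should come out at the top of the list with a ratio well above 1. Dividing by the critical value is just a convenient way to put every column and group on the same "how far past the threshold" scale.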
I was originally looking for a way to practice using the new Python integration in JMP 18, but that ended up being a very small part of the script, with the majority being the data manipulation and the interactive report writing!
When you run the script, it asks you to choose the data columns you wish to test for normality and, optionally, a grouping column (I'm using the Semiconductor Capability sample data set):
The output is an interactive report ordered from most non-normal to most normal. If you specify a grouping column, it defaults to showing one data column using a local data filter. If there's no grouping column, the filter is removed.
Select a row to show a run chart and distribution report for that subset of the data.
Note: it is NOT extensively tested or optimised, as it was mainly a spare-time learning project for myself. It's probably overkill for what it does, but there are useful tricks in there that may be valuable to others. Suggestions and comments are therefore always welcome.
The Semiconductor Capability dataset isn't the best for demo purposes as none of the data is especially non-normal, but it shows the idea.