Learn more in our free online course:

**Statistical Thinking for Industrial Problem Solving**

In this video, we show how to create box plots and summary statistics using the **Distribution** platform in JMP for the **Impurity** data.

To start, we select **Distribution** from the **Analyze** menu.

We select **Impurity** for **Y, Columns**, and click **OK**. We select **Stack** from the top red triangle next to **Distributions** to change the layout to horizontal.

By default, you see a histogram and box plot, quantiles, and summary statistics for **Impurity**.

Histograms are described in another demonstration, so we’ll focus on the other output that is provided.

Let’s start with the box plot.

Recall that the lower end of the box in the box plot is the first quartile, the line in the box plot is the median, and the upper end of the box is the third quartile. The distance between the first and third quartiles is called the *interquartile range*, or *IQR*, and the lines drawn from either end of the box are called *whiskers*.
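To make these definitions concrete, here is a minimal Python sketch that computes the quartiles, the IQR, and the whisker ends. The data values are hypothetical (they are not the course's Impurity data), the interpolation method might not match the one JMP uses, and the 1.5 × IQR whisker rule is an assumed convention for an outlier box plot.

```python
import statistics

# Hypothetical impurity measurements (not the course's Impurity data).
impurity = [4.2, 5.1, 5.8, 6.0, 6.3, 6.7, 7.1, 7.9, 8.4, 10.6]

# With n=4, statistics.quantiles returns the three quartile cut points.
# The interpolation method here may differ from the one JMP uses.
q1, median, q3 = statistics.quantiles(impurity, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: the length of the box

# Assumed convention: whiskers reach the most extreme points that lie
# within 1.5 * IQR of either end of the box.
lower_whisker = min(x for x in impurity if x >= q1 - 1.5 * iqr)
upper_whisker = max(x for x in impurity if x <= q3 + 1.5 * iqr)

print(f"Q1={q1:.3f}  median={median:.3f}  Q3={q3:.3f}  IQR={iqr:.3f}")
print(f"whiskers: {lower_whisker} to {upper_whisker}")
```

In this made-up sample, the largest value lies beyond the upper whisker, so it would be drawn as an individual point.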

Two other pieces of information displayed with the outlier box plot are the sample mean and a measure called the *shortest half*.

The center of the diamond is the sample mean. In this example, we see that the mean is slightly higher than the median. This is an indication that the distribution is somewhat skewed to the right.
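As a quick illustration of comparing the mean and the median, the sketch below uses a small hypothetical sample (not the Impurity data) with one large value. The comparison is only a rough indicator of skewness, not a formal test.

```python
import statistics

# Hypothetical measurements with one large value pulling the mean upward.
impurity = [4.2, 5.1, 5.8, 6.0, 6.3, 6.7, 7.1, 7.9, 8.4, 10.6]

mean = statistics.mean(impurity)
median = statistics.median(impurity)

# A mean above the median hints at a longer right tail.
if mean > median:
    hint = "mean > median: possible right skew"
elif mean < median:
    hint = "mean < median: possible left skew"
else:
    hint = "mean == median: no skew suggested by this comparison"

print(f"mean={mean:.3f}  median={median:.3f}  ({hint})")
```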

The tips of the diamond define a confidence interval for the mean. You will learn about confidence intervals in the Decision Making with Data module.

The shortest half shows the densest region of the data, that is, where the “tightest” grouping of 50% of the observations falls. Notice that the shortest half corresponds to the tallest bars in the histogram.
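One way to picture the shortest half is as a sliding window over the sorted data. The sketch below is a hypothetical helper, not JMP's implementation. It assumes the window holds half of the points, rounded up, and uses made-up data.

```python
def shortest_half(values):
    """Return (low, high) for the narrowest window holding half the points."""
    data = sorted(values)
    n = len(data)
    h = (n + 1) // 2  # assumed window size: half the points, rounded up
    # Slide a window of h consecutive sorted points; keep the narrowest one.
    start = min(range(n - h + 1), key=lambda i: data[i + h - 1] - data[i])
    return data[start], data[start + h - 1]


# Hypothetical impurity measurements (not the course's Impurity data).
impurity = [4.2, 5.1, 5.8, 6.0, 6.3, 6.7, 7.1, 7.9, 8.4, 10.6]

low, high = shortest_half(impurity)
print(f"shortest half: {low} to {high}")
```

The narrower this interval is compared with the full range, the more tightly the central half of the data is packed.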

Let’s take a look at the default summary statistics that are reported.

The Quantiles report includes the minimum, maximum, median, quartiles, and quantiles (or percentiles) in different increments.

We can right-click on the values in this table and select **Format Column** to change the data format for the values that are displayed. For example, we might want to show only three decimal places for the quantiles, and also for the summary statistics. It might make sense to do this if we have only a few significant digits.

We can also change the default quantile increments displayed in the Quantiles report or request custom quantiles using red triangle options. For example, to set quantile increments, we select **Display Options** from the red triangle next to **Impurity**, and then enter the quantile increment value. To display quantiles in increments of 10%, we enter **0.1**.
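To see what an increment of 0.1 means, the following Python sketch lists quantiles at 10%, 20%, and so on up to 90% for a hypothetical sample. The interpolation rule is Python's "inclusive" method, which might not match JMP's, so the values are illustrative only. They are printed with three decimal places, as in the formatting example earlier.

```python
import statistics

# Hypothetical impurity measurements (not the course's Impurity data).
impurity = [4.2, 5.1, 5.8, 6.0, 6.3, 6.7, 7.1, 7.9, 8.4, 10.6]

increment = 0.1                   # quantile increment, as entered in the dialog
n_groups = round(1 / increment)   # 10 equal-probability groups

# quantiles() returns the n_groups - 1 interior cut points.
cuts = statistics.quantiles(impurity, n=n_groups, method="inclusive")
deciles = {round(k * increment, 2): cut for k, cut in enumerate(cuts, start=1)}

for prob, value in deciles.items():
    print(f"{prob:.0%}  {value:.3f}")
```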

We’ll repeat these steps and select **Revert to default quantiles** to show the original quantile values.

The Summary Statistics table reports the mean, along with several other measures.

We can add other summary statistics to this report. To add statistics, click the red triangle next to **Summary Statistics** and select **Customize Summary Statistics**.

A variety of measures for shape, centering, and spread of the distribution are available.

Here, as an example, we select **N Missing** and click **OK**. We can see that none of the values for **Impurity** are missing.
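As a rough analogue of adding **N Missing** to the summary table, here is a Python sketch that reports a few statistics alongside a count of missing entries. The column is hypothetical and deliberately contains two missing values so the count is nonzero. Treating `None` and NaN as missing is an assumption of this sketch.

```python
import math
import statistics

# Hypothetical column with two missing entries (None and NaN).
column = [4.2, 5.1, None, 6.0, 6.3, float("nan"), 7.1]


def is_missing(value):
    """Treat None and NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


present = [v for v in column if not is_missing(v)]

summary = {
    "Mean": statistics.mean(present),
    "Std Dev": statistics.stdev(present),  # sample standard deviation
    "N": len(present),
    "N Missing": sum(1 for v in column if is_missing(v)),
}

for name, value in summary.items():
    print(f"{name:10s} {value}")
```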

Note that, if we prefer different percentile increments each time we run the analysis, or if we’d like different summary statistics to display by default, we can set preferences. To do this, we go to **File** then **Preferences** (or **JMP** then **Preferences** on a Mac). We select **Platforms** from the **Preference Group** and select **Distribution** or **Distribution Summary Statistics** from the **Platforms** list.