☑ cool new feature

☑ could help many users!

☐ removes a „bug“

☐ nice to have

☐ nobody needs it

**What inspired this wish list request?**

In JMP, thanks to the possibilities of Graph Builder, it is very easy to generate complicated plots - with GroupX/GroupY/Wrap/Page and all the other approaches to generate hierarchical structures. The default variable displayed in bar charts, heatmaps etc. is the number of corresponding rows, i.e. the number of occurrences of the respective event. That's very convenient.

But what happens if the user is not interested in **occurrences**, but in **probabilities**?

Let's take

`Open( "$SAMPLE_DATA/Airline Delays.jmp" );`

as an example. **Southwest** has the largest number of large delays, but only because it also has the largest number of flights in the table.

Clearly, one has to divide the number of occurrences by the total number of flights.
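As a rough sketch, this normalization could be written as a column formula like the one below. The column names `:Airline` and `:Arrival Delay` and the 60-minute threshold for a "large" delay are assumptions for illustration; adjust them to the actual table.

```jsl
dt = Open( "$SAMPLE_DATA/Airline Delays.jmp" );

// Assumption: a delay of more than 60 minutes counts as "large".
// Assumption: the table contains the columns :Airline and :Arrival Delay.
dt << New Column( "Large Delay Rate [%]",
	Numeric, "Continuous",
	Formula(
		// large delays of an airline, divided by all flights of that airline
		Col Sum( If( :Arrival Delay > 60, 1, 0 ), :Airline )
		/ Col Number( :Arrival Delay, :Airline ) * 100
	)
);
```

Here the denominator is simply the row count per airline, which is why this case is easy.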

Actually, no problem.

But now, let's assume that we just have a list of pass/fail results for chips on wafers with different process parameters.

alternative application: https://community.jmp.com/t5/Discussions/count-unique/m-p/592337/highlight/true#M79638

Important: The data was not intentionally generated by a DOE; therefore, the number of wafers is not the same for the different process parameters. In analogy to the first example, the wafers with process parameters *Southwest* have by far the largest number of failed devices.

Does *Southwest* also have the largest **failure rate**?

This time, there is no easily accessible number in the data table which we can use for the normalization.

What we need to remove this bias from the data is the number of wafers per group.

Fortunately, there is a column with the wafer ID in the data table.

Unfortunately, there is no function in JSL which we can use to (directly) "count" the wafers and to distribute the results among the respective rows of the data table. To cite @Beaux:

Strange that that function is not available in the formula editor (statistical) or JSL... Maybe next...

The shortcut function **"Count"** from the **New Formula Column** right-click context menu sounds great, but it doesn't count **different/unique values**; it counts every single **non-empty row**. In the generated JSL code, **Count** is **Col Number**.

**What is the improvement you would like to see?**

Follow @Beaux's wish and add a function **Col N Categories**. It should behave similarly to **Col Number**, but it should count only unique values: entries which show up multiple times should be counted just **once**.

NB:

**1)** There is already an **N Categories** statistic in the **Tables/Summary** function. So, I hope that the effort will be low to provide the same functionality as a JSL function.

**2)** Like **Col Number**, **Col N Categories** should also provide a GroupBy option, which generates groups of rows, executes the analysis for each group and distributes the results to the corresponding rows.
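For reference, the existing Summary route mentioned in 1) might look like the sketch below. The table and its values are hypothetical and only serve as a small example with one row per tested chip; the column names are taken from the formulas in this post.

```jsl
// Hypothetical example table: one row per tested chip
dt = New Table( "wafer demo",
	New Column( "processVariant", Character, "Nominal",
		Set Values( {"A", "A", "A", "B", "B", "B"} ) ),
	New Column( "wafer_ID", Character, "Nominal",
		Set Values( {"W1", "W1", "W2", "W3", "W3", "W3"} ) ),
	New Column( "defects", Numeric, "Continuous",
		Set Values( [1, 0, 1, 1, 1, 0] ) )
);

// Tables > Summary with the N Categories statistic:
// one row per process variant, holding the number of distinct wafer IDs
dtSum = dt << Summary(
	Group( :processVariant ),
	N Categories( :wafer_ID ),
	Freq( "None" ),
	Weight( "None" )
);
```

The result lives in a separate summary table, so it still has to be joined back to the original rows before it can be used in a formula or in Graph Builder. A **Col N Categories** function with a GroupBy argument would make this detour unnecessary.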

**Why is this idea important?**

With **Col N Categories** available, one can calculate probabilities from occurrences, like:

`Percentage failChipsPerWafer = Col Sum(:defects ,:processVariant) / Col N Categories(:wafer_ID, :processVariant)*100`

`Percentage failWafers = Col N Categories(If(Col Sum(:defects, :wafer_ID) > 0, :wafer_ID, .), :processVariant) / Col N Categories(:wafer_ID, :processVariant)*100`
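Until such a function exists, the distinct count per group can be emulated with helper columns. The following is a minimal sketch of one possible workaround, not an official recipe: it flags the first row of each wafer and sums the flags per process variant. The example table and the helper column names `firstRowOfWafer` and `nWafers` are hypothetical.

```jsl
// Hypothetical example table: one row per tested chip
dt = New Table( "wafer demo",
	New Column( "processVariant", Character, "Nominal",
		Set Values( {"A", "A", "A", "B", "B", "B"} ) ),
	New Column( "wafer_ID", Character, "Nominal",
		Set Values( {"W1", "W1", "W2", "W3", "W3", "W3"} ) ),
	New Column( "defects", Numeric, "Continuous",
		Set Values( [1, 0, 1, 1, 1, 0] ) )
);

// Step 1: flag the first row of every wafer within its process variant
dt << New Column( "firstRowOfWafer", Numeric, "Nominal",
	Formula( If( Row() == Col Min( Row(), :processVariant, :wafer_ID ), 1, 0 ) )
);

// Step 2: summing the flags per variant gives the number of distinct wafers
dt << New Column( "nWafers", Numeric, "Continuous",
	Formula( Col Sum( :firstRowOfWafer, :processVariant ) )
);

// Step 3: same quantity as the failChipsPerWafer formula above,
// with :nWafers standing in for Col N Categories
dt << New Column( "failChipsPerWafer", Numeric, "Continuous",
	Formula( Col Sum( :defects, :processVariant ) / :nWafers * 100 )
);
```

With the example data, `nWafers` should come out as 2 for variant A and 1 for variant B. It works, but it needs two helper columns for something that a single **Col N Categories** call would express directly.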
