How to classify groups with categorical data?

Mar 21, 2016 2:45 PM

Hi,

Here is an easy one. I have a group of rodent skulls. They come from several named groups, which are also geographic regions. One group is classified as "unknown". I have metric data, on which I ran Discriminant Analysis and plotted the Canonical 1 and 2 scores, and I am using the values in the Discriminant Scores table to show predictions.

I also have categorical data: briefly, things like the number of holes in the skull for nerves and veins. The counts vary geographically from 2 to 4 (e.g. 1 on one side, a double hole on the other side). I am looking at the data in the Fit Y by X analysis, which shows me Mosaic Plots and chi-square tests for each variable. I need to retrace my steps, but some command also compares means with a Student's t test, such that groups that are statistically different are labeled A, B, C, etc. Any advice there would be appreciated.

However, I would like to find a statistically valid test for classifying the unknown group on these categorical variables, or at least create a table suggesting greatest similarity as a group, not as individuals. I know there is a clustering method, but it is not clear to me how to use it for groups rather than individuals.
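To make the idea of a group-level similarity table concrete, here is a minimal sketch in Python rather than JMP. It summarizes each group by the proportion of skulls in each category of each variable, then ranks the known groups by how far their profile is from the unknown group's profile. The data, the variable names (`foramina`, `suture`) and the choice of distance (mean total variation distance across variables) are all assumptions made for illustration. This is a descriptive summary, not a statistical test.

```python
from collections import Counter

# Hypothetical toy data: (group, {variable: category}) for each skull.
skulls = [
    ("North", {"foramina": "2", "suture": "open"}),
    ("North", {"foramina": "2", "suture": "open"}),
    ("North", {"foramina": "3", "suture": "open"}),
    ("South", {"foramina": "4", "suture": "closed"}),
    ("South", {"foramina": "4", "suture": "closed"}),
    ("South", {"foramina": "3", "suture": "closed"}),
    ("unknown", {"foramina": "4", "suture": "closed"}),
    ("unknown", {"foramina": "3", "suture": "closed"}),
]

def group_profiles(rows):
    """Return {group: {variable: {category: proportion}}}."""
    counts = {}
    for group, traits in rows:
        for var, cat in traits.items():
            counts.setdefault(group, {}).setdefault(var, Counter())[cat] += 1
    profiles = {}
    for group, by_var in counts.items():
        profiles[group] = {}
        for var, c in by_var.items():
            total = sum(c.values())
            profiles[group][var] = {cat: n / total for cat, n in c.items()}
    return profiles

def profile_distance(p, q):
    """Mean total variation distance across variables (0 = identical, 1 = disjoint)."""
    dists = []
    for var in p:
        q_var = q.get(var, {})
        cats = set(p[var]) | set(q_var)
        tv = 0.5 * sum(abs(p[var].get(c, 0.0) - q_var.get(c, 0.0)) for c in cats)
        dists.append(tv)
    return sum(dists) / len(dists)

profiles = group_profiles(skulls)

# One row per known group, sorted from most to least similar to "unknown".
similarity_table = sorted(
    (profile_distance(profiles["unknown"], prof), group)
    for group, prof in profiles.items()
    if group != "unknown"
)
for dist, group in similarity_table:
    print(f"{group}: {dist:.3f}")
```

With real data, each row of `skulls` would be one specimen, and the group at the top of the table would be the one whose category proportions most resemble those of the unknown group.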

Thanks,

Chris

1 REPLY

Mar 22, 2016 4:51 AM

Multiple Correspondence Analysis might be the technique you want, although whether it is available depends on the version of JMP you are using.

A more manual approach might be to:

- Label the points based on the group column.
- Carry out cluster analysis.
- Turn on colour clusters from the hotspot.
- Turn on the constellation plot (optional) from the hotspot.
- Vary the number of clusters until your known groups are roughly all coloured the same. Hopefully your unknowns will lie within a group or on the same branch as a known group.
- The save options on the hotspot include constellation coordinates, distance matrix and formula for cluster.

Perhaps one of these saved scores could be a suitable basis for a test?
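For readers who want to see the logic of the manual approach outside JMP, here is a rough Python sketch. It clusters individual skulls on their categorical traits, then cross-tabulates cluster membership against the group label to see which known group the unknowns end up with. It is not JMP's implementation: the toy data, the simple matching distance and the average-linkage rule are all assumptions chosen to keep the example short.

```python
from collections import Counter

# Hypothetical toy data: (group label, tuple of categorical trait values).
skulls = [
    ("North", ("2", "open")),
    ("North", ("2", "open")),
    ("North", ("3", "open")),
    ("South", ("4", "closed")),
    ("South", ("4", "closed")),
    ("South", ("3", "closed")),
    ("unknown", ("4", "closed")),
    ("unknown", ("3", "closed")),
]

def mismatch(a, b):
    """Simple matching distance: fraction of traits on which two skulls differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def average_linkage(c1, c2, traits):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(mismatch(traits[i], traits[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(traits, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(traits))]
    while len(clusters) > n_clusters:
        candidates = [
            (average_linkage(clusters[a], clusters[b], traits), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        ]
        _, a, b = min(candidates)  # ties are broken by cluster position
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

traits = [t for _, t in skulls]
clusters = agglomerate(traits, n_clusters=2)

# Cross-tabulate cluster membership against the group label.
crosstab = [Counter(skulls[i][0] for i in members) for members in clusters]
for k, counts in enumerate(crosstab, start=1):
    print(f"cluster {k}: {dict(counts)}")
```

Varying `n_clusters` plays the same role as varying the number of clusters in the steps above: the aim is a setting where each known group falls mostly in one cluster, so that the cluster holding the unknowns points to a candidate group.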