- JMP 11 Pro: Distribution Platform - Normal Mixture...


Jul 28, 2015 6:47 AM
(651 views)

I fit a mixture of 3 normal distributions to a single response. JMP labels the 3 fitted distributions 1, 2 and 3 in its output, i.e. (π_{1}, μ_{1}, σ_{1}), (π_{2}, μ_{2}, σ_{2}) and (π_{3}, μ_{3}, σ_{3}). How does JMP decide which of the fitted distributions gets which label? Is it based on the relative values of the μ’s? Or does the label reflect how ‘confident’ I can be in each distribution, e.g. most confident in the one labelled 3 and least confident in the one labelled 1? I am thinking of an iterative scheme like this: first, the distribution labelled 3 is identified as the most ‘obvious’ one in the data and is ‘filtered out’; then JMP identifies the distribution labelled 2 in what remains; finally, the distribution labelled 1 is identified after distributions 3 and 2 have been ‘filtered out’.

4 REPLIES


Jul 30, 2015 12:25 AM
(468 views)

"The answer is that there is no ordering given to the distributions. Fitting the normal mixtures model is solved as an optimization problem in the mixture proportions, means, and variance matrices. The user sees the parameters wherever that optimization algorithm lands at its last iteration, i.e. no further processing happens. As a user I wouldn’t read much into the ordering that is given, and I would focus attention on the clusters with larger mixture probabilities."

(answer provided by Chris Gotwalt - thanks, Chris!)
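
To illustrate why the order of the components can be arbitrary when a mixture is fit by iterative optimization, here is a minimal sketch in Python. It uses a plain EM loop as a stand-in; the thread does not say which optimizer JMP uses, so this is not JMP's implementation. The function names, the simulated data, and the starting values are all made up for the example. The same data are fit twice with the starting means given in opposite order.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_mixture_em(data, init_means, n_iter=200):
    """Fit a 1-D normal mixture with a basic EM loop (illustrative only).

    Returns a list of (proportion, mean, sigma) tuples in whatever order
    the starting values put them; nothing is sorted afterwards.
    """
    k = len(init_means)
    n = len(data)
    grand_mean = sum(data) / n
    grand_sd = math.sqrt(sum((x - grand_mean) ** 2 for x in data) / n)
    weights = [1.0 / k] * k
    means = list(init_means)
    sigmas = [grand_sd] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, sigmas)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update proportions, means and sigmas from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            sigmas[j] = math.sqrt(max(var, 1e-12))
    return list(zip(weights, means, sigmas))


# Simulated response: three well-separated groups (assumed for the demo).
rng = random.Random(0)
data = [rng.gauss(mu, 1.0) for mu in (0.0, 5.0, 10.0) for _ in range(200)]

# Same data, same model; only the order of the starting means differs.
fit_a = fit_mixture_em(data, init_means=[0.0, 5.0, 10.0])
fit_b = fit_mixture_em(data, init_means=[10.0, 5.0, 0.0])

print([round(m, 2) for _, m, _ in fit_a])  # means in ascending order
print([round(m, 2) for _, m, _ in fit_b])  # same means, reversed order
```

Both runs describe the same fitted mixture; only the position of each component in the list changes. In this sketch, at least, the label carries no information about how well a component is supported by the data.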


Jul 30, 2015 1:21 AM
(468 views)

Thank you very much!

Is it possible to get a reference to the algorithm used?


Jul 30, 2015 6:38 AM
(468 views)

If I remember correctly, for the Distribution Platform, the cluster labels are based on the estimated cluster means. The cluster with the lowest mean gets the label "1", the cluster with the next lowest mean gets the label "2", and so on.

You can see an example of where I took advantage of that in this blog post: http://blogs.sas.com/content/jmp/2013/05/01/is-your-data-too-precise/

Unfortunately, the link to the JMP add-in I created that "bins" rows according to their most likely cluster doesn't work any more. I will see if I can upload the add-in to the JMP User Community.
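
In the meantime, the idea can be sketched outside JMP. The Python snippet below is not the add-in and not JMP's own code; it is only an illustration, and the parameter values and function names are hypothetical. It sorts fitted components by mean so that label 1 is the lowest mean, then assigns each value to the cluster with the highest posterior probability.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def relabel_by_mean(components):
    """Order (proportion, mean, sigma) tuples by ascending mean."""
    return sorted(components, key=lambda c: c[1])


def most_likely_cluster(x, components):
    """Return the 1-based label of the component with the largest
    proportion * density at x, i.e. the highest posterior probability."""
    scores = [w * normal_pdf(x, m, s) for w, m, s in components]
    return scores.index(max(scores)) + 1


# Hypothetical fitted parameters, listed in no particular order.
fitted = [(0.2, 10.0, 1.0), (0.5, 0.0, 1.0), (0.3, 5.0, 1.0)]

ordered = relabel_by_mean(fitted)
labels = [most_likely_cluster(x, ordered) for x in (-0.3, 4.8, 11.2)]
print(labels)  # [1, 2, 3]
```

Sorting by mean is just one convention for getting stable labels; sorting by mixture proportion would work equally well if the largest clusters are the ones of interest.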


Jul 30, 2015 6:48 AM
(468 views)