Discussions

camth · Jul 28, 2015 09:47 AM

JMP 11 Pro: Distribution Platform - Normal Mixtures. How does JMP decide how to label the distributions?

I fit a mixture of 3 normal distributions to a (single) response. The 3 distributions identified by JMP are labelled 1, 2 and 3 in the JMP output, i.e. (π₁, μ₁, σ₁), (π₂, μ₂, σ₂) and (π₃, μ₃, σ₃). How does JMP decide which of the 3 identified distributions to label 1, 2 and 3, respectively? Is it based on the relative values of the μ's? Or is it, e.g., so that I can be most 'confident' in the distribution labelled 3 and least 'confident' in the distribution labelled 1? That is, is the fit iterative: first the distribution labelled 3 is identified as the most 'obvious' distribution in the data; then this distribution is 'filtered out' and JMP identifies the distribution labelled 2 in what remains; finally, the distribution labelled 1 is identified after distributions 3 and 2 have been 'filtered out'?

volker_kraft · Jul 30, 2015 03:25 AM

"The answer is that there is no ordering given to the distributions. Fitting the normal mixtures model is solved as an optimization problem in the mixture proportions, means, and variance matrices. The user sees the parameters that are wherever that optimization algorithm lands at its last iteration, i.e. no further processing happens. As a user I wouldn't read much into the ordering that is given, and I would focus attention on the clusters with larger mixture probabilities."

(answer provided by Chris Gotwalt - thanks, Chris!)
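
For readers wondering why no ordering falls out of the fit, it may help to write down the mixture density. This is the standard textbook form for a 3-component univariate normal mixture, using the question's notation (φ for the normal density and τ for a permutation are symbols introduced here); it is not taken from JMP's documentation.

```latex
f(x) = \sum_{k=1}^{3} \pi_k \, \varphi(x;\, \mu_k, \sigma_k),
\qquad \sum_{k=1}^{3} \pi_k = 1 .

% For any permutation \tau of \{1, 2, 3\}:
\sum_{k=1}^{3} \pi_{\tau(k)} \, \varphi\bigl(x;\, \mu_{\tau(k)}, \sigma_{\tau(k)}\bigr) = f(x) .
```

Because relabelling the components leaves f(x), and hence the likelihood, unchanged, the optimizer has no reason to prefer one labelling over another.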

camth · Jul 30, 2015 04:21 AM

Thank you very much!

Is it possible to get a reference to the algorithm used?

MathStatChem · Jul 30, 2015 09:38 AM

If I remember correctly, for the Distribution Platform, the cluster labels are based on the estimated cluster mean. The cluster with the lowest cluster mean gets the label "1", next highest gets label "2", and so on.

You can see an example of where I took advantage of that in this blog post http://blogs.sas.com/content/jmp/2013/05/01/is-your-data-too-precise/

Unfortunately, the link to the JMP add-in I created that "bins" rows according to their most likely cluster doesn't work any more. I will see if I can upload the add-in to the JMP User Community.

MathStatChem · Oct 28, 2016 06:19 AM

Just uploaded the add-in. You can find it here: Univariate Binning using the Normal Mixtures Distribution
