Alicia_500
Level I

Simulating clusters using K Means - Negative Values

Hi,

When I simulate clusters from the K Means platform I get some negative simulated values for one of my variables which, in practical terms, can only be positive.

 

Looking at the original distribution of this variable, it is non-normal and bounded at zero (so something like a log-normal distribution fits it well).

 

Is there a way to ensure the data generated from the cluster simulation remains positive?

 

Many thanks,

 

Alicia

Accepted Solutions
Victor_G
Super User

Re: Simulating clusters using K Means - Negative Values

Hi @Alicia_500,

 

Welcome to the Community!

 

Clustering can be done with different algorithms, depending on your objectives, your data types, and the criterion used to form the clusters: distributions, point density, hierarchical structures/links between points, ...

You can have a look at the available algorithms for your data types here: Overview of Platforms for Clustering Observations

 

If you need more information about how to use the different algorithms, you can watch this video: Clustering | JMP

There is also a very nice blog post by @Chelsea-Parlett explaining the differences between clustering methods: Clustering methods for unsupervised machine learning (jmp.com)

 

Concerning your use case: given the limited information provided and no data to test approaches on, I think K-Means may not be the most suitable clustering technique, as you're facing different distributions with different "spread". K-Means tends to create spherical clusters, as it does not account for differences between the distributions.

You could try Normal Mixtures, which is influenced by the differences in distribution and variance of your features, or Hierarchical Cluster, which doesn't assume any distribution for clustering. You could then compare the clustering outcomes to see which one(s) make the most sense, and check the agreement between the methods.
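
To illustrate the positivity issue from the question, here is a minimal Python sketch (not JSL, and not how the JMP K Means platform simulates internally, which I am not describing here). It assumes one cluster's values are roughly log-normal, as suggested in the question, and compares simulating from a normal fitted on the raw scale with simulating on the log scale and back-transforming. The function names and the example data are hypothetical.

```python
import math
import random
import statistics

def simulate_normal(values, n, rng):
    """Simulate from a normal fitted on the raw scale (can go negative)."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [rng.gauss(mu, sd) for _ in range(n)]

def simulate_lognormal(values, n, rng):
    """Fit a normal to log(values), simulate, then back-transform with exp."""
    logs = [math.log(v) for v in values]
    mu = statistics.mean(logs)
    sd = statistics.stdev(logs)
    return [math.exp(rng.gauss(mu, sd)) for _ in range(n)]

rng = random.Random(0)

# Hypothetical cluster: skewed, strictly positive measurements.
cluster = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]

raw_sim = simulate_normal(cluster, 1000, rng)
log_sim = simulate_lognormal(cluster, 1000, rng)

print("negatives, raw-scale normal:", sum(x < 0 for x in raw_sim))
print("negatives, log-scale normal:", sum(x < 0 for x in log_sim))
```

Because exp() of any finite number is positive, the back-transformed values cannot be negative. In JMP terms, the analogous idea would be to cluster and simulate on a log-transformed column and then transform the simulated values back, assuming a log scale makes sense for your variable.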

 

I hope I understood your situation correctly,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

Alicia_500
Level I

Re: Simulating clusters using K Means - Negative Values

Thank you Victor for the reply - this is extremely helpful!
