Kmeans Clustering CCC problem

jessez

Community Trekker

Joined:

Oct 20, 2014

JMP Kmeans clustering is not calculating a CCC statistic for a subset of my data. I have a data file with a little over 4000 unique sites. On the full data set, Kmeans clustering runs fine and JMP displays a CCC statistic. I then split the data on a variable and now have two new data tables (one with around 200 rows and the other with 4000+). Whenever I go through the same Kmeans clustering process on the larger subset, JMP does not report a CCC statistic. Any ideas about what is going on here?

1 ACCEPTED SOLUTION

After a couple of hours (wasted hours, as you will see) of trying to figure this out, I caught the problem. One of the variables I used in the clustering process was zero for most of the data set, and it turned out to be zero for every row in the subset I wanted to run Kmeans clustering on. A column that is constant across all rows has zero variance, and that appears to be what keeps JMP from producing a CCC. Problem solved.
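
For anyone hitting the same thing, a quick check for constant columns before clustering can save the hours. Below is a minimal sketch in plain Python, not JSL, assuming the subset has been exported from JMP as a list of row dictionaries. The function name `constant_columns` and the column names `x1`, `x2`, `x3` are made up for illustration.

```python
from typing import Dict, List, Sequence


def constant_columns(rows: Sequence[Dict[str, float]],
                     columns: Sequence[str]) -> List[str]:
    """Return the columns that take a single value across all rows."""
    flagged = []
    for name in columns:
        # Collect the distinct values this column takes in the subset.
        distinct = {row[name] for row in rows}
        if len(distinct) <= 1:
            flagged.append(name)
    return flagged


# Hypothetical subset: "x3" is zero in every row, as in the post above.
subset = [
    {"x1": 1.2, "x2": 5.0, "x3": 0.0},
    {"x1": 0.7, "x2": 3.1, "x3": 0.0},
    {"x1": 2.4, "x2": 4.4, "x3": 0.0},
]

print(constant_columns(subset, ["x1", "x2", "x3"]))  # prints ['x3']
```

Any column this flags carries no information for separating rows within that subset, so it is a candidate to leave out of the clustering for that table.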

2 REPLIES
chitra

Community Member

Joined:

Mar 24, 2017

Just wondering how you solved this problem. Did you just remove that attribute? I am facing the same situation: almost all of my columns have a large number of 0s, but since they are significant for my analysis I want to keep them in.