
Proc Cluster vs JMP Clustering


My question is about the differences between Proc Cluster and the clustering procedures in JMP.

I clustered about 50K data points with 13 variables into 5 clusters using Ward's method, in both JMP and Proc Cluster. I noticed quite a few differences between the two, namely:

1. JMP is blazingly fast. SAS takes about 3 hours to process my 50K dataset, whereas JMP takes about... 3 minutes?!

2. The results don't match very well. The 5 clusters I got from the two programs were very different; to start with, the number of points falling in each cluster was very different.

3. This might be specific to the dataset I am using, but the JMP clusters are much more interpretable (i.e. cleanly separated) than the ones from Proc Cluster.
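On point 2, cluster numbers are arbitrary labels, so cluster 1 in one program need not correspond to cluster 1 in the other, and comparing sizes by cluster number can overstate the mismatch. A rough way to check how much the two solutions really disagree is to cross-tabulate the two assignments and compute the adjusted Rand index. The following is only a minimal Python sketch, not SAS or JMP code: it assumes both cluster columns have been exported in the same row order, and the function names and toy labels are made up for illustration.

```python
from collections import Counter
from math import comb


def contingency(labels_a, labels_b):
    """Count how many points fall in each (cluster_a, cluster_b) pair."""
    return Counter(zip(labels_a, labels_b))


def adjusted_rand_index(labels_a, labels_b):
    """Agreement between two partitions, ignoring how clusters are numbered.

    1.0 means the partitions are identical up to relabeling; values near 0
    mean the agreement is about what chance alone would give.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same points")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two points")

    # Pairs of points placed together in both, in A only, and in B only.
    together_both = sum(comb(c, 2) for c in contingency(labels_a, labels_b).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Degenerate case, e.g. both partitions put everything in one cluster.
        return 1.0
    return (together_both - expected) / (maximum - expected)


if __name__ == "__main__":
    # Hypothetical toy labels standing in for the exported cluster columns.
    jmp_labels = [1, 1, 2, 2, 3, 3, 3]
    sas_labels = [4, 4, 5, 5, 5, 2, 2]
    for pair, count in sorted(contingency(jmp_labels, sas_labels).items()):
        print(pair, count)
    print(adjusted_rand_index(jmp_labels, sas_labels))
```

If most of the mass in the cross-tabulation sits in one cell per row, the two solutions are essentially the same clusters under different numbers; if the counts are spread across many cells, the partitions genuinely differ.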

I was hoping someone could shed some light on how JMP differs from SAS Proc Cluster, especially with regard to point 1 (speed).

Edit: Just so you know, Proc Cluster was running on a server ('SPDS') with some 16GB of RAM, an 'evil fast processor', the works, whereas JMP was running on a laptop with 1GB of RAM. So it doesn't look like a problem with memory paging etc.

Thanks in advance,

