
Using Clustering to predict?

garethw

Community Trekker

Joined:

Nov 11, 2013

Hi All,

I've tried every combination of keywords I can think of to search for an answer, but can't find what I'm looking for. If I've overlooked something obvious or in the manuals, please point me in the right direction!

What I have done is build a model using clustering. This covers my needs very well and allows a useful analysis of my dataset.

I now have a second dataset, and I want to assign members of this dataset to the existing clusters that I have. I can't, however, find any way of getting a formula or prediction out of the clustering algorithm that will allow me to do this. Is this possible? Might a different technique give me what I need?

Many thanks for any extra information you can provide,

Gareth

5 REPLIES
Jeff_Perkinson

Community Manager

Joined:

Jun 23, 2011

Hi Gareth,

Which Clustering algorithm are you using?

The Hierarchical method doesn't lend itself to prediction as you would like, but the K-Means methods do.

After Clustering using one of the K-Means methods, you can choose Save Cluster Formula from the red triangle menu to get a formula that you can use to assign membership for new rows.
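To illustrate the idea behind such a formula outside of JMP, here is a minimal Python sketch that assigns each new row to its nearest cluster center. The centroids and new rows are made-up values, and `assign_cluster` is a hypothetical helper; the formula JMP actually saves may differ in detail (for example, in how columns are scaled).

```python
import math

def assign_cluster(row, centroids):
    """Return the index of the centroid closest to row (Euclidean distance)."""
    distances = [math.dist(row, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical cluster centers from a K-Means fit on the first dataset
centroids = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0)]

# Hypothetical rows from the second dataset
new_rows = [(0.8, 1.3), (5.2, 4.1), (8.5, 0.2)]

assignments = [assign_cluster(r, centroids) for r in new_rows]
print(assignments)  # [0, 1, 2]
```

The key point is that the centers are fixed from the first dataset, so new rows are scored against them without refitting the clusters.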

-Jeff
garethw

Community Trekker

Joined:

Nov 11, 2013

Thanks Jeff,

I'm using the default method, which is Hierarchical/Ward.

I have a few categorical columns, so K-Means is not an option for me at this point.

And thanks Steve,

However, I'm in JMP with no access to SAS, and I cannot find an option to do SCORE.

Many thanks,

Gareth

stevedenham

Community Trekker

Joined:

Jun 23, 2011

It looks like you want to SCORE your new data, so try searching on that. At least for the SAS procedures, that would be the direction to look.

Steve Denham

Jordan_Hiller

Joined:

Jun 23, 2011

Clustering is an unsupervised method, designed for exploration and interpretation, but not prediction. It sounds like you might want to check out the Partition platform (Analyze > Modeling > Partition) or Discriminant analysis (Analyze > Multivariate Methods > Discriminant).
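One way to read this suggestion: treat the cluster number saved from the first dataset as a response and fit a supervised model to it, then use that model on the second dataset. The Python sketch below is a toy classification tree that handles both numeric and categorical columns. It is only an illustration of the idea, not JMP's Partition algorithm, and all rows, labels, and function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def matches(row, col, value):
    """Numeric columns split on <= value, categorical columns on equality."""
    if isinstance(value, (int, float)):
        return row[col] <= value
    return row[col] == value

def best_split(rows, labels):
    """Return the (column, value) split that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for col in range(len(rows[0])):
        for value in sorted({r[col] for r in rows}):
            left = [l for r, l in zip(rows, labels) if matches(r, col, value)]
            right = [l for r, l in zip(rows, labels) if not matches(r, col, value)]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (col, value), score
    return best

def build_tree(rows, labels):
    """Recursively split until no split improves purity; leaves are cluster labels."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    col, value = split
    left = [(r, l) for r, l in zip(rows, labels) if matches(r, col, value)]
    right = [(r, l) for r, l in zip(rows, labels) if not matches(r, col, value)]
    return (col, value,
            build_tree([r for r, _ in left], [l for _, l in left]),
            build_tree([r for r, _ in right], [l for _, l in right]))

def predict(tree, row):
    """Walk the tree from the root to a leaf and return that leaf's cluster label."""
    while isinstance(tree, tuple):
        col, value, left, right = tree
        tree = left if matches(row, col, value) else right
    return tree

# Hypothetical first dataset: (numeric column, categorical column)
rows = [(1.0, "red"), (1.5, "red"), (2.0, "blue"), (8.0, "blue"),
        (8.5, "red"), (9.0, "red"), (9.5, "blue")]
# Hypothetical cluster numbers saved from the hierarchical clustering
clusters = [1, 1, 1, 2, 3, 3, 2]

tree = build_tree(rows, clusters)

# Hypothetical rows from the second dataset
new_rows = [(1.2, "blue"), (9.2, "red"), (8.8, "blue")]
print([predict(tree, r) for r in new_rows])
```

Because the tree splits categorical columns on equality, this approach is not limited to numeric data the way a distance-to-centroid rule is.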

Here's a blog post that can help you get started with the Partition platform. It's older, but still relevant. Link

Good luck,

Jordan

garethw

Community Trekker

Joined:

Nov 11, 2013

Thanks Jordan,

I have used the Partition platform before, and it does generate useful models for this data.

Gareth