
Is there a way to tell LCA that the rows are ordered by class?

mtowle419
Level II

Typical dataset: 120-200 rows, 15 classes.

 

Rows are known to be ordered by class. What is unknown is where the 'fences' between classes are.

 

As-is -- i.e., without taking row order into account -- LCA correctly classifies about 94% of rows. My intuition is that if I knew how to tell the algorithm that the rows are grouped by class on input, we'd be at 100%.

 

Visual, in case my use of 'grouped by' is unclear:

Correct would look like this:

1a
1a
1a
1a
1a
1b
1b
1b
1b
1b
1b
1b
1b
1c
1c
1c
1c
1c
1c
1c

All 1a rows are always adjacent in row order, as are all 1b rows, and so on.

 

 

Currently, I sometimes get:

1a
1a
1a
1c
1c
1b
1b
1b
1b
1b
1b
1b
1b
1c
1c
1c
1c
1c
1c
1c
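
To pin down the constraint, here is a minimal sketch in Python (not JSL) of what "grouped by class" means for a label sequence: every class occupies exactly one unbroken run of rows. The function name `is_contiguous` and the two lists are only illustrative; the lists reproduce the two visuals in this post.

```python
from itertools import groupby

def is_contiguous(labels):
    """Return True if every class forms exactly one unbroken run of rows."""
    # groupby collapses consecutive duplicates, giving one key per run.
    runs = [key for key, _ in groupby(labels)]
    # If a class shows up in more than one run, it was split by another class.
    return len(runs) == len(set(runs))

# The "correct" visual: 5 x 1a, 8 x 1b, 7 x 1c.
correct = ["1a"] * 5 + ["1b"] * 8 + ["1c"] * 7
# The "currently" visual: two rows near the first fence come back as 1c.
observed = ["1a"] * 3 + ["1c"] * 2 + ["1b"] * 8 + ["1c"] * 7

print(is_contiguous(correct))
print(is_contiguous(observed))
```

A check like this only detects a violation. It does not say where the fences should go.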

Ideas?

 

Potentially, I could add a column with Row() as its value, but my worry is that most of the clustering columns hold 'class'-type values. With 15 classes, each column in use might have 2 to 8 unique values. Row() would have at least 120 unique values (one per row), which feels like throwing a wildcard into the mix.
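
One alternative to a Row() column that I can picture is a post-processing step, sketched below in Python rather than JSL. It rests on two assumptions that I have not verified: that per-row class membership probabilities can be exported from the LCA fit, and that the order in which the classes appear down the rows is known, so the probability columns can be arranged in that order. Given those, a small dynamic program picks the fences that maximize the total log-probability while forcing each class to be one contiguous block. The function name `ordered_segmentation` and the example numbers are made up for illustration; this is not something LCA does itself.

```python
import math

def ordered_segmentation(probs, floor=1e-12):
    """Assign one class index per row so that labels never decrease,
    every class is used, and the summed log-probability is maximal.

    probs[i][c] is the (assumed) probability that row i belongs to class c,
    with columns already arranged in the order the classes occur in the rows.
    """
    n, k = len(probs), len(probs[0])
    if n < k:
        raise ValueError("need at least one row per class")
    # Floor the probabilities so that log() never receives zero.
    logp = [[math.log(max(p, floor)) for p in row] for row in probs]

    neg_inf = -math.inf
    best = [[neg_inf] * k for _ in range(n)]   # best score ending at (row, class)
    advanced = [[False] * k for _ in range(n)]  # True if a fence sits just before this row
    best[0][0] = logp[0][0]                     # the first row must be the first class

    for i in range(1, n):
        for c in range(k):
            stay = best[i - 1][c]
            move = best[i - 1][c - 1] if c > 0 else neg_inf
            if stay == neg_inf and move == neg_inf:
                continue  # class c is not reachable yet at row i
            if move > stay:
                best[i][c] = move + logp[i][c]
                advanced[i][c] = True
            else:
                best[i][c] = stay + logp[i][c]

    # Walk back from the last row, which must belong to the last class.
    labels = [0] * n
    c = k - 1
    for i in range(n - 1, -1, -1):
        labels[i] = c
        if advanced[i][c]:
            c -= 1
    return labels

# Made-up example: 3 classes, 7 rows. Row 2 leans toward the third class on
# its own, but the ordering constraint pulls it back into the first block.
example_probs = [
    [0.90, 0.05, 0.05],
    [0.90, 0.05, 0.05],
    [0.30, 0.10, 0.60],
    [0.05, 0.90, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
    [0.05, 0.05, 0.90],
]
labels = ordered_segmentation(example_probs)
print(labels)
```

Unlike a Row() column, this leaves the clustering itself untouched and applies the ordering only when turning probabilities into labels. Whether it would close the gap from 94% to 100% is something I can't say without trying it on real data.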
