mtowle419
Level II

Is there a way to tell LCA that the rows are ordered by class?

Typical dataset: 120-200 rows, 15 classes.

 

Rows are known to be ordered by class. What is unknown is where the 'fences' between classes are.

 

As-is -- i.e., without taking row order into account -- LCA correctly classifies about 94% of rows. My intuition is that if I knew how to tell the algorithm that the rows are grouped by class on input, we'd be at 100%.

 

Visual, in case my use of 'grouped by' is unclear:

 

Correct would look like this:

 

1a
1a
1a
1a
1a
1b
1b
1b
1b
1b
1b
1b
1b
1c
1c
1c
1c
1c
1c
1c

All 1a rows will always be neighbors in terms of row order, and all 1b, etc.
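
To make "grouped by" concrete, here is a minimal Python sketch (not JSL, and not part of the LCA platform) that checks whether a column of predicted labels satisfies the constraint, i.e. whether every class occupies a single unbroken run of rows. The function name `is_contiguous` and the two lists are hypothetical, built from the label sequences shown in this post.

```python
from itertools import groupby

def is_contiguous(labels):
    """Return True if every class occupies exactly one unbroken run of rows."""
    # groupby collapses consecutive duplicates, so `runs` holds one entry per run.
    runs = [label for label, _ in groupby(labels)]
    # A class that appears in two separate runs shows up twice in `runs`.
    return len(runs) == len(set(runs))

# The "correct" pattern: 5 rows of 1a, 8 of 1b, 7 of 1c.
correct = ["1a"] * 5 + ["1b"] * 8 + ["1c"] * 7
# The pattern sometimes returned: two 1c rows land between the 1a and 1b runs.
observed = ["1a"] * 3 + ["1c"] * 2 + ["1b"] * 8 + ["1c"] * 7

print(is_contiguous(correct))   # True
print(is_contiguous(observed))  # False
```

A check like this only flags a violation; it does not say where the fences should go.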

 

 

Currently, I sometimes get:

 

 

1a
1a
1a
1c
1c
1b
1b
1b
1b
1b
1b
1b
1b
1c
1c
1c
1c
1c
1c
1c

Ideas?

 

Potentially, I could add a column with Row() as its value, but my worry is that most of the clustering columns hold 'class'-type values. With 15 classes, each column in use might have between 2 and 8 unique values. Row() would have 120 or more unique values, which feels like throwing a wildcard into the mix.
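
An alternative to feeding Row() into the clustering is to enforce the ordering after the fit. The Python sketch below is one possible illustration, not a feature of LCA, and it rests on assumptions that go beyond this post: that per-row class membership probabilities can be exported from the fit, that the order in which the classes appear down the rows is known, that each fitted cluster can be matched to a class, and that every class has at least one row. Given those, a small dynamic program picks the fence positions that maximize the total log-probability. The name `fit_fences` and the example matrix are hypothetical.

```python
import math

def fit_fences(probs):
    """Assign each row a class index so labels never decrease down the rows.

    probs[i][c] is the (assumed) probability that row i belongs to class c,
    with columns already arranged in the known row order of the classes.
    Every class gets at least one row; the total log-probability is maximized.
    """
    n, k = len(probs), len(probs[0])
    if n < k:
        raise ValueError("need at least one row per class")
    # Clamp to avoid log(0) when a probability is exactly zero.
    logp = [[math.log(max(p, 1e-12)) for p in row] for row in probs]
    neg_inf = float("-inf")
    # best[i][c]: best score for rows 0..i when row i is in class c.
    best = [[neg_inf] * k for _ in range(n)]
    # stayed[i][c]: True if row i-1 was in class c too, False if it was in c-1.
    stayed = [[True] * k for _ in range(n)]
    best[0][0] = logp[0][0]  # the first row must belong to the first class
    for i in range(1, n):
        for c in range(k):
            stay = best[i - 1][c]
            move = best[i - 1][c - 1] if c > 0 else neg_inf
            if stay == neg_inf and move == neg_inf:
                continue  # class c is not reachable at row i
            if move > stay:
                best[i][c] = move + logp[i][c]
                stayed[i][c] = False  # a fence sits just above row i
            else:
                best[i][c] = stay + logp[i][c]
    # Walk back from the last row, which must belong to the last class.
    labels = [0] * n
    c = k - 1
    for i in range(n - 1, -1, -1):
        labels[i] = c
        if i > 0 and not stayed[i][c]:
            c -= 1
    return labels

# Made-up probabilities for 7 rows and 3 classes; row 2 leans toward class 2.
example_probs = [
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.10, 0.40, 0.50],
    [0.10, 0.80, 0.10],
    [0.05, 0.90, 0.05],
    [0.05, 0.15, 0.80],
    [0.05, 0.05, 0.90],
]
unconstrained = [row.index(max(row)) for row in example_probs]
print(unconstrained)              # [0, 0, 2, 1, 1, 2, 2]
print(fit_fences(example_probs))  # [0, 0, 1, 1, 1, 2, 2]
```

In the made-up example, the row-by-row best guess puts row 2 in the wrong class, much like the stray 1c rows above, while the ordered assignment moves it into the middle run. Whether this would reach 100% on real data depends on how well the exported probabilities separate neighboring classes.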
