emanuelx
Level I

use euclidean distance on principal components to quantify change (over time)?

Hi everyone!

I'm back with another naive question, for which I would much appreciate some input.
We are currently measuring how groups of cells perform in antibody production and how this changes over time.
We have a bunch of different readouts (highest cell density, size, waste product production, ... and of course the amount of antibody they produce). We have these for when the cells were "young", "middle aged" and "old". For some groups the amount of antibody they produce changes, but so do some other attributes (not always the same ones, and not always in the same direction). The principal question I want to answer is whether they only produce less antibody while everything else stays the same, or whether they change as a whole and the amount they produce is just one of the things that becomes different. Empirically I can say the latter is (mostly) the case, but I want to quantify this and, in particular, put it in a nice figure (for presentation purposes, not detailed analysis).

So my idea was to perform a PCA on the attributes other than the amount they produce (no clear trends here) and then measure the Euclidean distance along the first 4 PCs (describing about 80% of the variation / the "knee" in the scree plot) to the respective "youngest" cells. The idea is that the bigger this distance, the more different the cells have become from their ancestors. Indeed, when I plot this distance against the change in the amount they produce, the correlation is pretty decent. I understand of course that this doesn't tell me anything about the mechanisms behind it, but I was wondering whether people feel this makes any sense. I assume I could skip the whole PCA step and just compute the Euclidean distance over all the attributes, but PCA seems like a good way to both center the data and reduce the number of columns / dimensions.
What do you think? P.s.: if you don't like cells, just think of plants!
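
For readers who want to try the idea outside JMP, here is a minimal Python sketch of the approach described above, not the poster's actual analysis. It assumes the readouts are standardized before the PCA (as a correlation-based PCA does), and everything in it is hypothetical: the function name `pc_distance_to_baseline`, the synthetic data, and the `baseline_idx` array that maps each row to the row of its "young" measurement.

```python
import numpy as np

def pc_distance_to_baseline(X, baseline_idx, n_components=4):
    """Euclidean distance of each row to its baseline row, measured in PC-score space.

    X            : (rows x readouts) array, antibody amount NOT included
    baseline_idx : for each row, the index of the row holding its "young" measurement
    """
    X = np.asarray(X, dtype=float)
    # Standardize so readouts with different units contribute comparably
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # PCA via SVD: rows of Vt are the principal directions
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    # Fraction of total variance captured by the retained components
    explained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    dist = np.linalg.norm(scores - scores[baseline_idx], axis=1)
    return dist, explained

# Synthetic stand-in data (an assumption): 3 cell lines, 3 ages, 6 readouts
rng = np.random.default_rng(0)
n_lines, n_readouts = 3, 6
young = rng.normal(size=(n_lines, n_readouts))
drift = rng.normal(size=(n_lines, n_readouts))
X = np.vstack([young, young + 0.5 * drift, young + 1.0 * drift])
baseline_idx = np.tile(np.arange(n_lines), 3)  # each row points at its own "young" row

dist, explained = pc_distance_to_baseline(X, baseline_idx, n_components=4)
print(dist.round(2), round(float(explained), 2))
```

One point relevant to the "skip the PCA" question: if all components are kept, the PCA is only a rotation, so the distance is identical to the plain Euclidean distance on the standardized readouts. Truncating to the first few PCs only drops the contribution of the minor components. Centering by itself does not change distances between rows; the scaling is what matters.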

1 REPLY
ih
Super User (Alumni)

Re: use euclidean distance on principal components to quantify change (over time)?

Could you build a PCA model using a baseline period and score new rows against that model with the Model Driven Multivariate Control Chart (MDMCC) platform, introduced in JMP 15? The T² value would indicate whether there was a shift from the baseline period that is explained by the model, the SPE (squared prediction error) would indicate whether the new data still fit the model built on the baseline period, and the contributions to each would indicate which variables caused the variation.
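
To make the two statistics concrete, here is a rough Python sketch of how T² and SPE are commonly computed from a baseline PCA. It is not JMP's implementation, it leaves out control limits, and the function names and synthetic data are hypothetical. It assumes standardization with the baseline mean and standard deviation.

```python
import numpy as np

def fit_baseline_pca(X_base, n_components):
    """Fit a PCA on baseline rows only; return what is needed to score new rows."""
    mean = X_base.mean(axis=0)
    std = X_base.std(axis=0, ddof=1)
    Z = (X_base - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                                # loadings (variables x components)
    var = (s[:n_components] ** 2) / (len(X_base) - 1)      # variance of each baseline score
    return mean, std, P, var

def t2_and_spe(X, mean, std, P, var):
    """Score rows against the baseline model."""
    Z = (X - mean) / std
    T = Z @ P                               # scores in the baseline PC space
    t2 = np.sum(T ** 2 / var, axis=1)       # T²: distance inside the model plane
    resid = Z - T @ P.T                     # part of each row the model cannot explain
    spe = np.sum(resid ** 2, axis=1)        # SPE: squared distance off the model plane
    return t2, spe

# Synthetic example (an assumption): 5 variables driven by 2 latent factors
rng = np.random.default_rng(1)
latent = rng.normal(size=(30, 2))
mixing = rng.normal(size=(2, 5))
X_base = latent @ mixing + 0.1 * rng.normal(size=(30, 5))
X_new = X_base[:3] + np.array([0.0, 0.0, 3.0, 0.0, 0.0])  # shift one variable only

mean, std, P, var = fit_baseline_pca(X_base, n_components=2)
t2_base, spe_base = t2_and_spe(X_base, mean, std, P, var)
t2_new, spe_new = t2_and_spe(X_new, mean, std, P, var)
print(t2_new.round(2), spe_new.round(2))
```

Read this way, a large T² means a row moved far along directions the baseline already varies in, while a large SPE means it moved in a direction the baseline model does not describe at all.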

 

To do this:

  • In the data table, put the baseline period at the top of the table
  • Hide/exclude the data not in the baseline period
  • Build the PCA model and save the components
  • Unhide/unexclude the new data
  • Open the MDMCC platform, add the saved components, and set the row at which the historical data ends
  • Add SPE chart
  • Select current or new rows and add mean contribution proportion plots for both T2 and SPE
  • Also look at the contribution heat maps
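
As a rough illustration of what a mean contribution proportion for SPE represents, here is a small Python sketch. It uses one common definition (each variable's squared residual divided by the row's SPE, averaged over the selected rows); JMP's exact calculation may differ, and the function name and data are hypothetical.

```python
import numpy as np

def spe_contribution_proportions(X_new, X_base, n_components):
    """Mean share of SPE attributable to each variable, for rows scored on a baseline PCA."""
    mean = X_base.mean(axis=0)
    std = X_base.std(axis=0, ddof=1)
    # Fit the PCA on baseline rows only
    _, _, Vt = np.linalg.svd((X_base - mean) / std, full_matrices=False)
    P = Vt[:n_components].T
    # Standardize new rows with the BASELINE mean and std, then take model residuals
    Z = (X_new - mean) / std
    resid = Z - Z @ P @ P.T
    contrib = resid ** 2                                  # per-variable SPE contribution
    prop = contrib / contrib.sum(axis=1, keepdims=True)   # each row sums to 1
    return prop.mean(axis=0)                              # average over the selected rows

# Synthetic example (an assumption): the same kind of data as a 2-factor process
rng = np.random.default_rng(2)
X_base = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(30, 5))
X_new = X_base[:4] + np.array([0.0, 0.0, 0.0, 3.0, 0.0])  # disturb one variable

props = spe_contribution_proportions(X_new, X_base, n_components=2)
print(props.round(2))
```

The variables with the largest shares are the first candidates to inspect, keeping in mind that the projection can spread a disturbance in one variable over its correlated neighbours.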