doechemist
Occasional Contributor

How to best use discrete principal component data points in a design?

Hi there,

 

I am on my way to investigate a chemical reaction, where I would like to vary different factors in the experiment:

  • Temperature (I probably need a split-plot design for this; I am going to run several tests at the same temperature at the same time)
  • Reaction time
  • Concentration
  • Ratio of different reagents
  • and the first three principal components (PCs) of the solvent

I really like the concept of PCs to describe the solvents, as this gives a better (and probably a qualitative) discrimination between different solvents.

 

What troubles me is that I cannot change the PCs of the solvents continuously: my range of candidate solvents gives me a 'cloud' of discrete points in 3D space instead. Each solvent has its own fixed set of PC values.

I might be able to circumvent this issue by selecting solvents 'close enough' to the targeted design points (+1/-1 in each PC dimension) and then use the true (coded) coordinates for the analysis.
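
The 'close enough' idea above can be sketched in a few lines. This is only an illustration: the solvent names and coded PC coordinates are made up, and `nearest_solvent` is a hypothetical helper, not anything from JMP.

```python
import math
from itertools import product

# Hypothetical coded (PC1, PC2, PC3) coordinates for a few candidate solvents.
candidates = {
    "solvent_A": (0.9, 0.8, -0.7),
    "solvent_B": (-0.8, 0.9, 0.6),
    "solvent_C": (0.7, -0.9, 0.8),
    "solvent_D": (-0.9, -0.7, -0.8),
    "solvent_E": (0.1, 0.0, 0.2),
}

def nearest_solvent(target, pool):
    """Return (name, distance) of the candidate closest to the target point."""
    name = min(pool, key=lambda k: math.dist(pool[k], target))
    return name, math.dist(pool[name], target)

# Target design points: the 2^3 corners of the coded cube (+1/-1 in each PC).
corners = list(product((-1, 1), repeat=3))
assignment = {c: nearest_solvent(c, candidates) for c in corners}

for corner, (name, dist) in assignment.items():
    # The analysis would then use candidates[name], not the corner itself.
    print(corner, "->", name, candidates[name], f"distance {dist:.2f}")
```

A large distance for a corner means no candidate sits near that design point, which is the situation described next.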

However, due to a restriction in chemical stability, all I have left is two classes of solvents with PC1-3 coordinates that sort of form two distinct domains in the PC1-3 space. This makes it impossible to find solvents sufficiently close to any reasonable design point.

 

I have tried to use the custom design tool by putting all my solvents and their data into a table and then using them as covariate factors (cf. the DOE manual). Trying different numbers of runs, I find I need a lot of runs (>80, JMP suggests 125) to get reasonable power (0.8) for the PCs and to avoid aliasing. For some numbers of runs, there seems to be partial aliasing of main effects as well as interaction terms. Sometimes, JMP doesn't give me any estimate of power.

 

Is there any smarter way to implement the PCs in my design? Could I ignore the low power for the PCs when using much fewer runs, hoping to a) still detect an effect anyway or b) remove certain non-significant factors and keep the PCs for the next iteration?

 

(JMP 14.0.0)

phil_kay
Staff

Re: How to best use discrete principal component data points in a design?

Great question.

Using your 3 PCs as covariate factors in a Custom Design is, by definition, the optimal approach. In a situation like this you would not expect to be able to generate a perfectly orthogonal design. There will be some correlation of effects or "partial aliasing" as you call it. (Brad Jones recently pulled me up on using that term - effects are either aliased or not, partial doesn't really make sense).
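
To illustrate what picking runs from a fixed candidate table means, here is a rough sketch of a generic row-exchange search that maximises det(X'X) for a main-effects model. It is not JMP's algorithm, and Custom Design's actual criterion and search may differ. The candidate points are randomly generated stand-ins for two solvent clusters, and the sketch assumes each candidate is used at most once.

```python
import random

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(a[r][i]))
        if abs(a[p][i]) < 1e-12:
            return 0.0
        if p != i:
            a[i], a[p] = a[p], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return d

def info_det(rows):
    """det(X'X) for an intercept plus main-effects model."""
    X = [(1.0, *r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    return det(xtx)

def row_exchange(candidates, n_runs, seed=0):
    """Swap design rows for unused candidate rows while det(X'X) improves."""
    design = random.Random(seed).sample(range(len(candidates)), n_runs)
    best = info_det([candidates[i] for i in design])
    improved = True
    while improved:
        improved = False
        for pos in range(n_runs):
            for cand in range(len(candidates)):
                if cand in design:
                    continue
                trial = design[:pos] + [cand] + design[pos + 1:]
                d = info_det([candidates[i] for i in trial])
                if d > best * (1 + 1e-9):
                    design, best, improved = trial, d, True
    return sorted(design), best

# Made-up candidate cloud: two clusters in (PC1, PC2, PC3).
rng = random.Random(1)
cluster_a = [(rng.gauss(-1, 0.3), rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)]
cluster_b = [(rng.gauss(1, 0.3), rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)]
candidates = cluster_a + cluster_b

start = random.Random(0).sample(range(len(candidates)), 8)
start_det = info_det([candidates[i] for i in start])
design, final_det = row_exchange(candidates, n_runs=8, seed=0)
print("chosen candidate rows:", design)
print("det(X'X) improvement factor:", final_det / start_det)
```

The search tends to pick points at the edges of the candidate cloud, so the achievable information is limited by how far the available solvents spread in each PC.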

I am surprised that you require so many runs for adequate power. But remember power is also dependent on the signal size you wish to detect and the expected experimental noise. Are you happy with your estimates of these?

It is possible that your power is very low because your "library" of solvents does not cover the PC space well.
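
As a rough illustration of how power depends on signal size, noise, and how widely a covariate is spread, here is a minimal sketch for a single-factor straight-line model. It uses a normal approximation, which is an assumption made for simplicity (design software typically uses an exact noncentral distribution), and all numbers are made up.

```python
from statistics import NormalDist, mean

def approx_power(x, beta, sigma, alpha=0.05):
    """Approximate power to detect slope `beta` in y = b0 + beta*x + noise.

    Normal approximation only (assumption): ignores the error degrees of freedom.
    """
    xbar = mean(x)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    se = sigma / sxx ** 0.5              # standard error of the slope estimate
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = abs(beta) / se               # signal-to-noise of the estimate
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

# Hypothetical coded covariate values for 12 runs.
wide = [-1.0] * 6 + [1.0] * 6      # candidates span the full coded range
narrow = [0.6] * 6 + [1.0] * 6     # candidates cover only a small part of it

print("wide spread:  ", round(approx_power(wide, beta=1.0, sigma=1.0), 3))
print("narrow spread:", round(approx_power(narrow, beta=1.0, sigma=1.0), 3))
print("zero signal:  ", round(approx_power(wide, beta=0.0, sigma=1.0), 3))
```

With the same number of runs, signal and noise, the narrow spread gives far lower power than the wide one (roughly 0.1 versus 0.9 here). With an anticipated coefficient of 0, this formula simply returns alpha.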

It might help if you are able to post your data, suitably anonymised, if required.

Regards,
Phil
phil_kay
Staff

Re: How to best use discrete principal component data points in a design?

Sorry, just re-reading your post: "due to a restriction in chemical stability, all I have left is two classes of solvents with PC1-3 coordinates that sort of form two distinct domains in the PC1-3 space"

So, as I suspected, your solvent library does not cover the space well. You can only learn about the effect of a factor by changing it.

However, the fact there are 2 distinct groups should mean that you can learn about the effect of using solvent group A vs solvent group B.

Are these 2 groups separated in the PC space according to just one of the PCs?

Again, it would really help to see the data and the PCA and Design dialog.
doechemist
Occasional Contributor

Re: How to best use discrete principal component data points in a design?

Hi,

Thanks for your thoughts. Below I have attached plots of PC1/PC2 and PC1-3 of the two classes of solvents, esters (blue) and alcohols (orange). In total 91 different solvents.

[Figures: PC12.png (PC1 vs. PC2), PC123.png (PC1-3)]

 

I tried to reproduce what I did last time, but it has been a long time since I last had time to tinker with JMP.

Anyhow, I added PC1-3 values in a table and inserted them as covariates in the design (see below).

I just want to know the impact of the main effects as a start, just as a screen, so I defined a model with the interactions set to "Estimability: If Possible".

 

jmp1.png

 

This time, JMP suggested 20 runs in 4 whole plots (the temperature is hard to change). I just have a hard time decoding the alias matrix: I know 0 means no correlation and 1 means complete correlation, but how should I interpret all the other numbers? I'd like to relate them to the contribution of each effect or interaction to the estimated parameters of the postulated response function (here just linear).

Anyhow, in this case (not shown) it doesn't provide a power estimate - but then the anticipated coefficients are also all set to 0?

 

jmp2.png

 

Hope this helps!

phil_kay
Staff

Re: How to best use discrete principal component data points in a design?

Thanks for your explanation. This is a really interesting application of DoE.

 

To look at the aliasing or correlation of effects I personally prefer to look at the color map on correlations to get a more immediate understanding.
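
For reference, the textbook definition of the alias matrix may help when reading the intermediate values. This is the standard ordinary least-squares form, not something taken from the output in this thread, and a split-plot analysis would additionally weight by the error structure. Here X_1 is the model matrix of the terms you fit and X_2 holds the columns of terms left out of the model that might still be active:

```latex
\hat{\beta}_1 = (X_1^\top X_1)^{-1} X_1^\top y, \qquad
E[\hat{\beta}_1] = \beta_1 + A\,\beta_2, \qquad
A = (X_1^\top X_1)^{-1} X_1^\top X_2
```

So the entry in row i, column j of A is the bias added to the estimate of fitted term i per unit of the true coefficient of omitted term j. A value of 0.3, for example, means that 30% of that omitted effect would show up in that estimate if the omitted effect is real.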

 

I explain correlation of effects in my DoE blog series. In particular, post #4 looks at correlation color maps and what happens when you have strongly correlated effects.

 

 

doechemist
Occasional Contributor

Re: How to best use discrete principal component data points in a design?

Thanks for the links - I'll read it as soon as I can.

To help, I've included the correlation map below.

jmp3.png

From what I read, it shouldn't look too bad? As George Box might have said, I am going to do new experiments anyway - so if I miss an effect or have one too many, it'll settle out during the following experiments.

phil_kay
Staff

Re: How to best use discrete principal component data points in a design?

Yes, given the large number of effects, that looks really good. Not many red squares. And you can see that there is very little correlation in the main effects, which is exactly what you should expect Custom Design to achieve.
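
For anyone who wants to see what sits behind such a color map, here is a minimal sketch that computes the absolute pairwise correlations between model columns. The 8-run design, the factor names, and the `pearson` helper are made up for illustration and are not the design from this thread.

```python
from itertools import combinations

def pearson(u, v):
    """Pearson correlation between two equal-length columns."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

# Hypothetical design in coded units: two controllable factors and one covariate.
design = {
    "Time": [-1, 1, -1, 1, -1, 1, -1, 1],
    "Conc": [-1, -1, 1, 1, -1, -1, 1, 1],
    "PC1":  [-0.9, -0.7, -1.0, -0.8, 0.8, 1.0, 0.7, 0.9],
}

# Main effects plus all two-factor interaction columns.
terms = dict(design)
for a, b in combinations(design, 2):
    terms[f"{a}*{b}"] = [x * y for x, y in zip(design[a], design[b])]

# Each cell of the color map corresponds to one absolute pairwise correlation.
abs_corr = {(a, b): abs(pearson(terms[a], terms[b]))
            for a, b in combinations(terms, 2)}

for (a, b), r in sorted(abs_corr.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{a:>10} vs {b:<10} |r| = {r:.2f}")
```

Values near 0 correspond to the pale squares; values near 1 are the red ones, where two effects cannot be separated well.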