Discussions

DoE and inputting replicate analytics

MetaLizard62080 · Apr 17, 2024 08:44 AM

Hi,

I have performed a DoE as designed, but would like to know how to evaluate replicate analytical data which feeds into the response for each condition.

We usually use analytical data provided by our analytical department, but are investigating a rapid analysis tool for future applications. While this tool is helpful, it appears most accurate when we perform replicate analyses and average them. Typically, we remove outlier response data manually (by judgment call), but I was hoping this data could be entered into the JMP DoE platform, which would then determine outliers automatically.

How do I input this replicate data, but avoid JMP believing these are true replicates, and not simply replicate analysis of the same condition? If I don't, I assume JMP will provide an "Overly confident" profile, unless I am misunderstanding something.

statman · Apr 17, 2024 11:39 AM

I'm a bit lost on your terminology. I believe the analytical data is repeats in relationship to the treatments in the experiment.

One way to add repeats to the design is to add a column for each repeat. Once these columns are populated, stack them. Then you can graph the within-treatment variation, assess its consistency, and choose appropriate statistics to summarize the repeated measures (e.g., mean, log standard deviation).
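To make the stacking step concrete, here is a minimal sketch in plain Python rather than JMP. The column names (`run`, `rep1`, `rep2`, `rep3`) and the numbers are hypothetical, and this only illustrates the wide-to-long reshaping, not JMP's own Stack command.

```python
def stack_repeats(wide_rows, repeat_columns):
    """Turn one row per run (wide) into one row per measurement (long)."""
    long_rows = []
    for row in wide_rows:
        for col in repeat_columns:
            value = row[col]
            if value is None:  # skip missing measurements
                continue
            long_rows.append({"run": row["run"], "repeat": col, "y": value})
    return long_rows


# Hypothetical wide table: one row per DoE run, one column per analytical repeat.
wide = [
    {"run": 1, "rep1": 10.1, "rep2": 9.9, "rep3": 10.0},
    {"run": 2, "rep1": 14.2, "rep2": None, "rep3": 13.8},
]
long = stack_repeats(wide, ["rep1", "rep2", "rep3"])
for row in long:
    print(row)
```

The long table keeps the run identifier on every row, which is what lets you plot or summarize the within-run variation afterwards.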

"All models are wrong, some are useful" G.E.P. Box

MetaLizard62080 · Apr 17, 2024 10:47 PM

Hi Statman,

I guess what I am trying to ask is: is there a way to define rows in JMP as replicate analytical data rather than replicate run data?

For example, if I perform 12 runs in a DoE, and analyze them 10 times each, I have 120 responses. If I just stack these 12 runs 10x, and apply each replicate response, I can model the data, however, JMP will view these as replicate runs, rather than replicate analytics.

Would JMP view this data incorrectly and shrink the confidence intervals inappropriately assuming process variance was low (Since we would only have analytical variance at this point)?
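The concern can be illustrated with a small sketch in plain Python (not JMP). It assumes a single coded factor, 12 runs, and 10 analytical repeats per run. All numbers are made up for illustration. The same straight line is fitted twice: once to all 120 stacked rows as if they were independent runs, and once to the 12 run means.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (slope, slope standard error)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_variance = sse / (n - 2)
    return slope, math.sqrt(residual_variance / sxx)


# Hypothetical data: 12 runs, one coded factor, 10 analytical repeats per run.
factor = [-1] * 6 + [1] * 6
run_level = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 14.0, 15.0, 13.0, 14.5, 13.5, 14.0]
offsets = [-0.09, -0.07, -0.05, -0.03, -0.01, 0.01, 0.03, 0.05, 0.07, 0.09]

# Naive analysis: all 120 rows treated as independent runs.
x_stacked = [f for f in factor for _ in offsets]
y_stacked = [r + o for r in run_level for o in offsets]
slope_naive, se_naive = fit_line(x_stacked, y_stacked)

# Run-level analysis: one mean per run, 12 rows.
n_rep = len(offsets)
y_means = [sum(y_stacked[i * n_rep:(i + 1) * n_rep]) / n_rep for i in range(len(factor))]
slope_means, se_means = fit_line(factor, y_means)

print(f"naive stacked fit: slope={slope_naive:.3f}, se={se_naive:.3f}")
print(f"run-means fit:     slope={slope_means:.3f}, se={se_means:.3f}")
```

With these made-up numbers the slope is the same in both fits, but the naive stacked fit reports a standard error more than three times smaller, because it counts 118 residual degrees of freedom where only 10 come from independent runs.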

gonzaef · Apr 21, 2024 8:34 AM

Building on what has already been said,

I believe you are doing repeats (repeated measurements, which do not add degrees of freedom), not replicates (independent events, meaning the factor settings are set up again for each run, which do add degrees of freedom).

What I would do is summarize the 120-row table by taking the average for each of your treatments, ending up with a table of only 12 rows (each row is the average of 10 measurements), and use this table to fit your model.

You can use the average to look for factors that may be active in controlling the mean position (location factors). If you also store a dispersion metric (range, standard deviation, or variance), usually log transformed, you can build another model to look for factors that may be active in reducing the process dispersion (dispersion factors).
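As a rough illustration of that summary step, here is a sketch in plain Python (not JMP). The function name `summarize_runs` and the data are hypothetical. It collapses a long table of (run, measurement) pairs into one row per run with the mean and the log of the standard deviation.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def summarize_runs(long_rows):
    """Collapse (run, y) pairs into one summary row per run."""
    groups = defaultdict(list)
    for run, y in long_rows:
        groups[run].append(y)

    summary = []
    for run in sorted(groups):
        values = groups[run]
        sd = stdev(values)  # needs at least two repeats per run
        summary.append({
            "run": run,
            "n": len(values),
            "mean": mean(values),                        # location response
            "log_sd": math.log(sd) if sd > 0 else None,  # dispersion response
        })
    return summary


# Hypothetical repeats for two runs.
rows = [(1, 1.0), (1, 2.0), (1, 3.0), (2, 5.0), (2, 5.5), (2, 6.0)]
table = summarize_runs(rows)
for row in table:
    print(row)
```

The `mean` column would be the response for the location model and `log_sd` the response for the dispersion model. A run whose repeats are all identical has a standard deviation of zero, so its log is left missing here rather than guessed.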

Please let me know if you need any clarification,

Yours truly,
Emmanuel

========================
Keep It Simple and Sequential

MRB3855 · Apr 22, 2024 1:47 AM

Hi @MetaLizard62080 : Another option would be to stack the data as you describe; then, along with the other fixed factors in your DoE, include "run" as a random effect in your model. The results will be the same as analyzing the means (as @gonzaef and @Steffen_Bugge suggested), but you'll get the additional information about run-to-run variability, and within-run variability.
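To show what the extra run-to-run and within-run information looks like, here is a sketch in plain Python. It is not JMP's mixed-model fit. It assumes a balanced layout with a single categorical factor and uses a simple method-of-moments (ANOVA-style) estimate. A mixed-model fit would be expected to give similar numbers in a balanced layout, but that is not verified here. The data and the function name `variance_components` are hypothetical.

```python
from statistics import mean, variance


def variance_components(runs_by_level):
    """Estimate (run-to-run variance, within-run variance).

    runs_by_level maps a factor level to a list of runs, and each run is a
    list of repeated measurements. Assumes the same number of repeats in
    every run (balanced layout).
    """
    all_runs = [run for runs in runs_by_level.values() for run in runs]
    n_rep = len(all_runs[0])

    # Within-run (analytical) variance, pooled over runs of equal size.
    within = mean(variance(run) for run in all_runs)

    # Residual mean square of the run means around their factor-level mean.
    ss, df = 0.0, 0
    for runs in runs_by_level.values():
        run_means = [mean(run) for run in runs]
        level_mean = mean(run_means)
        ss += sum((m - level_mean) ** 2 for m in run_means)
        df += len(run_means) - 1
    ms_means = ss / df

    # The mean square of run means estimates sigma_run^2 + sigma_within^2 / n_rep.
    between = max(ms_means - within / n_rep, 0.0)
    return between, within


# Hypothetical data: 2 factor levels, 3 runs each, 3 repeats per run.
data = {
    "low": [[9.9, 10.0, 10.1], [10.9, 11.0, 11.1], [8.9, 9.0, 9.1]],
    "high": [[13.9, 14.0, 14.1], [14.9, 15.0, 15.1], [12.9, 13.0, 13.1]],
}
between, within = variance_components(data)
print(f"run-to-run variance: {between:.4f}, within-run variance: {within:.4f}")
```

In this made-up example the run-to-run variance is far larger than the analytical variance, which is the situation where treating repeats as runs is most misleading.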

Edit: How many design factors are in your DoE?

Steffen_Bugge · Apr 18, 2024 06:21 AM

Hi @MetaLizard62080

Perhaps this demonstration can give you some indication: https://www.youtube.com/watch?v=zagpxGbfuiA&t=358s

If you have separate columns for each analytical replicate series, you can calculate the mean and the standard deviation and model them both.

GregF_JMP · Apr 24, 2024 11:58 AM

Hello MetaLizard...

I appreciate your initial instinct to ask the question. Repeated measures of the same experimental run are different from a "fresh" repeat of the run, which encompasses all sources of variability: starting materials, adherence to factor settings, etc. I concur with the previous community input on this.

Your initial question had two parts: averaging of repeated measures, and outlier detection and exclusion, both as part of measuring responses in a DOE.

JMP can help with both parts, but they are addressed individually.

I also agree with the other community members' suggestions; my instinct would also be to aggregate in a separate data table, then summarize.

While the savings in time and cost from the "rapid measurement system (prone to outliers, that can be manually identified and excluded)" compared with the "analytical department measures" might make it a great choice for your situation, there are a few caveats.
The two measurement systems might give results that differ by an offset, a slope, or in linearity.

JMP also has great tools for measurement systems evaluation.

If the goal of the experiment is to determine the factor settings that will maximize or minimize a response, any difference in measurement systems will probably not matter.

*But*

If the goal is to determine the factor settings to hit a certain target (as measured by analytic dept)... or
if there is a specification limit for the response (again "as measured by analytical dept") that needs to be avoided as part of factor operating range establishment... or
if the goal is to build (and publish) a model to predict outcomes, given future set of factor settings (for input X's, we expect what value of Y)

.....then a means to translate between measurement systems becomes more relevant.

Regarding outlier detection/exclusion

JMP has a variety of tools that support this effort. Do they make the exact same decisions as your "manually/by judgment call"? It depends. The suggested approach of summarizing multiple rows will work, with a step where outlier rows are reviewed and selectively excluded. Outlier screening could also be built into column formulas that selectively exclude (set to a conditional "missing") rows deemed unusual.
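As one possible illustration of screening built into a formula, here is a sketch in plain Python (not a JMP column formula). The rule is an assumption chosen for the example: within each run, a repeat is set to missing when it lies more than `k` robust standard deviations (based on the median absolute deviation) from the run median. The threshold `k=3.5` and the data are hypothetical, and this rule will not necessarily match a manual judgment call.

```python
from statistics import mean, median


def screen_repeats(values, k=3.5):
    """Return a copy of values with suspected outliers replaced by None."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # No spread to judge against; keep everything rather than guess.
        return list(values)
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [v if abs(v - center) <= k * scale else None for v in values]


# Hypothetical repeats for one run, with one suspicious reading.
repeats = [10.1, 9.9, 10.0, 10.2, 9.8, 15.0]
screened = screen_repeats(repeats)
kept = [v for v in screened if v is not None]
run_mean = mean(kept)

print(screened)
print(f"mean of kept repeats: {run_mean:.3f}")
```

Keeping the screened value as missing instead of deleting the row leaves a record of what was excluded, so the decision can still be reviewed before the run means go into the model.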
