Analyzing pH Data

VariancePony864 · Jan 10, 2024 06:47 PM

Hello-

I am reviewing the attached pH/lot data set and I have a few goals I want to achieve. Could anyone help me with determining the best tests to run in JMP? Thank you!!

1. My pH specification is 6.5-7.1. Based on the results that I have gathered per lot, can I statistically justify changing my lower specification limit to 6.0? And if I can't, what would an acceptable lower limit be based on my data?

2. Based on the dosage form column and the associated pH data set, are there any dosage forms that are outliers?

Victor_G · Jan 11, 2024 2:14 AM

Hi @VariancePony864,

There is not enough information given to help you.

A first option might be to visualize your data before running any tests, to get a better idea of your results and experimental setup.

Some questions to help start the discussion (if not answered by visualization):

What is your goal? Detecting a possible difference between lots, dosage forms, or both? Or running an equivalence test based on your domain expertise (for example, stating that pH results differing by no more than ±0.1 can be considered equivalent, given the repeatability of the equipment) for the different lots and/or dosage forms? Or evaluating variance sources and their relative impact on the pH results with an MSA-type study?
Are the batch IDs unique? Looking at your data, each experiment appears to use a different batch ID, so you don't seem to have any replicates or repetitions in your design. Is this really the case, or do some experiments use the same batch ID? If so, could you add this information? If not, why this choice? Without replicates you cannot estimate the repeatability of the test, which can severely reduce the number of available options and tests, or the confidence in some conclusions (you can't compare other variance sources to the repeatability value, unless you exclude batches as a possible variance source and treat tests from different batches as replicates).
What is the dosage form? A factor you can change easily, or something you can't control? Are only these 5 levels possible, or many more? How were the tests allocated across the different dosage forms (the allocation across these 5 levels looks unbalanced)?

As for your questions, and as emphasized at the beginning, start by visualizing your data; you might be able to answer most of them yourself:

Why would you change your specification limit to 6 ?
Can you spot some outliers or any big differences between dosage forms ?
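
As a numerical companion to a box plot per dosage form, here is a minimal Python sketch (outside JMP) of the usual 1.5 x IQR fence rule applied within each group. The function name, the group labels "A" and "B", and the pH values are hypothetical and are not taken from the attached data set; treat it as an illustration of the idea, not as a formal outlier test.

```python
from collections import defaultdict
from statistics import quantiles


def flag_outliers_by_group(rows, k=1.5):
    """Return {group: [values outside the Tukey fences]} for (group, value) rows."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)

    flagged = {}
    for group, values in groups.items():
        if len(values) < 4:
            # Too few points to estimate quartiles sensibly: flag nothing.
            flagged[group] = []
            continue
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        low, high = q1 - k * iqr, q3 + k * iqr
        flagged[group] = [v for v in values if v < low or v > high]
    return flagged


# Hypothetical (dosage form, pH) pairs, for illustration only.
rows = [
    ("A", 6.8), ("A", 6.9), ("A", 6.7), ("A", 6.8), ("A", 6.0),
    ("B", 6.9), ("B", 7.0), ("B", 6.9), ("B", 7.0),
]
print(flag_outliers_by_group(rows))
```

With unbalanced groups, the fences for small groups are poorly estimated, so a flagged point is a prompt to look at the plot, not a conclusion.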

I hope this first response will help you,

Victor GUILLER
L'Oréal Data & Analytics

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

statman · Jan 11, 2024 10:35 AM

Here are my thoughts:

As Victor points out, there is not enough situational information to provide good advice, so take my input with a grain of salt. How was the data obtained? Do you know the measurement system error?

First, since you are asking about the use of JMP, it would be better in the future to attach JMP files instead of Excel files.

I'm confused by the first question. The specification (i.e., the voice of the customer) should state the target value and the acceptable variation around that target for the required performance. Perhaps you use the term specification to mean something else? You may be able to determine the expected limits of variation from your data set. I have attached a JMP data table with a script that will show a distribution of the pH values (click on the green arrow). There appear to be some outliers.

It does look like dosage 30 has some outliers associated with it (see result vs. dosage). See also IR of result.
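
For readers without the attached table, the arithmetic behind an individuals and moving range (IR) chart can be sketched in Python as below. This assumes the results are listed in production or measurement order; the function name and the pH values are hypothetical and not taken from the data set. The constants 2.66 and 3.267 are the standard chart factors for moving ranges of two consecutive points.

```python
from statistics import mean


def ir_limits(values):
    """Individuals-chart limits from the average moving range (span 2)."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Absolute difference between each pair of consecutive results.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / d2, with d2 = 1.128 for ranges of two points.
    half_width = 2.66 * mr_bar
    return {
        "lcl": center - half_width,
        "center": center,
        "ucl": center + half_width,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,  # upper limit for the moving range chart
    }


# Hypothetical pH results in run order, for illustration only.
ph = [6.8, 6.9, 6.8, 6.7, 6.8, 6.2, 6.8, 6.9]
limits = ir_limits(ph)
print(limits)
print([x for x in ph if x < limits["lcl"] or x > limits["ucl"]])
```

These limits describe what the process is currently delivering. They are not a specification, and large excursions inflate the average moving range, so extreme points can widen the limits that are meant to detect them.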

"All models are wrong, some are useful" G.E.P. Box
