Using Generalized Regression To Analyze Designed Experiments With Detection Limited Responses (2023-EU-30MP-1308)

Fangyi Luo, Group Scientist, Procter & Gamble
Christopher Gotwalt, Chief Data Scientist, JMP
Beatrice Blum, Senior Scientist, Procter & Gamble

Most measurement systems have detection limits above or below which one cannot accurately measure the quantity of interest. Although detection-limited responses are common in many application areas, such as the pharma, chemical, and consumer products industries, they are often ignored in the analysis. Ignoring detection limits introduces bias in the results and can drastically lower the power to detect active effects. Fortunately, the Custom Designer and Generalized Regression in JMP® make incorporating detection limits easy and automatic. In this presentation, we will use simulated versions of real designed experiments to show how to get the analysis right in JMP® Pro 17 and the pitfalls that will occur if detection limits are ignored in the analysis. We will also show how simple graphical tools can identify parts of the design region that could be problematic or even make it impossible to estimate certain model terms or interactions. Our examples will include an experiment designed to maximize the yield of a chemical product and one where the response is a reduction in the number of microorganisms in microbial susceptibility testing of consumer cleaning products.

Hi, I'm Chris Gotwalt with JMP, and I'm presenting with Fangyi Luo of Procter & Gamble, and her colleague, Beatrice Blum, who'll be joining us for the in-person presentation at the Discovery Conference in Spain.

Today, we are talking about how to model data from designed experiments when the response is detection limited. This is an important topic because on the one hand, detection limits are very common, especially in industries that do a lot of chemistry, like the pharmaceutical and consumer products industries.

On the other hand, ignoring detection limits leads to seriously inaccurate conclusions that will not generalize. This leads to lost R&D time and inefficient use of resources. The good news that we are here to show today is that getting the analysis right is trivially easy if you are using Generalized Regression in JMP Pro and know how to set up the detection limits column property.

In this talk, we're going to give a brief introduction to censored data, explaining what it is, what it looks like in histograms, and how you analyze it a little bit differently. Then Fangyi is going to go into the analysis of some designed experiments from Procter & Gamble. Then I'm going to go through an analysis of a larger data set than the one that Fangyi introduced. Then we're going to wrap up with a summary and some conclusions.

What are detection limits? Detection limits are when the measurement system is unable to measure, at least reliably, when the actual value is above or below a particular value. If the actual value is, say, above an upper detection limit, the measured value will be observed as being at that limit. For example, if a speedometer in a vehicle only goes to 180 kilometres an hour, but you are driving 200 kilometres an hour, then the speedometer will just read 180 kilometres an hour.

In the graphs above, we see another example. We see five histograms of the same data. The true or actual values are over here on the left, and moving to the right, we see what results when you apply an increasing detection limit to this data.

What happens is we see this characteristic bunching at the detection limit. When you see this pattern, it's a really good sign that you may need to think about taking detection limits into account in your distributional or regression analysis.
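The bunching can be reproduced in a few lines of Python. This is only a sketch on simulated data: the lognormal sample, the three limits, and the helper name `apply_upper_limit` are assumptions made for illustration, not the data shown in the histograms.

```python
import numpy as np

def apply_upper_limit(values, limit):
    """Return what the instrument would report, plus a censoring flag."""
    values = np.asarray(values, dtype=float)
    observed = np.minimum(values, limit)  # readings above the limit pile up at the limit
    censored = values > limit             # True where the true value was never actually seen
    return observed, censored

# Hypothetical "true" values; in practice these are unknown.
rng = np.random.default_rng(0)
true_values = rng.lognormal(mean=2.0, sigma=0.5, size=1000)

for limit in (20.0, 12.0, 8.0):
    observed, censored = apply_upper_limit(true_values, limit)
    print(f"limit={limit:>5}: {censored.mean():.0%} of readings sit at the limit")
```

In the speedometer example, a true speed of 200 with a limit of 180 is reported as 180 and flagged as censored.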

Why should we care about detection limits in a data analysis? Well, if you don't take your detection limits into account properly, you'll end up with very heavily biased results, and this leads to very poor model generalization. The regression coefficients will be way off. You'll have an incorrect response surface, which means that matching targets with the Profiler will be way off.

I think the situation is a little bit less dire when maximizing a response, but there's still quite a lot of opportunity for things to go wrong. In particular, sigma, your variance estimate, will still be way off, which leads to much lower power and completely unreliable p-values. The tendency is that variable selection methods will heavily under-select important factors. The actual impact that a factor has on your response will be dramatically understated if you don't take the detection limits into account.

The two tables of parameter estimates that we see here illustrate this very nicely. On the left are the parameter estimates from a detection-limited LogNormal analysis of a regression problem. On the right are the resulting parameter estimates when we ignore the detection limit. We see that the model on the left is a lot richer and that a lot of our main effects, interactions, and quadratic terms have been admitted into the model.

On the right, when we ignore the detection limit, we're only able to get one main effect and its quadratic term included in the model, and the quadratic term is heavily overstated, with a value of about negative 11.5 compared with just negative 3 in the proper analysis.

We see that we're really missing it on a lot of the other parameters here as well. When we take a look at this in the Profiler, this becomes really apparent. On the left, we have the Profiler of the model correctly analyzed with the Limit of Detection, and we see that all the factors are there, and overall, the response surface is pretty rich-looking.

On the right, we see that only the one factor, dichloromethane, has been included in the model. The solution to the problem that you would get with the model on the left is likely rather different from the one that you would get with the model on the right.
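The understated effects can be demonstrated on simulated data. The sketch below is not the Generalized Regression algorithm and not the experiment shown on the slides: it fits a normal linear model with a right-censored response using a textbook EM iteration and compares it with ordinary least squares applied to the limited readings. The true slope of 3, the limit of 11, and all function names are assumptions chosen for illustration.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc, otypes=[float])

def norm_pdf(z):
    return np.exp(-0.5 * z**2) / math.sqrt(2 * math.pi)

def norm_sf(z):
    """Upper tail probability of the standard normal."""
    return 0.5 * _erfc(z / math.sqrt(2))

def fit_right_censored(X, y, censored, n_iter=300):
    """EM fit of a normal linear model; flagged rows are right censored at y."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the naive fit
    sigma = np.std(y - X @ beta)
    for _ in range(n_iter):
        mu = X @ beta
        alpha = (y[censored] - mu[censored]) / sigma
        lam = norm_pdf(alpha) / norm_sf(alpha)   # inverse Mills ratio
        # E-step: replace censored readings by their conditional mean above the limit
        y_fill = y.copy()
        y_fill[censored] = mu[censored] + sigma * lam
        extra_var = np.zeros_like(y)
        extra_var[censored] = sigma**2 * (1 + alpha * lam - lam**2)
        # M-step: least squares on the completed data, then update sigma
        beta = np.linalg.lstsq(X, y_fill, rcond=None)[0]
        sigma = math.sqrt(np.mean((y_fill - X @ beta) ** 2 + extra_var))
    return beta, sigma

# Simulated one-factor experiment (hypothetical values).
rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(-1, 1, n)
y_true = 10 + 3 * x + rng.normal(0, 1, n)
limit = 11.0
y_obs = np.minimum(y_true, limit)
censored = y_true > limit
X = np.column_stack([np.ones(n), x])

naive_beta = np.linalg.lstsq(X, y_obs, rcond=None)[0]
em_beta, em_sigma = fit_right_censored(X, y_obs, censored)
print("slope ignoring the limit:  ", round(float(naive_beta[1]), 2))
print("slope accounting for limit:", round(float(em_beta[1]), 2))
```

The naive slope is pulled toward zero because every reading above the limit is treated as if it were exactly at the limit, which is the same mechanism that mutes the factor effects in the Profiler comparison.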

Thanks, Chris. Now I'm going to share with you a little bit of background on the experiment behind the data mentioned by Chris, the time to bacterial detection. The objective of that experiment was to understand the hostility impacts of our formulation ingredients, or factors, on a liquid consumer cleaning formulation.

The experiment was a micro-hostility Design of Experiment with 36 samples, and we have five key formulation factors: A, B, C, D, and E. We have two responses from this experiment, both from microbial testing. The first one was the one mentioned by Chris. It is the time to bacteria detection within two days, measured in hours. If we are not able to detect the bacteria in two days, then the time to bacteria detection is right censored at 48 hours. So the Limit of Detection for this endpoint is 48 hours.

The other endpoint is the log reduction in mold from Micro Susceptibility Testing. For this endpoint, we add a certain amount of mold to the formulation, wait for two weeks, and measure the amount of mold in the product after two weeks. Then we calculate the reduction in log base 10 mold, and this is the second endpoint.

The Limit of Detection for this endpoint is six units. This shows you the detailed data from the experiment, the first 15 samples. You can see the formulation factors A, B, C, D, and E, and they were from a response surface design. We have two endpoints, the bacteria detection time in hours and the log reduction in mold. The data highlighted in red are right censored.

We can use histograms and scatterplots to visualize our data as well as the relationship between the factors and censoring. As you can see from the histogram, more than 50% of the samples are right censored at 48 hours. Most of the uncensored observations are below 15 hours.

On the right, we have the scatterplot. The red circles indicate the censored data points. You can see that we have censoring at all levels of the factors except for factor C. We don't have censoring at the higher level of C, but we observe censoring at all levels of the other factors.
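The same check can be done numerically by tabulating the share of censored runs at each factor level. The sketch below uses pandas on a made-up six-run table; the column names, the values, and the helper `censoring_by_level` are assumptions for illustration only.

```python
import pandas as pd

def censoring_by_level(df, factors, response, upper_limit):
    """Fraction of runs at or above the upper limit, for each level of each factor."""
    flagged = df.assign(censored=df[response] >= upper_limit)
    return {f: flagged.groupby(f)["censored"].mean() for f in factors}

# Hypothetical runs: two coded factors and a detection time limited at 48 hours.
runs = pd.DataFrame({
    "C": [-1, -1, 0, 0, 1, 1],
    "D": [-1, 1, -1, 1, -1, 1],
    "Detection Time": [48.0, 48.0, 12.0, 48.0, 9.5, 14.0],
})

rates = censoring_by_level(runs, ["C", "D"], "Detection Time", upper_limit=48.0)
for factor, rate in rates.items():
    print(factor, rate.to_dict())
```

A level with a rate of 0 or 1 marks a part of the design region where the response is either never or always censored, which is worth knowing before fitting interactions or quadratic terms there.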

In JMP Pro 16 and higher, we can specify a column property for detection limits. When you go to Column Properties, you find Detection Limits, and then you can specify the lower detection limit and the upper detection limit. If a data point is below the lower detection limit, it is left censored at the lower detection limit. If a data point is higher than the upper detection limit, it is right censored at the upper detection limit.
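The rule described here can be written out as a small Python function. This is a sketch of the logic only, not JMP's implementation; in particular, treating a recorded value exactly at a limit as censored is an assumption, and the function name is hypothetical.

```python
def censoring_status(value, lower=None, upper=None):
    """Classify a recorded value against optional lower and upper detection limits.

    Assumption: a reading recorded exactly at a limit is treated as censored.
    """
    if lower is not None and value <= lower:
        return "left"      # true value is somewhere at or below the lower limit
    if upper is not None and value >= upper:
        return "right"     # true value is somewhere at or above the upper limit
    return "observed"

# Detection times in hours with an upper detection limit of 48.
for hours in (9.5, 14.0, 48.0):
    print(hours, censoring_status(hours, upper=48))
```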

For the bacterial detection time, we have an upper detection limit of 48 hours, so we put 48 in the upper detection limit box. After we specify the detection limit in the column property, we can use Generalized Regression in JMP to analyze the data, taking the Limit of Detection into account. This is a new feature in JMP Pro 16 and higher.

For this type of analysis, we first need to specify the distribution for the response and the estimation method. We tried different distributions for the data with the forward selection method, and we found that the Normal distribution fits the data best because it has the lowest AICc.
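For reference, AICc is the usual AIC with a small-sample correction: AICc = -2 logL + 2k + 2k(k + 1)/(n - k - 1), where k is the number of estimated parameters and n is the number of observations. A minimal Python sketch follows; the log-likelihoods and parameter counts are hypothetical numbers, not results from this experiment.

```python
def aicc(log_likelihood, k, n):
    """Corrected Akaike information criterion; smaller is better."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return -2.0 * log_likelihood + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

# Hypothetical fits to a 36-run experiment: (log-likelihood, number of parameters).
candidates = {"Normal": (-100.0, 5), "LogNormal": (-104.0, 5), "Gamma": (-103.0, 5)}
scores = {name: aicc(ll, k, n=36) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

Because the correction term grows quickly as k approaches n, AICc penalizes large models heavily in small designed experiments.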

We can also analyze the data ignoring the detection limit. You can see that we will have a much smaller model with five factors left in the final model. The model ignoring the Limit of Detection will have much less power to detect significant factors.

This shows you the factors left in the final model from the Generalized Regression modeling, both when we take the Limit of Detection into account and when we ignore it. As you can see, if we take the Limit of Detection into account, we have many more significant factors in the model. If we ignore the Limit of Detection for our response, we can only detect the effects of C and D and their quadratic effects.

Again, this is a comparison of the parameter estimates from the model when we consider the Limit of Detection and when we ignore it. Ignoring the Limit of Detection in the modeling gives us biased parameter estimates as well.

This slide shows you the Prediction Profiler of the response when we perform the modeling considering the Limit of Detection versus ignoring it. If we consider the Limit of Detection in the modeling, then we get a model with all the main effects as well as some of the interaction and quadratic terms. This model makes much more sense to our collaborators.

Remember that at the lower level of C and at the higher level of D, we have more censored data. That means the detection time is longer, and the Prediction Profiler shows that at the lower level of C and the higher level of D, the predicted detection time is longer. Also, because we have more censored data in those regions, the confidence interval in the Prediction Profiler is wider there.

If we ignore the Limit of Detection in the analysis, we get far fewer significant factors. Only C and D show up in the model, and the parameter estimates are also biased. This one shows you the diagnostic plot of observed data on the y-axis versus predicted data on the x-axis.

If we consider Limit of Detection in the generalized regression modeling, it gives correct prediction. But if we ignore Limit of Detection in the modeling, then it will give incorrect prediction for your data.

In addition to the Prediction Profiler, Generalized Regression in JMP also gives you two profilers similar to those from the Parametric Survival Modeling platform. Those are the Distribution Profiler and the Quantile Profiler. The Distribution Profiler gives you the failure probability at a certain combination of our formulation factors and a certain detection time.

The Quantile Profiler gives you the quantile of the detection time at a certain combination of our formulation factors and a specified failure probability. These two profilers are available in JMP under Generalized Regression.
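For a Normal response distribution, the two quantities are the cumulative distribution function and its inverse, evaluated at the predicted mean for a given factor setting. The Python sketch below uses the standard library; the mean of 20 hours and scale of 5 hours are hypothetical values standing in for a fitted model's prediction, and the function names are illustrative.

```python
from statistics import NormalDist

def failure_probability(mu, sigma, t):
    """Distribution-profiler quantity: P(detection time <= t) at a factor setting."""
    return NormalDist(mu, sigma).cdf(t)

def detection_time_quantile(mu, sigma, p):
    """Quantile-profiler quantity: the time by which a fraction p has been detected."""
    return NormalDist(mu, sigma).inv_cdf(p)

# Hypothetical prediction at one combination of formulation factors.
mu, sigma = 20.0, 5.0
print("P(detected by 24 h):", round(failure_probability(mu, sigma, 24.0), 3))
print("90th percentile (h):", round(detection_time_quantile(mu, sigma, 0.9), 2))
```

The two functions are inverses of each other, which is why the two profilers carry the same information viewed from different axes.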

But one advantage of using Generalized Regression to analyze time-to-failure data is that it also provides the Prediction Profiler, which is much easier for our collaborators to understand. It's much harder to explain the Distribution Profiler and Quantile Profiler to them.

Now we come to the analysis of the second endpoint, the log reduction in mold. Again, we can use the histogram and the scatterplot to visualize our data and the relationship between the factors and censoring. As you can see from the histogram on the left, we have a lot of data that are right censored at six units.

We can see censoring at all levels of our formulation factors, except at the higher level of C and the lower level of E. This is the region of concern. We have seen a lot of censoring at the lower level of C and the higher level of E. That means the lower level of C and the higher level of E are good for the product: we have higher log mold reduction there.

Again, we can use the detection limits column property to specify the Limit of Detection for this endpoint. We used an upper detection limit of six in this column property. The next step is to analyze this data using Generalized Regression, taking the Limit of Detection into account. We use the LogNormal distribution and forward selection.

Interestingly, we found that the RSquare is one, and this is very suspicious. We also see some other red flags. The AICc had a severe drop after step 17. The standard errors of the estimates as well as the estimate of the scale parameter seem to be extremely small. Also, the diagnostic plot showed perfect prediction from the model. We know that the model has overfit.

This is the Prediction Profiler, and it showed very narrow confidence intervals for the prediction, so we knew that our model was overfit. What we did next was try a simpler model by removing the quadratic terms from the initial response surface model.

We found that the LogNormal distribution with forward selection fits the data best because it has the lowest AICc and BIC. This time, the solution path looks more reasonable, as do the standard errors of our parameter estimates and the estimate of the scale parameter of the LogNormal distribution. The diagnostic plot looks more reasonable now.

This is the Prediction Profiler of the final model after we removed the quadratic terms. This Prediction Profiler makes a lot more sense. Recall that at the lower level of C and at the higher level of E, we have more censored data, as you can see here. That means at the lower level of C and the higher level of E, we have higher log mold reduction.

This shows on the Prediction Profiler, and because there is more censored data in this region, the confidence interval for the prediction is wider. We can also compare this with the Prediction Profiler of the final model when we ignore the Limit of Detection in the modeling.

If we ignore the Limit of Detection in the modeling, we get fewer significant model factors as well as biased results. The second model, which ignores the Limit of Detection and is incorrect, tries to use the quadratic term to predict at the lower level of C and the higher level of E, so that the predicted value stays close to the Limit of Detection, and we know that this result is biased.

Fangyi has nicely shown here that the incorrect analysis, ignoring the Limit of Detection, leads to some seriously biased results. And that getting the analysis right is easy if you set up the detection limits in either the custom designer or as a column property.

I'm going to go through one more example that has measurements at different times, which adds a little bit more complexity to the model set up, and in our case, required some table manipulation to get the data in the right format.

Here is the data table of the second DOE in basically the form that it originally came to us. In this data, we have 8 factors, A through H, and the data has measurements at 1 day, 2 days, and 7 days. Originally, our intent was to analyze the 3 days separately, but when we fit the day 7 data, the confidence intervals on the predictions were huge.

It was apparent that there was so much censoring that we were unable to fit the model, and so we were either going to have to come up with another strategy or back away from some of our modeling goals. What we ended up doing was we used a stack operation from under the tables menu so that the responses from different days would be combined together into a single column, and we added day as a column that we could use as a regression term.
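Outside JMP, the same reshaping can be sketched with pandas, where `DataFrame.melt` plays the role of the Stack operation. The two-run table and the column names "Day 1", "Day 2", and "Day 7" are made up for illustration; the real table has eight factors.

```python
import pandas as pd

# Hypothetical wide table: one row per run, one response column per day.
wide = pd.DataFrame({
    "Run": [1, 2],
    "A": [-1, 1],
    "Day 1": [1.2, 2.0],
    "Day 2": [2.5, 3.1],
    "Day 7": [5.0, 5.0],
})

# Stack the three response columns into one, keeping run and factor columns as identifiers.
stacked = wide.melt(
    id_vars=["Run", "A"],
    value_vars=["Day 1", "Day 2", "Day 7"],
    var_name="Day label",
    value_name="Log Reduction",
)

# Turn the label into a numeric Day column that can be used as a regression term.
stacked["Day"] = stacked["Day label"].str.extract(r"(\d+)", expand=False).astype(int)
stacked = stacked.drop(columns="Day label")
print(stacked)
```

Each run now contributes one row per day, so the less heavily censored early measurements can inform the same model as the day 7 measurements.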

In the histogram of log reduction, we see the characteristic bunching at the detection limit of five. Combining the data like this certainly seems to have reduced the impact of censoring on the design and hopefully allows us to make more effective use of all the data that we have.

As in the previous examples, we start off fitting a full RSM model, but in this case, because we have day as a term, we add day and interact all of the RSM terms with day in the Fit Model launch dialog prior to bringing up the Generalized Regression platform. Again, we're going to use the LogNormal distribution as our initial response distribution.

Because this is a large model, we can't use best subset selection, so we used pruned forward selection as our model selection method. We tried the LogNormal, Gamma, and Normal distributions, and clearly the LogNormal comes out as the best distribution because its AICc is 205.3, which is more than 10 less than that of the second best distribution, the Normal, whose AICc was 257.

Here, the model fit looks really reasonable with nothing suspicious. The solution path, standard errors, scale parameter, and the actual-by-predicted plots all look pretty good and realistic. There's a little bit of bunching down at the low end of the responses, but the thinking is that it wasn't due to a detection limit, just a part of the discreteness of the measurement system at lower levels of reduction.

Now, if we repeat this analysis, ignoring the detection limit, it guides us towards the normal distribution. Here we see the Profilers for the model that incorporated the detection limit on the top and the model that ignored the detection limit on the bottom.

As in the other examples, we see that the sizes of the effects are dramatically muted when we ignore the detection limit, and we get quite a different story: there's a strong relationship between log reduction and factor E when we take the detection limit into account properly, and that effect is seriously muted when we ignore the detection limit.

If we compare the actual-by-predicted plots for the two models, the model with the Limit of Detection taken into account properly is tighter around the 45-degree line for the uncensored observations. We see that the model ignoring the detection limit is just generally less accurate, as the observations are more spread out around the 45-degree line.

Those are our two case studies. In summary, I want to reiterate that detection limits are very common in chemical and biological studies. As we've seen in our case studies, ignoring detection limits introduces severe model biases. The most important message is that using the column property or setting up the detection limits in the Custom Designer makes analyzing detection-limited data much easier to get correct.

There are some pitfalls to watch out for in that if you see standard errors that are unrealistically small, or models are unrealistically accurate, you may need to back off from the quadratic terms or possibly even interaction terms.

We've shown how histograms can be used to identify when we have a detection limit situation. It's useful to see the censoring relationship between different factors, because if there are big corners of the factor space where all the observations are censored, then we may not be able to fit interactions in that region of the design space.

Again, if the model looks too good to be true, go ahead and try a simpler model, back off a bit. That's all we have for you today. I want to thank you for your attention.

frankderuyck · ‎04-24-2023

Why not using lasso or elastic net instead of AICC to reduce risk for over fitting?