
Using Microbial Metabolic Profiles to Improve Scale-Down Model Qualification and Process Characterization Studies

In biopharmaceutical process development, the characterization of critical process parameters (CPPs) and controllable process parameters is crucial. As in many industries, process development and characterization studies start at the laboratory scale, and the process is subsequently scaled up to the final production facility. JMP plays a central role as a tool to support these studies – from DOE to process modeling.

Online sensor data is a potential source of information that is currently underutilized in process development and characterization studies. In particular, the comparison of microbial metabolic profiles between production scales is mostly performed exploratively. Statistical analysis of these data is an option to understand differences and can help enable better process control strategies.

In this talk, we explore:

  • How Functional Data Explorer in JMP Pro can give a better understanding of scale differences.
  • The current usage of JMP for process characterization studies at Lonza.
  • Utilization of Functional Data Explorer in JMP Pro to support process understanding and optimization.

 

Hello, and thanks for joining my talk. I am Anne-Catherine Portmann, a Project Leader in the Microbial Process Development Department at Lonza. Today, I am here to talk about using microbial metabolic profiles to improve scale-down model qualification and process characterization studies.

I want to briefly start with this disclaimer. I will not read out the whole text, but I want to mention that I believe all the information I share with you today is correct. For confidentiality reasons, I specifically want to mention that all the data is normalized and anonymized.

Here is the agenda for today's talk. We'll start with an introduction to Lonza, and I will give you some background information on bioprocesses and process characterization studies. This will help you understand the three examples that follow, where I use JMP. In the first example, I will show you how I applied JMP Functional Data Explorer to compare data from batches at different scales. In the second example, I will use JMP in a very standard way to determine the proven acceptable range of a parameter. The third example will be on the utilization of Functional Data Explorer to minimize product-related impurities. Finally, I will conclude this talk and answer your questions.

I'd like to start by giving you a quick introduction to the company I work for. Lonza is a multinational manufacturing company for the pharmaceutical, biotechnology, and nutrition sectors. On the right, you have a picture of the site in Visp, Switzerland, where I am working.

Let's have a look at some numbers about Lonza. You most probably already know that Lonza is a global company with about 18,000 employees and a long history of more than 100 years. We have more than 35 sites worldwide, supporting our customers in manufacturing innovative medicines. Visp is Lonza's biggest site.

In Visp, the microbial production capabilities range from 70 liters to 15,000 liters. When a new process arrives at Lonza, it is first transferred from the customer to the process development department. The process development building where I work is really central to the site, with all the manufacturing production around us. That helps a lot in having good collaboration between process development and manufacturing. In the process development department, we test and adapt this new process to derisk the scale-up to the target manufacturing scale.

I know that many of you come from different kinds of industries and are not familiar with microbial bioprocesses. Therefore, in the next couple of slides, I will describe a typical microbial bioprocess and the typical steps to characterize it.

A bioprocess is composed of an upstream part and a downstream part. During the upstream part, the protein of interest, here in orange, is produced by microorganisms in the fermentation. Then the cells are broken open, and the cell debris is removed during the separation. During the downstream part of the process, we remove other proteins and purify the protein of interest until we have the final product, which is very pure.

The fermentation is an important step, as it produces the protein of interest. The amount of product is defined by this step, as well as the product quality. Any mistake in the protein chain cannot be corrected later in the process. Therefore, it is very important to correctly regulate the input parameters, such as temperature, pH, dissolved oxygen, or other kinds of parameters, and to closely monitor the output attributes, which are, for example, the concentration of the product (the titer), the purity of the product, or the biomass, which corresponds to the amount of cells in the bioreactor during the fermentation.

During a PC study, a process characterization study, we want to understand the impact of the input parameters on the output attributes and define the parameter ranges within which the attributes are within specification. Understanding the dependency between input parameters and output attributes is key to comprehending bioprocesses.

What is a process characterization, the PC for short? The process characterization is part of the process validation, and it contains four parts. In the first part, we have the risk assessment, where we select the parameters to investigate further. We select these parameters based on process knowledge, on historical data, and on the experience and expertise of the people involved in the risk assessment.

The second part is the qualification of the scale-down model. We need a scale-down model because we cannot perform all the experiments to investigate these parameters at manufacturing scale. It would be much too expensive, and the efficiency of the experiments would not be very high. At the laboratory scale, we have the possibility to run multiple reactors in parallel, and that helps in having efficient experiments in the lab. However, the instruments at lab scale have to perform the same way and give the same results as we obtain at the manufacturing scale. If they do not, we have to evaluate the differences between these two scales, look at these scale-dependent differences in more detail, and be able to explain them.

The next step in a PC study is preparing the design of experiments using JMP to optimize the number of runs; the experiments are then performed in the lab. From these experiments, we collect all the output data, generate a data table in JMP, and analyze the data to evaluate the interactions between the parameters and the attributes and to define the parameter ranges and impacts.

Why should a process be characterized? A process should be characterized to ensure that we deliver a constant product quality and a reproducible yield during manufacturing. When you take a medicine, you always want to have the same effect. If you take a painkiller, you always want the pain to be gone after you take it. You don't want the pain to disappear once, then two days later the same medicine has almost no effect, and the day after, the effect is too strong. That is exactly the kind of example where you really want your medicine to have constantly the same quality and the same efficacy. To achieve that, we do a process characterization and a process validation.

Now, I will show you some examples. I will start with the qualification of a scale-down model. As I told you, the scale-down model qualification is a key step during the PC study to ensure that the experiments are performed in a representative instrument. The fermenter, or fermentation reactor, has many sensors connected to it, which record a lot of data continuously during the experiment. This is a large source of information that is really valuable for comparing differences between scales. I will show you how we explore these scale differences using this kind of data.

As I have said many times now, it is really cheaper to use an instrument in the laboratory. We can use a high-throughput device at laboratory scale, and that allows us to run a lot of experiments more cheaply than doing all these experiments at manufacturing scale. By qualifying the scale-down model, we are also able to determine the scale-dependent differences.

Until now, we have really been using the offline data. This is how we do it: we have the data at lab scale and the data at manufacturing scale in a one-way analysis, and we use an ANOVA or a mean comparison to determine whether these groups of batches at lab scale and manufacturing scale are comparable or not.
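To make that step concrete, here is a minimal sketch in Python of such a one-way comparison for an offline attribute. It illustrates the statistics behind the comparison, not JMP's Oneway platform itself, and all batch values are made-up placeholders.

```python
# Illustrative one-way comparison of an offline attribute between scales,
# analogous to an ANOVA in JMP's Fit Y by X platform.
from scipy import stats

lab_scale = [0.92, 0.95, 0.91, 0.97, 0.94]        # hypothetical normalized values
manufacturing_scale = [0.93, 0.96, 0.95, 0.92]    # hypothetical normalized values

f_stat, p_value = stats.f_oneway(lab_scale, manufacturing_scale)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
# A small p-value would flag a scale-dependent difference to investigate;
# a large one gives no evidence of a difference for this attribute.
```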

The thing is, when we have online data, we cannot use this kind of graph to make a comparison. The data look like the right part of the slide. Some people just compare these curves by expert eye and say, okay, these are comparable, or these data are completely different. The problem with this kind of approach is that the person comparing the curves could leave the company, or one colleague looks at the data and says, "Oh, no, I don't see a difference," while you do see one. It is not really a good statistical way to compare data.

Thanks to JMP Functional Data Explorer, we are able to statistically compare these data in an appropriate manner and even to determine whether we get some clusters. Qualifying the scale-down model is key to translating the ranges from the PC study to the manufacturing control strategy. In case we have a difference, we are able to translate this difference back. I will show you very quickly how JMP Functional Data Explorer works and how we use it in the example that comes later.

Functional Data Explorer is a very good tool for analyzing continuous data. I will not explain in great detail how it works; a lot of talks have already been given, and are being given, about exactly how it works. I will really focus on the main parts that are useful for my analysis. The first step, once we have our data, is to fit a model. This model has to be chosen among the B-spline, the P-spline, the Fourier, or the wavelet bases. Then we check which of the models fits our data best.
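As an illustration of this first smoothing step, here is a small Python sketch, assuming a single batch's online trace sampled over 48 hours; the signal shape and knot positions are invented for the example, and the real work in JMP happens inside the FDE platform.

```python
# Smoothing one batch's online sensor trace with a least-squares cubic B-spline.
import numpy as np
from scipy.interpolate import make_lsq_spline

t = np.linspace(0, 48, 200)                        # fermentation time, h
y = np.tanh(t / 10) + 0.05 * np.random.randn(200)  # noisy hypothetical signal

k = 3                                              # cubic B-spline
interior = np.linspace(4, 44, 8)                   # interior knot positions
knots = np.r_[[t[0]] * (k + 1), interior, [t[-1]] * (k + 1)]

spline = make_lsq_spline(t, y, knots, k=k)
smooth = spline(t)                                 # the batch's fitted functional form
```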

As a second step, we have a functional principal component analysis. If you have already worked with principal component analysis, it is the same approach, but with functional data instead of data points. The idea is to transform the space to obtain uncorrelated functions, the shape functions that we have here. These shape functions explain the variability of the batches' functions around the mean function. For example, here, the first shape function explains 56% of the variation in the attribute trend between the batches.
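For readers who want to see the mechanics, here is a hedged Python sketch of a basic functional PCA, assuming the curves have already been smoothed and resampled on a common time grid; the batch matrix is random placeholder data.

```python
# Functional PCA as PCA of a batch-by-time matrix: each right singular
# vector is a "shape function", and its eigenvalue share is the percentage
# of batch-to-batch variation it explains, as in the FDE report.
import numpy as np

curves = np.random.rand(20, 200)         # 20 hypothetical batches x 200 time points
mean_curve = curves.mean(axis=0)
centered = curves - mean_curve           # variation around the mean function

U, s, Vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)          # fraction of variability per component
shape_functions = Vt                     # rows: FPC shape functions over time
scores = centered @ Vt.T                 # FPC scores per batch (score plot axes)

print(f"FPC1 explains {explained[0]:.0%} of the batch-to-batch variation")
```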

Another graph that we obtain is the score plot. In the score plot, we can choose which components we want to compare. In this case, we have the choice of five components, five shape-function components, that we can plot on the X- and Y-axes. Here we can look at whether the data group along the X-axis and the Y-axis, so along the first or the second component in this case.

By looking at them, we can already see that along component 1 we very probably have two groups, one on the left part of the X-axis and one on the right. When we want to confirm this, we can also use the control charts. We generate one control chart per functional principal component and then add the different scales at the top. We can look at the means generated for each scale, evaluate the scale differences, and check whether the means are really the same.
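Continuing the hypothetical FPCA sketch above, the control-chart check can be pictured like this; the scale labels and three-sigma limits are assumptions for illustration, not JMP's exact control chart rules.

```python
# Compare mean FPC-1 scores per scale and flag points outside 3-sigma limits.
import numpy as np

scale = np.array(["lab"] * 10 + ["manufacturing"] * 10)  # hypothetical labels
fpc1 = scores[:, 0]                                      # from the FPCA sketch

for grp in ("lab", "manufacturing"):
    print(grp, "mean FPC1 =", round(fpc1[scale == grp].mean(), 3))

center, sigma = fpc1.mean(), fpc1.std(ddof=1)
out_of_control = np.abs(fpc1 - center) > 3 * sigma
# Clearly different group means, or one scale sitting outside the limits,
# point to a scale-dependent difference for this attribute.
```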

Functional Data Explorer allows us to cluster comparable batches and to compare scale means. It is exactly what we were looking for to compare these attribute data.

Let's have a look at an example. The data are real, but for confidentiality reasons, we anonymized and normalized the data in this presentation. In this example, we would like to compare the two lab scales, lab scale 1 and lab scale 2, shown in purple and in blue, with the manufacturing scale for the time course of a specific attribute.

We would like to know which of these two scales would be the better scale-down model of this manufacturing scale. We run Functional Data Explorer and get some results. If we look at the eigenvalues of the functional principal components, we see that the first one already explains more than 78% of the variability between batches. The second one explains more than 9%, so we concentrate the rest of the analysis on these two components.

We look at the score plot. Along the X-axis, for functional principal component 1, the one explaining most of the variability between batches, we directly see two clusters: one on the left containing only lab scale 1 data, and one on the right containing the lab scale 2 and manufacturing scale data, the blue and green dots. We have some outliers. Along component 2, we cannot really define two groups.

To confirm these groups, we look at the control charts. Here we really see that the green lines, corresponding to the mean values across batches, are very similar for the blue and green dots, so for lab scale 2 and the manufacturing scale. Lab scale 1 has a really different mean, so this lab scale was not optimal for us. Lab scale 2 is identified by Functional Data Explorer as the more representative scale-down model for this attribute. It is important to say that this holds for this attribute; for another attribute, it could be another scale-down model. We have to check this for all the attributes that we decided to explore in our analysis.

Now that we have defined the scale-down model, let's look at the next part of the analysis in the PC study. In this step, we design the experiments, perform them, look at the interactions between the parameters and the attributes, and analyze the data to determine the parameter ranges and impacts. For these two steps, we are not using Functional Data Explorer, but JMP with a very standard DOE approach.

Why do we use a DOE? A DOE allows us to relate parameters and attributes and to see the correlations between them. Indeed, in a design of experiments, we can identify very different effects: the main effects, the quadratic effects, and the interaction effects. We are also able to optimize the attributes. For example, we can maximize a titer, minimize some impurities, or maximize some purities, some quality attributes. We can aim for a certain target. We have many options that can really help us. In the report at the end, we also have the effects of the parameters on the attributes. Thanks to the p-values, we can determine whether they are significant or not.

To design the right model, we can use the JMP DOE menu and choose the design that we want. If we have doubts about how to do it, we always have the Easy DOE option, which is super convenient and helps us design our model step by step. In our case, we use a classical response surface with a central composite design. Here we have a center point that corresponds to our set point. Let's say we have 35 degrees as a set point.
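JMP's DOE menu builds this design for you; purely to show the geometry, here is a hedged sketch that constructs a two-factor central composite design by hand, with an invented temperature mapping.

```python
# A two-factor CCD: 4 factorial points, 4 axial points, replicated center points.
import itertools
import numpy as np

alpha = np.sqrt(2)                                  # rotatable axial distance, 2 factors
factorial = np.array(list(itertools.product([-1, 1], repeat=2)))
axial = np.array([[alpha, 0], [-alpha, 0],
                  [0, alpha], [0, -alpha]])
center = np.zeros((3, 2))                           # replicated set-point runs

design_coded = np.vstack([factorial, axial, center])

# Map coded units to real settings, e.g. a 35 degC set point with a +/- 2 degC range:
set_point, half_range = 35.0, 2.0
temperature = set_point + half_range * design_coded[:, 0]
```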

Then we have what we call the operating range. It is the range that includes the accuracy of the probe as well as the small variations that can occur in a fermentation. For example, the 35 degrees will not be a perfectly straight line; there will be very small oscillations due to the accuracy of the probe, the nutrients that we are adding, and how the fermenter maintains the temperature. We know that within this operating range, we will be within the specification.

Now we want to know what happens if we enlarge the range: in case something happens during the process that causes, for a short time or maybe for the whole process, a small deviation, a higher or a lower temperature compared to the set point, are we still within the specification for our product quality? JMP DOE helps us to optimize the number of experiments performed in the lab, but also to understand the interactions between the parameters and the attributes.

Now we perform the experiments in the lab, collect all the data in a JMP data table, and want to build a model. When we have all the data in the table, we follow the tidy data principle: one row per experiment, one column for each parameter and each attribute. A very standard way to use JMP. Then we can fit the model with different options. For example, we can put the different parameters into a response surface model, use the Y role to add the attributes that we want to explore, and choose different personalities, for example, the standard least squares approach or the stepwise approach, and so on.
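As a sketch of what such a fit does under the hood (not JMP's implementation), here is a full quadratic response surface model for one attribute against two coded parameters; the tidy table and all run values are invented.

```python
# Response surface fit: main effects, interaction, and quadratic effects.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "temp":  [-1, -1, 1, 1, -1.41, 1.41, 0, 0, 0, 0, 0],  # coded CCD runs
    "ph":    [-1, 1, -1, 1, 0, 0, -1.41, 1.41, 0, 0, 0],
    "titer": [0.81, 0.84, 0.90, 0.95, 0.78, 0.93,
              0.85, 0.88, 0.91, 0.92, 0.90],              # hypothetical attribute
})

model = smf.ols("titer ~ temp + ph + temp:ph + I(temp**2) + I(ph**2)", data=df)
result = model.fit()
print(result.summary())   # p-values show which effects are significant
```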

After running this model, we get a report. In the report, we have the tendency to scroll straight down to the prediction profiler to see the interactions between the parameters and the attributes. But to do that, and to ensure that we have the right model for it, it is very important to concentrate first on the evaluation of the model. That is the first step, and it is why JMP gives us a lot of output in this report. We really have to understand the results of this evaluation. We have, for example, the lack of fit, the studentized residuals, the summary of fit, and so on. I really advise you to take a deeper look at these outputs and really understand what they mean, so you get the best model for your data. Properly analyzing and verifying the quality of the fitted model ensures a reliable PC study and drug product.
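Continuing the hypothetical fit above, the diagnostics step could be sketched like this; a formal lack-of-fit test additionally needs replicated runs, which the center points provide, so only the simpler checks are shown.

```python
# Model diagnostics: overall fit and studentized residuals.
from statsmodels.stats.outliers_influence import OLSInfluence

influence = OLSInfluence(result)
stud_resid = influence.resid_studentized_internal

print("R-squared:", round(result.rsquared, 3))
print("Studentized residuals beyond +/- 3:", int((abs(stud_resid) > 3).sum()))
```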

The relationships between attributes and parameters are shown in the prediction profiler, as I told you before. We have the set point in the middle; normally, it sits at the top of the prediction curve. We have the operating range, which is where the data stay within the attribute range. Then we have a larger range at the intersections of the prediction profiler curve with the edges of the attribute range; those are the extreme points where we are still within the specification. That is the proven acceptable range that we want to find with a PC study. The PC study also gives us the impact, which depends on the shape of the prediction profiler curves: the flatter the curve, the lower the impact; the narrower the curve around the set point, the higher the impact. The attribute specification is only met within a parameter's proven acceptable range. Now, I will show you an example where we use that.
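To illustrate how a proven acceptable range can be read off a fitted curve, here is a small sketch for a one-parameter quadratic prediction; the coefficients and specification limit are made up.

```python
# Find where the predicted attribute crosses the lower specification limit.
import numpy as np

b0, b1, b2 = 0.92, 0.03, -0.04   # hypothetical intercept, linear, quadratic terms
lower_spec = 0.85                # hypothetical attribute specification

# Solve b2*x^2 + b1*x + (b0 - lower_spec) = 0 for the coded parameter x:
roots = np.roots([b2, b1, b0 - lower_spec])
par_low, par_high = sorted(roots.real)
print(f"Proven acceptable range (coded units): [{par_low:.2f}, {par_high:.2f}]")
```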

In this example with PC study data from a fermentation, we created a response surface model with a stepwise approach and the all-possible-models option. We met quite a lot of the criteria in our model evaluation; we were just not passing the lack of fit. We had a lack of fit, meaning the model was not fitting well, and we were wondering why. We discussed internally with a lot of experts, and we looked really deeply into our data. What we found was that we probably had a plateau effect that could not be seen by a response surface model, because it is a second-degree model. We had to add a third-degree polynomial term for the parameter. That is what we did, and we reran the analysis.
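In the hypothetical formula from the earlier sketch, adding such a third-degree term would look like this; whether it is justified should again be judged from the model evaluation.

```python
# Refit with a cubic term for the parameter suspected of a plateau effect.
cubic = smf.ols(
    "titer ~ temp + ph + temp:ph + I(temp**2) + I(ph**2) + I(temp**3)",
    data=df,
).fit()
```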

This time, we met all our criteria. Here you have the first model, and there the second model. You can see that the parameter that changed was the first parameter, and its curve in the prediction profiler is a bit different. Indeed, we had a plateau effect here at the end.

At this point, we had found a model that was correct and within all our specifications. We were able to define the proven acceptable ranges and the impacts. The impact, as you see, was higher for the two first parameters than for the third parameter. These ranges are now applied in our production, and some runs are performed at production level with them.

At the end of fermentation, some product-related impurities can be detected in the analytical method's chromatogram as a post-peak shoulder. These are really complicated to remove in the downstream process. I will show you how, by using Functional Data Explorer, we were able to minimize these product-related impurities.

As you see here, we have the chromatogram. What we expected is that the curve, where it goes up here, would go down with the same shape on the right side. But we have this bumpy side going down. That means we have impurities in this part of the product, so product-related impurities, which are hard to remove in the downstream process. This issue could become even bigger: if we don't remove them, or don't find a way to remove them, it could be that at the end of the process we are not within the quality specification of the product.

So we have a curve with some impurities in it. We know that with JMP Functional Data Explorer, we can better understand the curve shape. As you saw before, we had a DOE where we tested different parameters, and we want to understand how we can use this parameter information to explain the impurities and maybe reduce them.

The problem for us was that Functional Data Explorer gives us functional principal components, which are not real parameters that we can change in the lab; they are not concrete, real parameters like a temperature that we can change by a few degrees. But JMP Functional Data Explorer has an option that is super useful and converts the functional principal components into real parameters: the functional DOE. When building your model in Functional Data Explorer, you can add the DOE analysis that you performed at the start. By doing that, JMP is able to translate the FPCs back into real parameters in the profiler. So we were able to convert them back, and our hope was that with this approach, we would be able to remove or decrease the impurities.
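The idea of translating FPCs back to parameters can be sketched as follows, reusing the shapes from the hypothetical FPCA sketch above and now imagining the 20 batches as DOE runs; a simple linear model per FPC score stands in for whatever JMP fits internally.

```python
# Regress FPC scores on DOE settings, then rebuild curves for new settings.
import numpy as np

X = np.random.rand(20, 3)                   # 20 runs x 3 hypothetical parameters
X1 = np.column_stack([np.ones(len(X)), X])  # add an intercept column

# One least-squares model per retained FPC score:
coef, *_ = np.linalg.lstsq(X1, scores[:, :2], rcond=None)

def predict_curve(settings):
    """Predicted attribute curve for a new parameter setting."""
    fpc_scores = np.append(1.0, settings) @ coef
    return mean_curve + fpc_scores @ shape_functions[:2]

new_curve = predict_curve([0.3, 0.7, 0.6])  # a candidate set point
```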

Here is the example of how we concretely did it. We have this shoulder; here, I zoomed in on this part of the peak to have a better look. We have this post-peak shoulder of 0.11, which was high. That is the amount when all the parameters are at their set points. Then we optimized it by moving each parameter in the functional DOE profiler. We found that by decreasing the first parameter and increasing the two next parameters, we reduced the amount of these impurities by half.
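What we did interactively in the profiler can also be pictured as an optimization; continuing the hypothetical `predict_curve` above, with an invented post-peak region and parameter bounds:

```python
# Minimize a predicted impurity measure over the allowed parameter ranges.
from scipy.optimize import minimize

def shoulder(settings):
    curve = predict_curve(settings)
    return curve[150:].max()      # hypothetical post-peak region of the chromatogram

res = minimize(shoulder, x0=[0.5, 0.5, 0.5],
               bounds=[(0.0, 1.0)] * 3, method="L-BFGS-B")
print("Suggested settings:", res.x, "predicted shoulder:", res.fun)
```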

That is really a great result for us, because it is so difficult to remove these impurities in the downstream processing (DSP) part that if we can do it early in the process by optimizing some parameters, it saves us a lot of time and money. That is really a beneficial solution for us and for our customers. That said, this great result still has to be tested in the lab; it is not done yet. We also have to evaluate whether it would not create other issues, such as reducing other attributes that we don't want to reduce, or increasing other aspects.

During this presentation, you have probably realized the importance of statistics in supporting a process characterization. Today, I presented how we can improve process characterization and process transfer from small scale to manufacturing scale thanks to JMP.

As take-home messages, you can remember these four aspects of my presentation. First, a process characterization ensures a constant product quality and a reproducible yield during at-scale production. Second, with JMP Functional Data Explorer, we are able to cluster continuous data to detect scale differences. Third, JMP offers options to identify the best-fitting model, even for complex data such as PC study data. Finally, JMP functional DOE translates the FPCs into operational parameters, which supports process understanding and optimization.

With that, I would like to warmly thank all my colleagues who helped me prepare this presentation. A very special thank you to Sven, who had the idea for this presentation topic, but also to Claire and Ludovic, who shared their statistical knowledge; to Jonas for his support in preparing this presentation; to Romain for sharing some data with me; and to all my colleagues for their input and thoughts. Finally, a big, big thank you to Florian, who has supported our department with every JMP question, from the most complicated to the most stupid, over the last years.

Thanks also to you for your attention. I hope you found this informative, and I am ready to answer any questions you have. And if you are in Manchester, we can discuss it further there. Thank you.