Characterizing Bio-processes With Augmented Full Quadratic Models and FWB+AV (2020-US-45MP-548)
Philip Ramsey, Senior Data Scientist and Statistical Consultant/Professor, North Haven Group and University of New Hampshire
Tiffany D. Rau, Ph.D., Owner and Chief Consultant, Rau Consulting, LLC
Quality by Design (QbD) is a design and development strategy where one designs quality into the product from the beginning instead of attempting to test-in quality after the fact. QbD initiatives are primarily associated with bio-pharmaceuticals, but contain concepts that are universal and applicable to many industries. A key element of QbD for bio-process development is that processes must be fully characterized and optimized to ensure consistent high quality manufacturing and products for patients. Characterization is typically accomplished by using response surface type experimental designs combined with the full quadratic model (FQM) as a basis for building predictive models. Since its publication by Box and Wilson (1951), the FQM has been commonly used for process characterization and optimization. As a second order approximation to an unknown response surface, the FQM is adequate for optimization. Cornell and Montgomery (1996) showed that the FQM is generally inadequate for characterization of the entire design space, as QbD requires, given the inherent nonlinear behavior of biological systems. They proposed augmenting the FQM with higher order interaction terms to better approximate the full design regions. Unfortunately, the number of additional terms is large and often not estimable by traditional regression methods. We show that the fractionally weighted bootstrapping (FWB) method of Gotwalt and Ramsey (2017) allows the estimation of these fully augmented FQMs. Using two bio-process development case studies we demonstrate that the augmented FQM models substantially outperform the traditional FQM in characterizing the full design space. The use of augmented FQMs and FWB will be thoroughly demonstrated using JMP Pro 15.
Speaker | Transcript |
Tiffany | First, thanks for joining us today. We're going to be talking about characterizing bioprocesses, really focusing on pDNA, and seeing how fractionally weighted bootstrapping can really |
add to your processes. Phil will be joining me in the second half of the presentation to go through the JMP |
example, as well as to give some different techniques to be used. |
I'm going to be talking about the CMC strategy, biotech, how we get a drug to market, and how we can use new tools like FWB to deliver our processes. |
So let's get started. So the chemistry manufacturing control journey. So it's a very long journey and we'll be discussing that. | |
Why DOE, why predictive modeling? It is very complex to get a drug to market. It's not just about the experiments, but it's also about the clinic. | |
and having everything go together. So we'll look at systems thinking approaches as well. And then, of course, characterizing that bioprocessing and then the case study that Phil will | |
discuss. So what does the CMC pathway look like? This is a general example, and we go from toxicology, which is to see whether it has any efficacy, does it work |
in a nonhuman trial, all the way up to commercial manufacturing. And there's a lot that goes into this, for example, process development. |
But you also have to have a target product profile. What do you want the medication that you're developing to actually do? And it's important to understand that as you're going through. And then of course the | |
quality target product profile as well. This is...what are the aspects that are necessary in order for the molecule to work as prescribed? | |
And then we look at critical quality attributes and then go through the process. So it's really a grouping, |
because you have your process development, process characterization, process qualification, and the BLA. There's a huge amount of work that goes into each one of these steps. |
And we also want to make sure that as you're going through your different phases, you're actually building these data |
groupings. Because when you get to process characterization and process qualification, you want to be able to leverage as much of your past information as you can, so that you've actually shortened |
your timelines. You might say, "Tiffany, I do process characterization all the way through the process." And I'll say, "Absolutely." But the process characterization that is specific to the CMC pathway is what we need from a regulatory point of view. |
So everyone has probably heard in the news, you know, vaccines and cell and gene therapies. It's a very hot subject right now, and it's also | |
bringing new treatments to patients that we've never been able to treat before. | |
And so in the two big groupings of cell therapies and gene therapies, there's different aspects for it. Right. So we have immunotherapies. We have stem cells. | |
We're doing regenerative medicine. So just imagine, you know, having damage in your back and being able to regenerate. There's huge | |
emphasis on this grouping. But there's also a huge emphasis on gene therapies. Viral vectors. Do we use bacterial vectors? How do we get the DNA | |
into the system in order to be a treatment for the patients? Well, plasmid DNA is one of those aspects, and Phil has an amazing case study where they did an optimization. |
So you might say, well, "What is pDNA. And why is it important, other than, okay, it's part of the gene therapy space, which is very interesting right now?" | |
Well, the fact is that it can be used in preventative vaccines, immunization agents, preparation of hyperimmune globulin, |
cancer vaccines, therapeutic vaccines, doing gene replacements. Or maybe you have |
a child, right, that has a rare gene mutation. Can we go in and make those repairs? All these things are around this and as the gene therapy technology continues to grow, | |
the regulations continue to increase as you move through the pathway closer to commercialization and the amount of data also increases. | |
Just imagine, gene therapies and cell therapies are where we were 20-plus years ago with cell culture and monoclonal antibodies. |
It's an amazing world where we're learning new things every day and we go, "Oh, | |
this isn't a platform process." We need new equipment. We need new ways of working. We need to be able to analyze data sets that are very, very small, because in cell therapies and gene therapies, the number of patients is typically smaller than in other indications. |
So what's next for pDNA? Well, of course, as the cell therapy and gene therapy market continues to grow, we're going to continue going on this pathway |
into commercialization. We need to be able to work with the FDA, hand in hand, because these are things that we've never done before. We're using raw materials that we don't use in |
other indications for medication. So there's a lot to be done. It's also critical to be able to make these products, like the pDNA, so that we get the vector in the appropriate |
volume, but also with the right quality aspects. So if you have the best medication in the world but you're not able to make it, then you don't have the medication. Right? You can't deliver it to the patients. |
So we also need to make sure that our process is well characterized. And as I mentioned earlier, many of these indications are very small. So the clinical trials are also small. | |
And at the same time the patients are often very, very sick. So being able to analyze our data and also respond to their needs very quickly is very key. | |
Both in the clinical aspect as well as when we become commercialized. We don't want to have this situation where, guess what, I can't make the drug, right? I want to be able to make it. | |
Also, manufacturing is a very important thing. So I don't know if you've noticed in the news, there's been a lot of |
announcements of expansions. Of course, people are expanding capacity for vaccines, but also one of the big moves is pDNA. People are spending millions of dollars, sometimes billions of dollars | |
in increasing those manufacturing sites. And you might say, well, okay, you increase your manufacturing site. | |
That's great. But now I need to be able to tech transfer into that manufacturing site. I need to make sure my process is robust, and that it not only |
can be transferred and scaled up, but that I have the statistical power to say, |
"I know that my process is in control. I might have a 2% variability, but I always have a 2% variability, and I have it characterized," for instance. |
And as more and more capacity comes online, and as we also have shortages, it's like, where do I bring my product? Taking those into consideration means designing for manufacturing earlier. |
And you could have multiple products in your pipeline. So you want to make sure that you're learning and able to go and grab that information and say, let me do some predictive modeling on this, it might not be the exact product, but it has similar attributes. | |
So with that, the path to commercialization is very integrated, just like the CMC strategy takes the clinical aspect, | |
everything comes together in order to progress a molecule through. We also have | |
to think about the systems aspect of it. Why? Because if we do something in the upstream space, we might increase productivity by |
200%, let's say, which would have us going, "Yes, I've made my milestone. I can deliver to my patient." But if my downstream or my cell |
recovery can't actually recover the product, whether that is a cell or a protein therapeutic, for instance, then we don't have a product. All of that work is somewhat thrown out the door. So having the systems |
approach, making sure you involve all the different groups, from business, supply chain, QC, discovery...everyone has knowledge that they bring to the table in order to deliver to the patient in the end, which is very key. |
So I'm going to hand it over to Phil now. I would have loved to have spoken a lot more about how we developed drugs, but let's...let's see how we can analyze some of our data. So, Phil, I'll hand it over to you now. | |
Philip Ramsey | Okay, so thank you, Tiffany, for that discussion |
to set the stage for what is going to be a case study. I'm going to spend most of the time in JMP demonstrating some of the important tools that exist in JMP that you may not even know are there, |
that are actually very important to process development, especially in the context of the CMC pathway, chemistry manufacturing control, and quality by design. | |
And two important characteristics of process development, and this is in general, |
are that you want to design a process, but you also need to characterize it. In fact, you have to characterize the entire operating region. |
And of course, we want to optimize so that we have a highly desirable production. What we often don't talk about enough is these activities are inherently about prediction. | |
We have to build powerful predictive models that allow us to predict future performance. That's a very important part, especially in late-stage development, for regulatory agencies. You have to demonstrate |
that you can reliably produce a product. Well, a key paper on this issue of process characterization and prediction was a very famous paper by George Box and his |
collaborator Wilson, who was an engineer. |
And in it, they talked about what was the beginning of what people know today as response surface methodology. |
And the key to their work was something they called the full quadratic model. Well, what is that? Well, that's a model that contains the main effects, |
all two-way interactions, and quadratic effects. And this is still probably the gold standard for building process models, especially for production. |
But here's the thing: they're good for optimization; they're good second-order approximations to these unknown response functions. What is not as well understood is that, over the entire design region, they often are a poor approximation to the response surface. |
And in 1996 the late John Cornell and, of course many people know, Doug Montgomery published a paper that is really underappreciated. | |
And in that paper, they raised the fact that the full quadratic model is often inadequate to characterize a design space. Think about it from the viewpoint of a scientist and think how dynamic these biochemical |
processes often are. In other words, there's a great deal of nonlinearity that leads to response surfaces with pronounced compound curvature in different regions. | |
And the full quadratic model simply can't deal with it. So what they proposed was augmenting that model, and they added things like quadratic-by-linear, linear-by-quadratic, and even |
quadratic-by-quadratic interactions. It turns out these models do approximate design regions better than full quadratic models. I'm going to demonstrate that to you in a moment. |
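To make the model sizes concrete, here is a short Python sketch that enumerates the terms, assuming the five inputs discussed in this case study (pH, percent dissolved oxygen, induction temperature, feed rate, and induction OD600; the short names below are stand-ins). The counts line up with the 21-term full quadratic and 41-term augmented models fit later in the talk; including the quadratic-by-quadratic interactions as well would bring the total to 51.

```python
from itertools import combinations

def fqm_terms(factors):
    """Full quadratic model: intercept, main effects,
    all two-way interactions, and pure quadratic terms."""
    terms = ["1"] + list(factors)                                # intercept + main effects
    terms += [f"{a}*{b}" for a, b in combinations(factors, 2)]   # two-way interactions
    terms += [f"{x}^2" for x in factors]                         # pure quadratics
    return terms

def augmented_fqm_terms(factors):
    """Cornell-Montgomery style augmentation: add quadratic-by-linear
    and linear-by-quadratic interactions for every factor pair."""
    terms = fqm_terms(factors)
    for a, b in combinations(factors, 2):
        terms += [f"{a}^2*{b}", f"{a}*{b}^2"]
    return terms

factors = ["pH", "DO", "Temp", "Feed", "OD600"]  # stand-in names for the five inputs
print(len(fqm_terms(factors)))            # 21
print(len(augmented_fqm_terms(factors)))  # 41
```

Note that 41 terms against a 15-run definitive screening design is exactly the supersaturated situation described next.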
But there was a problem for them. Number one, traditional statisticians didn't like the approach; that's changing dramatically these days. | |
But there are a lot of these terms that can be added to a model such that even a big central composite design | |
becomes supersaturated. What does that mean? It means there are more unknown parameters p than there are observations n with which to fit the model. It turns out that's not really a constraint these days, in the era of machine learning and new techniques for predictive modeling. |
So what we're going to do is, we're going to use something called fractionally weighted bootstrapping. This can be done in JMP Pro. | |
And something called model averaging to build models to predict response surfaces, and I am actually going to use these large augmented models. | |
Okay, so when you try to build these predictive models, say for quality by design, there are a number of things you have to be aware of. One, | |
again in 1996, one of the pioneers in machine learning, the late Leo Breiman, wrote a paper that again is not nearly as appreciated as it needs to be. And he pointed out |
that all these model building algorithms we use for prediction (and that includes forward selection, all possible models, best subsets, lasso) | |
are inherently unstable. What does that mean? Being unstable means small perturbations in the data can result in wildly varying models. | |
So he did some work to point this out and he suggested a strategy, said, "Well, if you could, in some way, simulate | |
model fitting and somehow perturb the data on each simulation run, we could fit a large number of models and then average them." And he showed that that had potential. | |
He didn't have a lot of tools in that era to do it. But today in JMP Pro we have a lot of tools, and we're going to show you that Breiman's idea is actually a very good one. |
It is now, one way or another, commonly accepted in machine learning and deep learning: the idea of ensemble modeling and model averaging. |
By the way, I'll quickly point out that years ago, in the stepwise platform of JMP, John Sall implemented a form of model averaging. It's a hidden gem in JMP. It works nicely and is available in both versions of JMP, but I'm going to offer a more comprehensive solution |
that can be done in JMP Pro. And this solution is referred to as fractionally weighted bootstrapping with auto validation and I'm going to explain what that means. | |
When we build predictive models, we have a challenge. We need a training set to fit the model, then we need an additional | |
or validation set of data to test the model to see how well it's going to predict. Well, DOEs simply don't have these additional trials available. In fact, Breiman was stuck on this point. |
There's no way to really generate a validation error. Well, in 2017 at Discovery Frankfurt, Chris Gotwalt, head of statistical research for JMP, and myself | |
presented a talk on what we called fractionally weighted bootstrapping and auto validation. What does auto validation mean? It means, and this will not seem |
intuitive, we're going to use the training set also as a validation set. You say, "Well, that's crazy. It's the same data." | |
But there's a secret sauce to this technique that makes it work. What we do is, we take the original data, copy it, call it the auto validation set, | |
and then we in a special way, assign random weights to the observations and we do the weighting | |
such that we drive anticorrelation between the training set and the auto validation set. And I'm going to illustrate this to you very shortly. | |
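The exact weighting recipe isn't spelled out in the talk, but one published variant of the fractionally weighted bootstrap (the scheme used in the self-validated ensemble modeling literature) draws u_i ~ Uniform(0, 1) for each row and pairs a training weight of -ln(u_i) with a validation weight of -ln(1 - u_i), so a row that dominates the training fit carries little weight in validation. A minimal Python sketch of that idea:

```python
import math
import random

random.seed(1)

def fwb_weights(n):
    """One draw of paired fractional weights: a row with a large
    training weight gets a small validation weight, and vice versa."""
    u = [random.random() for _ in range(n)]
    w_train = [-math.log(ui) for ui in u]      # exponential(1) training weights
    w_valid = [-math.log(1 - ui) for ui in u]  # anticorrelated validation partner
    return w_train, w_valid

def corr(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

w_train, w_valid = fwb_weights(5000)
print(round(corr(w_train, w_valid), 2))  # strongly negative by construction
```

Redrawing the u_i on every iteration is what turns one designed data set into thousands of distinct weighted fits.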
Okay. And by the way, we have been supervising my PhD student Trent Lempkis, who has studied this method extensively in exhaustive simulations over the last year. | |
And we will be publishing a paper to show that this method actually yields superior results to classical approaches to building predictive models from DOE. | |
So I'm just going to move ahead here and talk about the case study. And this is what Tiffany mentioned | |
pDNA. It's a really hot topic, and pDNA manufacturing is considered a big growth area; the biotech world expects big growth, maybe even 40% |
year over year, because of all the new therapies coming online where it'll be used. |
And in this case, and this is very common in the biotech world, there's not really any existing data we can use to build predictive models. | |
So that leads us quite rightly to design of experiments. And in this case, we're going to use a definitive screening design. | |
These are wonderful inventions of Brad Jones from JMP and Chris Nachtsheim from the University of Minnesota. | |
Highly efficient and I highly recommend them all the time to people in the biotech world where you have limited time and resources for experimentation. | |
So basically I'm just showing you a schematic of what a bioprocess looks like. And we're going to focus on the fermentation step. | |
But in practice, as Tiffany was alluding to, we would look at both the upstream and downstream aspects of this process. |
The factors include pH, percent dissolved oxygen, and induction temperature. That's the temperature we set the fermenter at to get the |
genetically modified E. coli to start pumping out plasmids. And what are plasmids? Well, they're really non-chromosomal DNA. |
And they have a lot of uses in therapies, especially gene therapies, and they're separate from the usual chromosomal DNA that you would find in the bacteria. | |
So our goal is to get these modified E. coli to pump out as much pDNA as possible. So we did the trial. This is an actual experiment. And because we were new to DSDs, we also separately ran a much larger traditional central composite design. |
And what I plan to do is for today's work, we're going to use the CCD as a validation set and we're going to fit models using auto validation on the DSD. We'll see how it goes. Okay, so I'm going to now just switch over to JMP. | |
And I'm going to open a data table. Here's the DSD data. We're going to do all our modeling on this data set. And, oh, by the way, I am going to fit a 40-predictor model to a 15-run |
design using machine learning techniques. And many people...you're going to have to get your head around the fact that you can do these things; they're actually being done all the time in machine learning and deep learning. |
So there are a couple of add-ins I want to show you that make this easy to do. You do need JMP Pro. One of them is an add-in that sets up the table for you. This is by Michael Anderson of JMP. |
So I'm just going to show you what happens. The add-in is available on JMP Communities. |
So notice it took the original data, | |
created a validation set. And as I mentioned, | |
we also have this weighting scheme, and these weights are randomly generated. As you'll see in a moment, we do a simulation study and we constantly change the weights on every |
run. And this has the effect of generating thousands of iterations of modeling. And you'll also see, as Leo Breiman warned, as you |
perturb the responses (we don't change the data structure), you see wild variation in the models. So I'm going to go ahead and just illustrate this for you very quickly. |
So I'm going to go to fit model. | |
And we have to tell JMP where the weights are stored. We're going to use generalized regression, highly recommended for this. | |
And because this is a quick demo, I'm going to use forward selection, but this is a very general procedure: FWB with auto validation can be used in many, many different prediction or modeling scenarios. I'm going to do forward selection. Okay, so I fit one model. |
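Forward selection itself is just a greedy loop: starting from the intercept-only model, repeatedly add whichever candidate predictor most reduces the residual sum of squares. This is a generic Python sketch on synthetic data (the variable names and data are made up for illustration; it is not the generalized regression implementation):

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def sse_of(cols, y):
    """Least-squares SSE (via normal equations) for a model with intercept."""
    X = [[1.0] + [c[i] for c in cols] for i in range(len(y))]
    p = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(p)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(b * xi for b, xi in zip(beta, r)) for r, yi in zip(X, y)]
    return sum(e * e for e in resid)

def forward_select(predictors, y, n_steps):
    """Greedily add the predictor that most reduces the SSE."""
    chosen = []
    for _ in range(n_steps):
        best = min((name for name in predictors if name not in chosen),
                   key=lambda name: sse_of([predictors[n] for n in chosen + [name]], y))
        chosen.append(best)
    return chosen

random.seed(7)
n = 30
predictors = {f"x{j}": [random.gauss(0, 1) for _ in range(n)] for j in range(1, 5)}
y = [2 * predictors["x1"][i] - predictors["x3"][i] + random.gauss(0, 0.1)
     for i in range(n)]
print(forward_select(predictors, y, 2))  # recovers the two truly active inputs
```

With FWB, the same loop simply runs on weighted sums, with the weights redrawn each iteration.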
And then I come down to the | |
table of estimates. I'm going to right click and select simulate. | |
And I tell it that I want to do some number of simulation runs, and on each trial I want to swap out the weights. I want to generate new weights. And by the way, I'll just do 10 because this is a demo. | |
So there's the results. And you can see we have 10 models and all of them are quite different. | |
So again, in practice, I would do thousands of these iterations. And then I'm going to show you later, | |
we can then take these coefficients and average them together. And by the way, if you see a zero, that means that term did not get into that model. Okay, so what I'm going to do now is show you another add-in. |
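The averaging step itself is simple: each simulation iteration contributes a coefficient vector in which unselected terms are recorded as zero, and the final model averages term by term, which also shrinks rarely selected terms toward zero. With hypothetical coefficients from three iterations (these numbers are made up):

```python
# Hypothetical coefficient estimates from three bootstrap iterations;
# a zero means the term was not selected in that iteration's model.
fits = [
    {"Feed": 12.1, "DO": 3.4, "Feed*DO": 0.0},
    {"Feed": 10.8, "DO": 0.0, "Feed*DO": 1.9},
    {"Feed": 13.0, "DO": 2.9, "Feed*DO": 0.0},
]

def average_model(fits):
    """Term-by-term average of the bootstrap coefficient vectors."""
    return {t: sum(f[t] for f in fits) / len(fits) for t in fits[0]}

print(average_model(fits))
```

In practice, this average would run over thousands of iterations rather than three.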
So I'm going to close some of this, so we keep the screen uncluttered. | |
There's another add-in that we've developed at Predictum. This one not only does the fractionally weighted bootstrapping, but it also does the model averaging. In other words, |
with the add-in I just showed you, if you want to do model averaging, you're kind of on your own. Okay. It'll just be a lot of manual work. |
So I'm going to use the Predictum add-in. It creates the table, and then, just to find a model quickly to illustrate how the add-in works, I'll use a standard response surface model. |
We want to predict pDNA. And we're going to use gen reg. | |
So again, as an illustration, I'm just going to go ahead and use forward selection. |
And again, I do thousands of iterations in practice, but I'm only going to do 10. | |
Click Go. | |
Okay. | |
Philip Ramsey | Again, this is quick, and I do apologize, this is really three talks conflated into one, but all the pieces fit together in the QbD framework. So I have a model. These are averaged coefficients. Again, I've only done 10. I'll save the prediction formula to the data table. |
And I'm going to try to keep the screen as uncluttered as possible. | |
So there's my formula; the add-in did all the averaging for you, so you don't have to do it. |
And there's the formula. And a little trick you may not be aware of: this is a messy formula, especially if you want to deploy it to other data tables. In the Formula Editor, there's a really neat function called Simplify. |
See, it simplifies the equation and makes it much more deployable to other data sets. Okay, so this was an illustration of the method. |
And what I'm going to do now is show what happened | |
when we went through the entire procedure. So this is a data table, and here you'll notice the DSD and the CCD data combined together. |
And I've used the row state variable to exclude the DSD data, because I want to focus on the performance of my models on the validation data. Again, the models are fit to the DSD only. |
So here is my 41 term | |
model. This is the augmented full quadratic done with model averaging over thousands of iterations. And for comparison I repeated the same process | |
for the much smaller 21 term full quadratic model. So how did we do in terms of prediction? So let me show you a couple of actual by predicted plots. | |
So remember, and I must strongly emphasize, this is a true validation test. The CCD was done separately: different batches of raw material, including a new batch of the E. coli strain. |
Some of the fermenters were different, and there were completely different operators. So for those of you who work in biotech, you know this is about as tough a prediction job as you're going to get. |
So again, the model was fit to the DSD, and on the left is the 41 term model, the augmented model, and the overall standard deviation of prediction error is about 67. | |
On the right, again, I did use model averaging which helps improve performance, I fit just the 21 term full quadratic model and you can see the prediction error is about 70. | |
In fact, without using model averaging, which many people don't use when fitting the full quadratic model, performance would be significantly worse. Okay. |
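For reference, the prediction-error figure quoted here is just the sample standard deviation of actual minus predicted on the held-out CCD runs. A minimal helper, shown with made-up numbers:

```python
import math

def prediction_error_sd(actual, predicted):
    """Sample standard deviation of prediction errors on a validation set."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_e = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean_e) ** 2 for e in errors) / (len(errors) - 1))

# Hypothetical validation responses and predictions, just to show the call:
actual    = [510.0, 620.0, 480.0, 555.0]
predicted = [500.0, 640.0, 470.0, 560.0]
print(round(prediction_error_sd(actual, predicted), 1))  # 14.4
```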
So then I have the model. What do I do with it? Well, our goal is typically optimization and characterization. So let me open up a profiler. I'll actually do this for you. | |
So I'm going to go to the profiler in the Graph menu. I'm going to use my best model, and that's the one using the Predictum add-in. And by the way, if you're interested in this add-in, and even |
beta testing it, just contact Predictum; send an email to Wayne@predictum.com and I'm sure he'd be more than happy to talk to you. So I'm going to... |
go to the model and then, using desirability, |
I'm just going to find settings that maximize production. And by the way, | |
this is a major improvement over the production they were historically getting, and it gives us settings at which we should see |
the improved performance. And these were, by the way, somewhat unintuitive, but that's usually the case in complex systems. |
Things are never quite as intuitive as you think they are. And then also something really important, especially if you're doing late stage development in the CMC pathway. | |
And that is, they want you to assess the importance of the inputs: which inputs are important. |
So I'll assess variable importance. Again, I won't get into all the technical details. |
So it goes through and it shows you, in terms of variation in the response, feed rate is by far the most important. That was not necessarily intuitive to people. And second is percent dissolved oxygen. | |
So what does that tell you? Well, it tells you, number one, you'd better control these variables very well, or you're likely to have |
a lot of variation. Now, in this particular case, I don't have critical to quality attributes. There were none available. | |
So what we have is a critical-to-business attribute, and that is pDNA production. But there's more we can do in JMP to fully characterize the design space. All I did was an optimization, but that's not characterization. So there's another wonderful tool in the profiler. Okay. |
It's called the simulator. And this is just not used as much as it should be. So what I've done is I've defined distributions for the inputs. |
That is, I expect the inputs to vary. This is something the FDA wants to know about: what happens to the performance of your process as the inputs vary? There are no perfectly controlled processes, especially once you scale up. |
By the way, while I think of it, these more complex models, these augmented full quadratic models, | |
from experience, I can tell you they scale up better than full quadratic models. That's another reason to fit these more complex models. | |
So in the simulator, there's a nice tool called simulation experiment. What it does is what we call a space-filling design: it distributes the points over the whole design region. So I'm going to just say I want to do 256 runs, |
and it's going to do 5,000 simulations at each point, calculating a mean, a standard deviation, and an overall defect rate. Right. |
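Conceptually, each point of the simulation experiment is a small Monte Carlo study: draw the inputs from their assumed distributions around a set point, push them through the fitted model, and record the mean, standard deviation, and defect rate of the simulated response. A Python sketch with a made-up transfer function, set points, and spec limit (none of these numbers come from the case study):

```python
import random

random.seed(3)

def predict(feed, do):
    """Hypothetical stand-in for the fitted pDNA response model."""
    return 300 + 40 * feed + 15 * do - 6 * feed * do

def simulate_point(feed_set, do_set, n_sim=5000, lower_spec=250):
    """Monte Carlo at one design point: the inputs vary around their
    set points; summarize mean, sd, and defect rate of the response."""
    ys = []
    for _ in range(n_sim):
        y = predict(random.gauss(feed_set, 0.1), random.gauss(do_set, 2.0))
        ys.append(y)
    mean = sum(ys) / n_sim
    sd = (sum((y - mean) ** 2 for y in ys) / (n_sim - 1)) ** 0.5
    defect = sum(y < lower_spec for y in ys) / n_sim
    return mean, sd, defect

mean, sd, defect = simulate_point(feed_set=1.0, do_set=30.0)
print(mean, sd, defect)
```

Repeating this at each of the 256 space-filling points produces the mean, sd, and defect-rate columns that the Gaussian process models are then fit to.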
So this actually goes pretty quickly. And I'm just showing you what the output looks like. And again, I've already done this. | |
So, in the interest of time, I'm just going to open another data table. | |
Minimize the other one. | |
So this is the results of the simulation study. | |
And I won't get into all the details, but I fit a model to the mean, I fit a model to the standard deviation, and I fit a model to the overall defect rate. |
And the defect rates in some areas are low and in some are relatively high. These are what we call Gaussian process models, which are commonly used with simulated data. So what can we do with these |
models and with these simulation results? Well, again, characterization is important. So let me just give you a quick idea. | |
Here's a three-dimensional scatterplot. We're looking at feed rate and percent DO because they're really important. And the plotted points are weighted by defect rate; bigger |
spheres mean higher defect rates. So if you look around this, you can see there are some regions where we definitely do not want to operate. |
So we are characterizing our design space and finding safer regions to operate in. And of course, I could do this for |
some other variables and, in any case, it just shows other regions you really want to avoid. And we can do more with this, but I think that makes the point. We can also |
go ahead and again use the profiler, and I'm going to re-optimize. |
But I'm going to do it in a different way. This way I want to maximize mean pDNA. | |
And I want to do a dual response. And I want to minimize overall defect rate. | |
So again, I'm going to go ahead and use desirability. | |
This takes a few minutes. These are very complex models that we're optimizing. | |
And notice, it comes up and says | |
high feed rate, high DO, | |
close to neutral pH, and the induction. By the way, if you want to know what induction OD600 is, that's a measure of microbial mass, |
and once you reach a certain mass (no one's quite sure what that is, so that's why we do the experiment), you then ramp up the temperature of the fermenter. And this actually |
forces the E. coli to start pumping out pDNA, or plasmids, and they're engineered to do this. So we call that the induction temperature. Okay. Well, notice at these settings, |
we are guaranteed a low defect rate; the overall optimized response isn't as high, but remember, we're also going to have a process less prone to generating defects. |
Okay, so at this point, | |
I'll just quickly go to the end slide. So everything is in these slides. They've all been uploaded to |
JMP Communities. And at the end of this is an executive summary, and basically what we're showing you is that process and product development |
using the CMC pathway (and a part of that is quality by design) requires a holistic or integrated approach. A lot of systems thinking needs to go into it. | |
Process design and development is inherently a prediction problem, and that is the domain of machine learning and deep learning. It is not what you might think; it's not business as usual for building |
models in statistics, especially for prediction. We've shown you that fractionally weighted bootstrapping with auto validation and model averaging |
can generate very effective and accurate predictive models. And I also, again, I want to emphasize these more complex augmented models of Cornell and Montgomery are actually quite important. | |
They really do scale better, and they do give you better characterization. And with that, I thank you, and I will end my presentation. |