Hi, there. I'm Paul Taylor.
I'm David Hilton.
Yeah, we'll be talking to you about predicting molecule stability
in biopharmaceutical products using JMP.
We'll dive straight into it.
Oh, yes.
One thing to mention is just a little shout-out
to this paper that's been published by Don Clancy of GSK,
which most of this work has been based on.
Just to introduce what we do.
We're part of biopharm process research,
and we're based in Stevenage in the UK at our R&D headquarters.
We're the bridge between
discovery and the CMC phase, so chemistry, manufacturing, and controls.
That would be the gateway
into the clinical trials and also the actual release of the medicine.
We have three main aspects.
We look at the cells,
working towards developing a commercial cell line;
the molecule itself, by expressing developable, innovative molecules;
and the process, in terms of de-risking the manufacturing, processing,
and purification aspects for our manufacturing facility.
Just as an overview:
why do we need to study the stability of antibody formulations?
How do we assess product formulation stability?
Then an overview of the ASM JMP add-in that we've got,
and the value of using such modeling approaches, with the case studies.
Just to reiterate about biotherapeutics,
so these are all drug molecules that are protein-based.
The most common you'll probably know are all these vaccines from COVID,
so they're all based on antibodies and other proteins.
These are very fragile molecules.
Their stability can be influenced by a variety of factors
during manufacture, transportation, and storage.
Factors such as the temperature
and the pH can cause degradation of the protein.
The concentration of the protein itself can have a significant impact.
The salts used, the salt type, even the salt concentration,
and even exposure to light and a little bit of shear
can have an effect.
What that causes is aggregation of the protein, or fragmentation.
It can cause changes in the charge profiles,
which can then affect the binding and potency of the molecule.
That could be caused by isomerization, oxidation, and a lot more.
They're fragile little things, so we need to keep an eye
on the stability to make sure they are safe and efficacious.
One way of looking at the stability is by subjecting the product to a number
of elevated temperatures and taking various time points.
These long term stability studies can go up to five years,
so they can take up to 60 months.
These give more real time data.
The procedure is extremely resource intensive.
At each time point,
we can use a variety of analytics such as HPLC and mass spectrometry,
so essentially separating the impurities,
the ones that could cause the problems,
away from the main product itself,
and then quantifying via mass spectrometry, light scattering, or just UV profiles.
We can separate by charge, size, or [inaudible 00:03:21].
While we're gathering that data,
a lot can happen in those five years.
We could have better developments in the manufacturing or the formulation.
Does that mean that we have to repeat
that five-year cycle again to get the stability data?
Short answer is no.
What we can do as an alternative is look at accelerated stability studies.
These are more short term studies, so we can apply more exaggerated,
accelerated degradation temperatures and we can use shorter time periods.
Instead of a matter of months and years, we can now look at a matter of days,
so we can go from seven to 14 to 28 days.
This technique is commonly applied in the small molecule space,
but not so much in the large molecule biopharma space,
because the small molecule space
involves a lot of tablets and solid formulations,
and it's only starting to gain traction
in the biopharma industry with more liquid formulations.
In terms of the stability modeling,
we model our data using Arrhenius kinetic equations
that can be either linear or exponential with respect to time.
These are semi-empirical equations
based on the physical nature of the degradation.
For example, an accelerating model:
if there's a nucleation point for aggregation,
it could cause exponential growth.
Conversely, when you look at a decelerating model, where there could be a rate-limiting step,
it could cause slower growth in the degradation.
All of these models are fitted and put through a fit quality assessment
using the Bayesian information criterion, but also we can establish
confidence intervals as well using bootstrapping techniques.
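To make that concrete, here's a minimal sketch in Python of the kind of fitting this involves. It is not the add-in's actual code; the model forms, toy data, and numbers are illustrative only. It fits a linear and an exponential Arrhenius-type kinetic model and compares them by BIC.

```python
# Minimal illustrative sketch (not the add-in's implementation): fit two
# Arrhenius-type kinetic models to toy stability data and compare by BIC.
import numpy as np
from scipy.optimize import curve_fit

R = 8.314          # gas constant, J/(mol K)
T_REF = 298.15     # reference temperature for a stable parameterization, K

def arrhenius_rate(T, ln_k_ref, Ea):
    """Rate constant at temperature T (K), parameterized relative to T_REF."""
    return np.exp(ln_k_ref + (Ea / R) * (1.0 / T_REF - 1.0 / T))

def linear_model(X, ln_k_ref, Ea, c0):
    """Impurity grows linearly with time; rate follows Arrhenius in T."""
    t, T = X
    return c0 + arrhenius_rate(T, ln_k_ref, Ea) * t

def exponential_model(X, ln_k_ref, Ea, c0):
    """Impurity grows exponentially with time (accelerating kinetics)."""
    t, T = X
    return c0 * np.exp(arrhenius_rate(T, ln_k_ref, Ea) * t)

def bic(y, y_hat, n_params):
    """Bayesian information criterion from the residual sum of squares."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    return n * np.log(rss / n) + n_params * np.log(n)

# toy data: % aggregate at two stress temperatures over 28 days
t = np.array([0, 7, 14, 28, 0, 7, 14, 28], dtype=float)              # days
T = np.array([303, 303, 303, 303, 313, 313, 313, 313], dtype=float)  # kelvin
y = np.array([1.0, 1.1, 1.2, 1.4, 1.0, 1.3, 1.7, 2.4])               # % aggregate

for name, model in [("linear", linear_model), ("exponential", exponential_model)]:
    popt, _ = curve_fit(model, (t, T), y, p0=(-5.0, 7.5e4, 1.0), maxfev=20000)
    score = bic(y, model((t, T), *popt), n_params=len(popt))
    print(f"{name:12s} BIC = {score:7.2f}  params = {np.round(popt, 3)}")
```

The model with the lowest BIC would be flagged as the leading candidate, which is essentially what the add-in's fit quality assessment does across its whole family of models.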
This is where David Burnham at Pega Analytics comes in.
He's worked closely with Don Clancy
on developing a JSL script, or JMP add-in, for us scientists to use in the lab.
Just going to give you a quick demonstration of the JMP add-in itself.
If I can exit.
During our ASM study, you collect your data,
and obviously you put it into a JMP table.
In this instance, we're looking at the size exclusion data,
so we're looking at the monomer percentage, the aggregate, and the fragment.
What we've got is a JMP add-in that will do the fitting for us.
If I just quickly open it up.
You go through a series of steps, so you can select the type of product.
In terms of the small molecule space, where they're dealing
with solid tablet formulations,
you could use a model based on Aspirin, which is more of a generic approach,
or a generic tablet model where it's more novel and different to Aspirin.
But for us in biopharma,
because it's all liquid formulations, we look at the generic liquid model.
We're good. Okay.
Then you load in the data,
so just that data table I showed you earlier,
you can also do a quick QC check.
It's just a fancy check that everything matches up,
and ensure that everything is hunky-dory,
or you can just remove it and replace it with a new data table.
The most strenuous part of this JMP add-in is to actually match the columns up.
In terms of the monomer, the aggregate, and the fragment,
those are the impurities that we're interested in modeling.
We'll put those in the impurity columns, as well as matching up
all the other important aspects such as the time, temperature, pH,
and also batch which is something that could be of importance.
If you have a molecule that has different lot releases
and you want to see if all your lots are consistent,
this is one tool that you could possibly use
to ensure that your lot-to-lot variability is consistent.
It also asks if you have specifications,
so if we have a target of having no more than a 5 percent drop
in monomer,
so we're doing 100 minus the monomer in this case.
But in terms of aggregate, we have no more than three percent,
and for fragment, no more than two percent.
In terms of the model options, we can select all the models
that we want to fit and evaluate, and also select a generic
temperature and pH you would like to look at.
You can choose that later on as well.
That can be flexible,
and also the different variants, like temperature only,
or temperature and pH.
When it comes to fitting the data itself,
you can either go for a quick mode, which could be a two-minute quick fit,
which maybe won't be as accurate as the long term mode.
But to save you all from looking at the spinning wheel of death,
I've already fitted the data, so we can go straight into it.
We can fit all the models.
Once it loads up, eventually, you can have an overarching view
of the prediction profilers of each type of model that's been fitted and evaluated.
You can see that some have confidence intervals
that are a little bit broader and some are a little bit tighter,
so it can be either
that the model doesn't fit the data very well or that it could be overfitted.
For scientists, we can then delve into selecting the candidate model.
This is where it's based on the Bayesian information criterion.
Apologies.
You can also look at those two criteria and see which one is more appropriate.
Then you can use the drop-down to see
how the model fits in terms of the actual versus predicted values.
Then last but not least, when you select the preferred model,
this can give you the override in...
Here you go.
You can manually override which model you'd like to choose.
But also, here in the prediction profiler,
you can select the conditions you would like and extrapolate from them.
Beyond the one month period,
you can extrapolate all the way up to two years,
and you can see how it fares in terms of its stability.
One last thing to add is bootstrap techniques.
If you want finer control of how the bootstrapping is working
and to get more accurate modeling of the confidence intervals,
you can simply do that to each of your impurities.
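To illustrate the principle behind that (purely a sketch, not how the add-in implements it), here's a residual bootstrap in Python around the same toy linear Arrhenius model used in the earlier sketch, putting a 95 percent interval on a two-year prediction at 5 degrees C.

```python
# Illustrative residual bootstrap (not the add-in's implementation) to put a
# confidence interval on a long-term prediction from a fitted kinetic model.
import numpy as np
from scipy.optimize import curve_fit

R, T_REF = 8.314, 298.15

def linear_model(X, ln_k_ref, Ea, c0):
    """Zero-order growth with an Arrhenius rate (same form as the earlier sketch)."""
    t, T = X
    k = np.exp(ln_k_ref + (Ea / R) * (1.0 / T_REF - 1.0 / T))
    return c0 + k * t

# same toy data as the earlier sketch: % aggregate at two stress temperatures
t = np.array([0, 7, 14, 28, 0, 7, 14, 28], dtype=float)
T = np.array([303.0] * 4 + [313.0] * 4)
y = np.array([1.0, 1.1, 1.2, 1.4, 1.0, 1.3, 1.7, 2.4])

popt, _ = curve_fit(linear_model, (t, T), y, p0=(-5.0, 7.5e4, 1.0), maxfev=20000)
residuals = y - linear_model((t, T), *popt)

rng = np.random.default_rng(0)
t_pred, T_pred = 730.0, 278.15          # two years, 5 degrees C
boot_preds = []
for _ in range(500):
    # resample residuals onto the fitted curve, refit, record the prediction
    y_boot = linear_model((t, T), *popt) + rng.choice(residuals, size=len(y))
    try:
        popt_b, _ = curve_fit(linear_model, (t, T), y_boot, p0=popt, maxfev=20000)
    except RuntimeError:
        continue                          # skip rare non-converged resamples
    boot_preds.append(linear_model((t_pred, T_pred), *popt_b))

lo, hi = np.percentile(boot_preds, [2.5, 97.5])
print(f"Two-year prediction at 5 C: {np.median(boot_preds):.2f}% aggregate "
      f"(95% bootstrap interval {lo:.2f} to {hi:.2f})")
```

The spread of those resampled predictions is what gives the confidence band around the extrapolation.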
Trust it not to work.
Okay, time to completion. It's done.
Okay.
You can see that you can look into it in great detail.
Okay.
We go back to the presentation.
We'll swap these.
Okay. We'll be going into our first case study,
which is looking at the stability of our formulations.
Formulations play an important role in drugs in general.
Not only do they help biopharmaceuticals
in terms of stabilizing the protein during storage or manufacture,
but they can also help aid drug delivery when administered to a patient.
Formulations contain many components, and these are called excipients.
These are generally inactive components within the drug product itself
but they act as stabilizers, so they could be the buffers,
some amino acids, stabilizers, surfactants, preservatives, metal ions, and salts.
But in terms of the formulation development,
you can screen many of the excipients to find the ideal formulation
and you can use design of experiments, but that's going into a different topic.
But one way to test and prove that your final formulation
is fit for purpose is by doing stability testing.
In our case study, we've looked at three different pHs of formulation
for this monoclonal antibody,
and we stressed them at elevated temperatures.
We looked at our ASM study from time zero all the way up to 28 days
and analyzed them through size exclusion.
As you can see, it's the same snippet of the JMP add-in,
so you can see how the models are being fitted,
but also the extrapolation from the prediction profiler
to see how the monomer stability fares.
When we look at the monomer and aggregate, you can see we can take predictions
from that prediction profiler at 5, 25, 35, and 40 degrees C,
or even other temperatures.
But within that model, we have an N1 value,
and that can reflect on how fast the degradation is,
either in acidic or basic pH.
What we found is a negative value,
so it has faster degradation in acidic pH.
We found that there was a higher risk at low pH rather than at a higher one.
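As an aside on how a coefficient like that N1 can appear, one illustrative way to extend an Arrhenius rate law with a pH term (this is just a sketch of the idea, not necessarily the exact equation the add-in uses) is

\[
k(T,\mathrm{pH}) \;=\; A \exp\!\left(-\frac{E_a}{R\,T}\right)\exp\!\left(N_1\,\mathrm{pH}\right),
\]

where a negative \(N_1\) makes the rate larger at low pH, which is why a minus value corresponds to faster degradation under acidic conditions.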
Our next case study, which is on a similar trend,
is looking at a different monoclonal antibody,
where we used its ASM stability study up to 28 days.
With that same molecule, we had some historical data,
which had five years' worth of stability data.
What we did is just take the data from both studies,
put them into the JMP add-in, and see how they compare.
What you can see highlighted in green
is the model prediction.
On the bottom is the long term study, the real time study.
In blue, you can see that the values fare pretty well.
Then in red are the confidence intervals.
They match up quite nicely, which is good.
But one of the downsides to ASM, because it's a short term study,
is that if you look at the graphs themselves, they seem to be quite linear,
whereas in the real time data, they seem to be a bit more curved and exponential.
But in terms of getting that
data back and the actual prediction, it's quite good.
That could help with some immediate formulation
development work that you need to do
rather than waiting for long term stability data.
I'll pass it on to David.
Thanks a lot, Paul.
Paul's given an example of how we could be making long term
stability predictions based on a month's worth of data.
In that case, though,
you're intending to design your formulation
in order to hit a certain minimum threshold for stability over a given time span.
In this case, what we have is a fixed formulation,
and we're then trying to use this technique to find out
for what period of time the product stays within a defined threshold,
and therefore how long we can hold this material.
The material in question is material that's been generated
during the downstream manufacture of the biotherapeutic.
Essentially with this, you've got different unit operations,
which are linked in series.
What happens is you complete one unit operation,
and then depending on shift patterns or utilization of your facility,
it may be the case that you want to have holds in between different unit operations
in order to regulate the timings of your process.
One of the key things that we need to have here
is to know that, if we're holding our material
in between the unit operations,
what's the maximum period of time we can hold it for
[inaudible 00:14:57]?
The way that we normally look to do this is we have a plot, something like the one
on the lower right hand side,
where we just
hold the material for a month in a small scale study
and just do repeated analytical measurements
of our product quality attributes of interest to see how they change
and whether they fall within tolerances.
In this case, you can see that we're looking at the total as a percentage
over time on the X-axis.
In all cases, it's falling within those red bands,
so we can say it's [inaudible 00:15:32] criteria that we are after.
What we're looking to do in this study
is, rather than looking exclusively at the standard conditions
that the material would be held at,
so that would be 5 degrees C, refrigerated,
or approximately 25 degrees C,
so room temperature.
What we're looking to do is to have parallel studies performed
at 30, 35, and 40 degrees C, but only for a week,
and then see if that high temperature data could be used to predict
the low temperature data.
What we've got in this slide is we've got
a few snapshots from the data that we collected,
which has been visualized within the graph builder.
In the panel box on the left hand side, you've got the data collected
at five degrees C, and the columns in there represent
material coming from different unit operations.
Each row corresponds
to a particular form of analytics that's being conducted,
for example, a measure of the concentration
or the level of [inaudible 00:16:38] species.
On the right hand side, we've then got the equivalent data
for equivalent unit operations and analytics,
but just at a higher temperature.
Just doing a basic plot like this,
the first thing we can see is that the general trends
seem to be consistent.
If we were to look at the purple plots,
in this case, you can see that the first column,
so the first unit operation, we've got a descending straight line,
whereas for the second and third unit operations, you have a slight increase.
Qualitatively, it looks quite promising in that an increase in temperature
isn't causing any changes to the general trends that we're observing.
In terms of a more quantitative prediction,
this is where we then began to use the ASM add-in.
In this case, what we've done is we've taken
the 30, 35, and 40 degrees C data, and we then used that to train the model.
In terms of the model fit quality,
you can see from the predicted versus actual plot
in the center that it appears to fit quite well,
so that's reassuring.
If you look at the model fits in the table on the top left hand side,
we can see that the model fit with the lowest BIC score
was the [inaudible 00:17:57] model.
In this study, we didn't have pH as a variable.
That's why the BIC score for both of those is the same,
because we're essentially removing that parameter.
The linear kinetic model
and its temperature-and-pH variant are essentially equivalent.
What we've then done is use this high temperature data
to fit this kinetic model and determine what these kinetic parameters would be,
so it's K1 and K2
in the kinetic equation, shown on the bottom left of the slide.
We've then changed the temperature value in that to five and 25 degrees C
and tried to predict what level of degradation we'd expect
at that temperature and over a longer period of time.
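To make that extrapolation step explicit, using an illustrative linear form (the actual K1 and K2 equation is the one shown on the slide), the idea is

\[
y(t, T) \;=\; y_0 \;+\; k(T)\,t,
\qquad
k(T) \;=\; k_1 \exp\!\left(-\frac{k_2}{R\,T}\right),
\]

where the constants are fitted against the 30 to 40 degrees C data, and the prediction then keeps those fitted constants and simply re-evaluates the same expression at T = 278 K (5 degrees C) or 298 K (25 degrees C) over a much longer time axis.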
This is what's shown on the right hand side.
We have the red lines corresponding to predictions from this equation
based on the high temperature data,
and that's then overlaid on the experimental data
at lower temperatures just to see how good the prediction is.
In this case, you can see that actually the predictions
appear to be quite good.
It gives you cause for comfort in this case.
Sometimes, however,
we noticed that this wasn't always the case.
In this example, again,
with the high temperatures, we've been able to get
good model fittings, where the predicted versus actual fit
is quite strong.
In this case, however, the model fit that's been
stipulated to be the best is the accelerating kinetic model,
indicating that the reaction rate's getting faster over time.
We then applied the same procedure to this set of data and started trying
to model what would be happening at lower temperatures.
We can begin to see that the prediction is a little bit erratic.
In reality, the increase in
the level of this particular impurity was fairly linear,
but the model was beginning to overestimate it
quite drastically at high time points.
I guess one thing that's important to bear in mind with this
is that you also need to incorporate a level of subject matter
knowledge when applying these kinds of techniques.
You need to have the balance
between what's the best statistical model in terms of fit,
and what's the most physically representative of the type of system
that we're dealing with.
In terms of subject matter knowledge,
another area where it's important within this technique
is the selection of the temperatures that you use in the study
and the temperature range that you're going to look at.
There are two reasons for this, and you've got competing forces.
It's preferable to use as high a temperature as possible
because that means that the reactions proceed at a faster rate.
One of the issues that we often encounter with this type of study
is that at low temperatures,
fortunately for us, we're often dealing with products which are quite stable.
But the inherent problem with that is, when you're trying to use
quite short, quite narrow time series data
in order to measure these changes, you often end up getting caught
in the noise, and your signal to noise ratio ends up being quite low.
That's what's being demonstrated within these plots.
The plots on the left hand side are the kinetic rate fits, or reduced plots.
What you typically expect is that, for entirely temperature driven behavior,
you'd have a straight line.
If you look at the top left plot, you can see that that's the case.
We've got the first four blue points, which are forming a straight line.
Those correspond to 40 down to 25 degrees C.
Across all of those temperatures,
we've got entirely temperature driven behavior.
But then the five degrees C point
on the right hand side of that plot appears to be off.
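The reasoning behind those reduced plots is the standard Arrhenius relationship: taking logs of the rate law gives

\[
\ln k \;=\; \ln A \;-\; \frac{E_a}{R}\cdot\frac{1}{T},
\]

so if degradation is driven purely by temperature, plotting ln k against 1/T should give a straight line with slope \(-E_a/R\), and a point that falls off that line, like the five degrees C point here, needs a closer look.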
But when you dive into the data to find out whether this was
because it genuinely deviates from that temperature dependence,
if you look at the equivalent plot on the right hand panel,
you then begin to see that actually, because there's so much noise in the data,
it's more of a fitting issue rather than a mismatch issue.
If you were to look at the gradients of, say, the red line versus the blue line,
despite the fact that the intercepts can be different,
they can both easily fit that data because there's just too much noise
in the data to really be able to fully understand which one should be applied.
In terms of general conclusions, though, I think what we've been able
to demonstrate for this project is that JMP itself
has a number of powerful built-in tools,
and with knowledge of the JMP Scripting Language,
or someone who can do this for you, those can be compiled
into some form of user friendly package, which can then be used
for quite complex analysis, making it accessible to most users.
It's also demonstrated that by performing statistical fits to semi-empirical models,
we've actually got a lot of tangible benefits from that
and that we're able to make predictions about the future
which in the past, we've not been able to do,
and potentially, significantly reduce our timelines
in terms of identifying liabilities with particular drug products.
Finally, this also demonstrates
the important point that, in areas such as this,
you can't rely exclusively on statistical models.
You also have to incorporate your own subject matter knowledge.
Try and work out which statistical model
or kinetic model, whatever it might be, is the most appropriate
to the situation you've got, and then which of those is the best fit.
In terms of acknowledgements, as mentioned, a lot of this work
has been based on an original paper,
which came out of GSK by Don Clancy, Neil, Rachel, Martin, and John.
This has been extended to a biotherapeutic setting.
We also thank George and Ana
for supplying the data which has been used to build this project,
and Ricky and Gary for the project endorsement
[inaudible 00:24:20].