Process characterization (PC) is the evaluation of process parameter effects on quality attributes for a pharmaceutical manufacturing process that follows a Quality by Design (QbD) framework. It is a critical activity in the product development life cycle for the development of the control strategy. We have developed a fully integrated JMP add-in (BPC-Stat) designed to automate the long and tedious JMP implementation of tasks regularly seen in PC modeling, including automated setup of mixed models, model selection, automated workflows, and automated report generation. The development of BPC-Stat followed an agile approach, first introducing an impressive prototype PC classification dashboard that convinced sponsors of further investment. BPC-Stat modules were then introduced and applied to actual project work, and improved and modified as needed for real-world use. BPC-Stat puts statistically defensible, flexible, and standardized practices in the hands of engineers with statistical oversight. It dramatically reduces the time to design, analyze, and report PC results, and the subsequent development of in-process limits, impact ratios, and acceptable ranges, thus delivering accelerated PC results.

Welcome to our presentation on BPC-Stat:

Accelerating process characterization

with agile development of an automation tool set.

We'll show you our collaborative journey

of the Merck and Adsurgo development teams.

This was made possible by expert insights, management sponsorship,

and the many contributions from our process colleagues.

We can't say that enough.

I'm Seth Clark.

And I'm Melissa Matzke.

We're in the Research CMC Statistics group at Merck Research Laboratories.

And today, we'll take you through the problem we were challenged with,

the choice we had to make,

and the consequences of our choices.

And hopefully, that won't take too long and we'll quickly get to the best part,

a demonstration of the solution, BPC-Stat.

So let's go ahead and get started.

The monoclonal antibody or mAb fed-batch process

consists of approximately 20 steps or 20 unit operations

across the upstream cell growth process and downstream purification process.

Each step can have up to 50 assays

for the evaluation of process and product quality.

Process characterization is the evaluation of process parameter effects on quality.

Our focus is the statistics workflow

associated with the mAb process characterization.

The statistics workflow includes study design using sound DOE principles,

robust data management, and statistical modeling and simulation.

This is all to support parameter classification

and the development of the control strategy.

The goal for the control strategy is to make statements

about how the quality is to be controlled,

to maintain safety and efficacy through consistent performance and capability.

To do this, we use the statistical models developed from the design studies.

Parameter classification is not a statistical analysis,

but it can be thought of as an exercise

in translating statistically meaningful effects into practically meaningful ones.

The practically meaningful effects

will be used to guide and inform SME (subject matter expert) decisions

to be made during the development of the control strategy.

And the translation from statistically to practically meaningful

is done through a simple calculation.

It's the change in the attribute mean, that is, the parameter effect,

relative to the difference between the process mean

and the attribute acceptance criterion.

How much of the acceptance criterion's range

gets used up by the parameter effect

determines whether that parameter has a practically meaningful effect on quality.
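(In symbols, our reading of that calculation is the following, as a minimal sketch; exact conventions vary with one-sided versus two-sided acceptance criteria:

$$ \mathrm{IR} = \frac{\lvert \Delta\hat{\mu}_{\text{attribute}} \rvert}{\lvert \mathrm{AC} - \hat{\mu}_{\text{process}} \rvert} $$

where the numerator is the predicted change in the attribute mean across the parameter's range and AC is the acceptance criterion. An impact ratio near one means the parameter effect consumes essentially the entire gap between the process mean and the limit.)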

So we have a defined problem -

the monoclonal antibody PC statistics workflow,

from study design through control strategy.

How are we going to implement the statistics workflow in a way

that the process scientists and engineers will actively participate in

and own these components of PC, allowing us, the statisticians,

to provide oversight and guidance and to extend our resources?

We had to choose a statistical software package that included data management,

DOE, plotting, and linear mixed model capabilities.

It also had to be extendable through automation,

intuitive, interactive, and fit-for-purpose without a lot of customization.

And JMP was the obvious, clear choice for that.

Why?

Because it has extensive customization and automation through JSL;

for many of our engineers,

it's already their go-to software for statistical analysis;

we have existing experience and training in our organization;

and it has industry-leading DOE with data interactivity.

The profiler and simulator in particular are very well suited for PC studies.

Also, the scripts that are produced are standard, reproducible, portable analyses.

We're using JMP to execute study design,

data management, statistical modeling and simulation

for one unit operation or one step that could have up to 50 responses.

The results of the analysis are moved to a study report

and used for parameter classification.

And we have to do all of this 20 times.

And we have to do this for multiple projects.

So you can imagine it's a huge amount of work.

When we started doing this initially, we were finding that we were doing

a lot of editing of scripts before the automation.

We had to edit scripts to extend or copy existing analyses to other responses,

we had to edit scripts to add where conditions,

we had to edit scripts to extend the models to large scale.

We want to stop editing scripts.

Many of you may be familiar with how tedious the simulator setup can be.

You have to set the distribution for each individual process parameter.

You have to set the Add Random Noise,

and you have to set the number of simulation runs and so on and so on.

There are many different steps,

including the possibility of editing scripts if we want to change

from an internal transform to an explicit transform,

for example, so that the simulator adds random noise

on a log-normal basis for a log-transformed response.

We want to stop that manual setup.
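(To make concrete what the simulator automates, here is a minimal hand-rolled JSL sketch of the same kind of Monte Carlo; the distributions, model coefficients, and acceptance limit below are all hypothetical.)

    // Hand-rolled Monte Carlo defect-rate estimate (all numbers hypothetical)
    Random Seed( 20240520 );              // make the simulation reproducible
    nSim = 10000;
    nFail = 0;
    For( i = 1, i <= nSim, i++,
        temp = Random Normal( 37, 0.5 );  // process parameter distributions
        ph = Random Normal( 7.0, 0.05 );
        yield = 80 + 1.2 * (temp - 37) - 15 * (ph - 7.0)  // assumed model
            + Random Normal( 0, 1.5 );                    // add random noise
        If( yield < 75, nFail++ );        // lower acceptance criterion
    );
    Show( nFail / nSim );                 // estimated defect rate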

Our process colleagues and we were spending enormous amounts of time

compiling information for the impact ratio calculations

that we use to make parameter classification decisions.

We were using profilers to verify those

and assembling all this information into a heat map;

it was then a very tedious exercise

to trace the information back to the original files.

We want to stop this manual parameter classification exercise.

Of course, last but not least, we have to report our results.

And the reporting involves copying,

or at least in the past involved copying and pasting from other projects.

And then, of course, you have copy-and-paste errors:

copying and pasting from JMP,

you might put in the wrong profiler or the wrong attribute, and so on.

We want to stop all this copying and pasting.

We clearly had to deal with the consequences of our choice to use JMP.

The analysis process was labor-intensive, taking weeks, sometimes months

to finish an analysis for one unit operation.

It was prone to human error and usually required extensive rework.

It proved to be an exceptional challenge to train colleagues to own the analysis.

We developed a vision with acceleration in mind

to enable our colleagues

with a standardized yet flexible platform approach

to the process characterization and statistics workflow.

So we had along the way some guiding principles.

As we mentioned before, we wanted to stop editing JMP scripts,

so for any routine analysis, no editing is required.

And our JMP analyses need to stand on their own;

they need to be portable without having to install BPC-Stat,

so that they live on, requiring only the JMP version in which they were created.

We collected constant feedback, we updated constantly,

and we tracked issues relentlessly, updating sometimes daily

and meeting weekly with Adsurgo, our development team.

Our interfaces, we made sure that they were understandable.

We used stoplight coloring: green for good,

yellow for caution, and red for flagged issues.

We had two external approaches or inputs into the system,

an external standardization, which I'll show a bit later,

where the process teams

define the standard requirements for the analysis file.

And our help files, we decided to move them all externally

so that they can continue to evolve as the users and the system evolve.

We broke our development into two major development cycles.

An early development cycle,

where we developed proofs of concept: we would have a problem,

and we would develop a proof-of-concept working prototype to address that problem.

We would iterate on it until it was solving the problem

and it was reasonably stable.

And then we moved it into the late development cycle

and continued the agile development, where in this case,

we applied it to actual project work and did very careful second-person reviews

to make sure all the results were correct, and continued to refine the modules

based on feedback, over and over again, to get to a more stable and better state.

One of these proof of concept wow factors that we did at the very beginning

was the Heat Map tool,

which brought together all kinds of information in one dashboard

and saved the teams enormous amounts of time.

I'll show you an example of this later.

But you can see from the quotes on the right-hand side

that they were very excited about this. They actually helped design it.

And so we got early engagement, we got motivation for additional funding

and a lot of excitement generated by this initial proof of concept.

In summary, we had a problem to solve, the PC statistics workflow.

We had a choice to make and we chose JMP.

And our consequences: copying and pasting, manual work, mistakes, extensive reworking.

We had to develop a solution, and that was BPC-Stat.

I'm extremely happy to end this portion of our talk

and let Seth take you through a demonstration of BPC-Stat.

We have a core demo that we've worked out

and we're actually going to begin at the end.

The end is that proof-of-concept Heat Map tool

that we had briefly shown a picture of. When the scientists have completed

all of their analyses and the files are all complete,

the information for the impact ratio is all available

and collected into a nice summary file,

which I'm showing here.

For each attribute and each process factor,

we have information about the profiler:

its predicted means and ranges as the process parameter changes.

So we can compute changes in the attribute across that process factor,

and we can compute that impact ratio that we mentioned earlier.

Now, I'm going to run this tool and we'll see what it does.

So first, it pulls up one of our early innovations by the scientists.

We organized all this information that we had across multiple studies.

Now, this is showing three different process steps here

and you can see on the x-axis, we have the process steps laid out.

In each process step, we have different studies that are contributing to that,

we have multiple process parameters,

and they are all crossed with these attributes that we use

to assess the quality of the product that's being produced.

And we get this Heat Map here.

The white spaces indicate places

where the process parameter either dropped out of the model or was not assessed.

And then the colored ones are where we actually have models built.

And of course, the intensity of the heat

depends on this practical impact ratio.

This was a great solution for the scientists,

but it still wasn't enough because we had disruptions to the discussion.

We could look at this and say, okay, there's something going on here,

we have this high impact ratio,

then they would have to track down where is that coming from.

Oh, it's in this study. It's this process parameter.

I have to go to this file.

I look up that file, I find the script among the many scripts.

I run the script,

I have to find the response and then finally I get to the model.

We enhanced this so that now it's just a simple click and the model is pulled up.

The relevant model is pulled up below.

You can see where that impact ratio of one is coming from.

Here, the gap between the predicted process mean

and the attribute limit is the space right here.

And the process parameter trend is taking up essentially the entire space.

That's why we get that impact ratio of one.

Then the scientists can also run their simulators

that have been built or are already here, ready to go.

They can run simulation scenarios to see what the defect rates are.

They can play around with the process parameters

to establish proven acceptable ranges that have lower defect rates.

They can look at interactions and the contributions from those.

They can also put in notes over here on the right

and save that in a journal file

to keep track of the decisions  that they're making.

And notice that all of this is designed to support them

in maintaining their scientific dialogue and to prevent interruptions to it.

They can focus their efforts on particular steps.

So if I click on a step, it narrows down to that.

Also, because our limits tend to be in flux,

we have the opportunity to update those

and we can update them on the fly to see what the result is.

And you can see here how this impact changed;

now we have this low impact ratio,

and we can ask, how does that actually look on the model?

The limit's been updated; now you can see there's much more room,

and that's why we get that lower impact ratio and we'll get lower failure rates.

That was the heat map tool. It was a huge win,

and it strongly motivated additional investment in this automation.

I started at the end,

now I'm going to move to the beginning  of the process statistics workflow,

which is study design.

When we work with the upstream team,

they run a lot of bioreactors in each of these studies.

This is essentially a central composite design.

Each of these runs is a bioreactor, and a bioreactor sometimes goes down

because of contamination or other issues, so those runs are essentially missing at random.

So we built a simulation

to evaluate these potential losses, called Evaluate Loss Runs.

And we can specify how many runs are lost.

I'm just going to put something small here for demonstration

and show this little graphic.

What it's doing is going through, randomly selecting points

to remove from the design, and calculating variance inflation factors,

which can be used to assess multicollinearity

and how well we can estimate model parameters.
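(As a rough JSL sketch of one iteration, for readers who want the mechanics; the column names and loss count are hypothetical, and the real tool also covers higher-order terms and non-estimable fits.)

    // Drop k random runs, then compute VIFs as the diagonal
    // of the inverse correlation matrix of the model matrix.
    dt = Current Data Table();
    X = (Column( dt, "Temp" ) << Get Values) ||
        (Column( dt, "pH" ) << Get Values) ||
        (Column( dt, "DO" ) << Get Values);      // main-effect columns only
    k = 2;                                       // bioreactors lost
    n = N Row( X );
    perm = Rank( J( n, 1, Random Uniform() ) );  // random row permutation
    Xr = X[perm[(k + 1) :: n], 0];               // surviving runs
    m = N Row( Xr );
    Xc = Xr - J( m, 1, 1 ) * (J( 1, m, 1 ) * Xr / m); // center each column
    C = Xc` * Xc;                                // cross-product matrix
    s = Sqrt( VecDiag( C ) );
    vif = VecDiag( Inv( C :/ (s * s`) ) );       // inverse correlation diagonal
    Show( vif );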

And when it's done, it generates a summary report.

This one's not very useful because I had very few simulations,

but I have another example here.

This is 500 simulations on a bigger design.

And you can see we get this nice summary here.

If we lose one bioreactor, we have essentially a zero probability of

getting extreme variance inflation factors or non-estimable parameters.

And so that's not an issue.

If we lose two bioreactors, that probability rises to about 4%, and that's starting to become an issue.

So we might say that for this design, two bioreactors is about the loss it can tolerate.

And if we really wanted to get into the details,

we can see how each model parameter is impacted by variance inflation

for a given number of bioreactors lost,

or we can rank-order all the combinations of lost bioreactors to see

which specific design points impact the design the most,

and do further assessments like that.

That's our simulation and that's now a routine part of our process that we use.

I'm going to move on here to standardization.

We talked about the beginning  of the process statistics workflow,

the end of the process statistics workflow,

now I'm going to go back to what is the beginning of BPC-Stat itself.

When people install this, they have to do a setup procedure.

And the setup is basically specifying the preferences.

It specifies those input standardizations

that we had talked about earlier,

as well as the help file, what process area they're working with,

and default directory that they're working with.

And then that information is saved into the system and they can move forward.

Let me just show some examples here of the standardization files and the Help file.

The Help file can also be pulled up under the Help menu.

And of course, the Help file is the JMP file itself.

But notice this Location column:

these are all videos that we've created, and they're all timestamped,

so users can just find the feature they're looking for.

Click on it, and it immediately pulls up the video.

But what's even more exciting about this is that in all the dialogues of BPC-Stat,

when we pull up a dialogue and there's a Help button there,

it knows which row in this table to go to for the help.

If I click that Help button, it will automatically pull up

the associated link and training to give immediate help.

That's our Help file.

The standardization.

We have standardizations that we work out with the teams,

standardizing either across projects or within a specific project,

depending on the needs and on the process areas.

We had this problem early on that we weren't getting consistent naming

and it was causing problems and rework.

Now, we have this standardization put in place.

It also covers the reporting decimals that we need to use,

the minimum recorded decimals, the names we use when we write this up

in a report, the units, and then a default transform to apply.

That's our attribute standardization.

And then our group standardization is very similar, identifying columns,

except we have this additional feature where we can require

that only specific levels be present; otherwise they will be flagged.

We can also require that they have a specific value ordering.

So, for example, the process steps are always listed in process-step order,

which is what we need in all our output.

Okay, so I'm going to show an example of this.

Let me see if I can close

some of the previous stuff that we have open here.

Okay, so let me go to this example.

So here's the file. The data has been compiled.

We think it's ready to go, but we're going to make sure it's ready to go.

So we're going to do the standardization.

The first thing is the attribute standardization,

which recognizes the attributes that adhere to the standard names.

And then what's left over is up here.

And we can see immediately these are process parameters,

which we don't have the standardization set up for.

But we see immediately that something's wrong with this,

and we see in the list of possible attributes that the units are missing.

We can correct that very easily.

It will tell us what it's doing, and we're going to make that change.

And then it generates a report

with that stoplight coloring and says: we found these.

This is a change we made, pay attention to this caution.

These are ones we didn't find.

And this report is saved back  to the data table

so it can be reproduced on demand.

And I'll go through the group standardization

just to take a quick look at that.

Here, it's telling me, with red stoplight coloring:

We have a problem here, you're missing these columns.

The team has required that this information be included.

It's going to force those columns onto the file.

We have the option with the yellow to add additional columns.

And so we'll go ahead and run that, and it's telling us what it's going to do.

And then it does the same thing, creates a report.

And we look through the report and we notice something's going on here.

Process scale.

Our process scale can only have the values large, lab, or micro.

Guess what, we have a labbb. We have an extra B in there.

So that's an error. We find that value and correct it,

rerun the standardization, and everything is good there.

I did want to point out one more thing here.

You'll see that on these attributes,

there are these little stars indicating properties.

The property that was assigned when we did the standardization

is this custom table-decimals property.

And that's going to pass information  to the system

about what the reporting precision is when it generates tables.

Also, our default transformation for HCP was the log,

so it automatically created the log transform for us.

So we don't have to do that.
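(In JSL terms, that stamping and reading back of properties looks roughly like the following; the property names here are invented for illustration, not BPC-Stat's actual ones.)

    col = Column( Current Data Table(), "HCP" );
    col << Set Property( "BPC Report Decimals", 1 );       // read at report time
    col << Set Property( "BPC Default Transform", "Log" );
    Show( col << Get Property( "BPC Report Decimals" ) );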

Okay.

That's the standardization; let's move on to much more interesting things now.

The PC analysis.

Before I get to that, I just want to mention that

we have a module for scaled-down model qualification.

And essentially, it's using JMP's built-in equivalence testing.

But it enhances it by generating some custom plots

and summarizing all that information in a table that's report-ready.

It's beautiful.

Unfortunately, we don't have time to cover that.

I'm going to go now into the PC analysis, which I'm really excited about here.

I have this.

Standardization has already been done.

We have this file that contains lab results,

an experimental design that's been executed at lab scale.

We also have large-scale data in here.

It's not feasible to execute DOEs at large scale,

but we have this data at the control point.

We want to be able to project our models to that large scale.

And because we have different subsets and potentially different models

(this one has only a single model,

but we can have different models and different setups),

we decided to create a BPC-Stat workflow, and we have a workflow setup tool

that helps us build it based on the particular model we're working with.

I can name each of these workflows and I provide this information

that it's going to track throughout the whole analysis.

What is our large-scale setting, what are our responses?

Notice this is already populated for me.

It looked at the file and said, oh, I know these responses,

they're in your standardized file, they're in this file, they exist.

I assume you want to analyze these, and they get pre-populated.

It also recognized this as a transform

because it knows that for that HCP, we want that on a log transform.

And it's going to do an internal transform,

which means JMP will automatically back transform it

so that scientists can interpret it on the original scale.
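(A minimal sketch of that internal transform, using JMP's Transform Column inside the launch; the column and effect names are assumed.)

    Fit Model(
        Y( Transform Column( "Log HCP", Formula( Log( :HCP ) ) ) ), // fit on log scale
        Effects( :Temp, :pH ),
        Personality( "Standard Least Squares" ),
        Run
    );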

There are some additional options here. This PC block right now is fixed.

In some cases, the scientists want to look at the actual PC Block means.

But for the simulation, we're interested in a population- type simulation.

We don't want to look at specific blocks, we want to see what the variability is.

So we're going to change that PC Block factor

into a random effect when we get to the simulation.

And we're going to add a process scale to our models

so we can extend our model to a large scale.
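(For the simulation model, that change amounts to something like the following sketch, with the block given the random attribute and scale added as a fixed effect; all names are assumed.)

    Fit Model(
        Y( :Yield ),
        Effects( :Temp, :pH, :Process Scale, :PC Block & Random ),
        Personality( "Standard Least Squares" ),
        Method( "REML" ),
        Run
    );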

The system will review the different process parameters and check the coding.

If there are some issues here or coding is missing,

it will automatically flag that with the stoplight coloring.

Then we have here the set point. Setting it manually is a very tedious, annoying exercise.

We constantly want to show everything at the set point in the profilers,

because that's our control,

not the default mathematical center that JMP calculates.

So we built this information in so that it could be automatically added.
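(In a saved script, those set points land in the profiler's Term Value settings, roughly as below; the factor names and values are hypothetical, and in practice this sits inside the model's saved Run block.)

    Fit Model(
        Y( :Yield ),
        Effects( :Temp, :pH ),
        Personality( "Standard Least Squares" ),
        Run(
            Profiler(
                1,
                Term Value(
                    Temp( 37, Lock( 0 ), Show( 1 ) ),  // the process set point,
                    pH( 7.1, Lock( 0 ), Show( 1 ) )    // not the mathematical center
                )
            )
        )
    );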

And then we can define the subsets for an analysis.

And for that, we use a data filter.

I'll show this data filter here,

and there's an explanation of it in the dialogue.

But we want to do summary statistics on the small scale.

So I go ahead and select that.

It gives feedback on how many rows are in that subset

and what the subset is so we can double-check that that's correct.

And then for the PC analysis, in this case,

I have the model set up so that, of course,

it's going to analyze the DOE with the center point,

but it's also going to include this single-factor study,

or what they call the OFAT and SDM block,

with separate center points that were run as another block.

And that's all built into the study as another block, the PC block.

Lastly, I can specify subsets for confirmation points,

which they like to call verification points,

to check and to see how well the model is predicting.

We don't have those in this case.

And as for our subset for large scale,

that would include both the lab and the large-scale data.

Since it's all the data in this case, I don't have to specify any subset.

Now, I have defined my workflow.

I click okay, and it saves all that information right here as a script.

If I right- click on that, edit it, you can see what it's doing.

It's creating a new namespace.

It's got the model in there, it's got all my responses,

and everything I could need for this analysis.
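(The shape of such a workflow script is roughly the following hand-written sketch; the member names are invented for illustration, and BPC-Stat's real script stores more.)

    bpcWf = New Namespace( "PCWorkflowDemo" );
    bpcWf:responses = {"Yield", "HCP", "Aggregates"};
    bpcWf:largeScaleLevel = "2000 L";
    bpcWf:modelEffects = Expr( Effects( :Temp, :pH, :Temp * :pH, :PC Block ) );
    bpcWf:subsets = ["Summary stats" => "Process Scale == \!"lab\!""];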

As soon as you see this, you start thinking, well,

if I have to add another response, I can stick another response in here.

But that violates the principle of no script editing.

Well, sometimes we do it, but don't tell anybody.

What we did is we built a tool with a workflow editor

that allows us to go back into that workflow

through point and click, change some of its properties, and change the workflow.

I'm going to go ahead now and do the analysis.

And this is where the magic really comes in.

When I do the PC analysis set up,

it's going to go take that workflow information and apply it

across the entire set of scripts that we need for our analysis.

And you see what it just did there.

It dropped a whole bunch of scripts. It grouped them for us.

Everything is ready to go.

It's a step-by-step process that the scientists can follow through.

If there are scripts that are not applicable,

they can remove those scripts and they're just gone.

We don't worry about them.

And then for the scripts that are present, we have additional customization.

These are essentially generator scripts.

And you can see it generates a dialogue

that's already pre-populated with what we should need,

but we have additional flexibility if we need it.

And then we can get our report and we can enhance it as we need to,

in this case, with subsets I may want to include.

And then I resave the script back to the table, replacing the generator script.

Now, I have a rendered script here that I can use that's portable.

Then for the PC analysis, we have data plots.

Of course, we want to show our data.

Always look at your data, generate the plots.

There's a default plot that's built.

We only did one plot

because we wanted the user to have the option to change things,

so they might go in here and, say, get rid of that title.

I just changed the size and added a legend.

They can change the entire plot if they want to.

And then one of their all-time favorite features of BPC-Stat

seems to be this repeat analysis.

Once we have an analysis we like, we can repeat it.

And what this is doing is it's hacking the column switcher

and adding some extra features onto it.

It'll take the output, dump it in a vertical list box or tab box,

and allow us to apply our filtering either globally or to each analysis.

Now, I'm in the column switcher,

and I can tell it which columns I want it to run.

This works for any analysis, not just plotting.
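(Under the hood, the repeat rides on JMP's Column Switcher, which JSL can attach to any platform; a bare-bones version with assumed response names.)

    gb = Graph Builder(
        Variables( X( :Temp ), Y( :Yield ) ),
        Elements( Points( X, Y ) )
    );
    // swap the Y column across the full response list
    cs = gb << Column Switcher( :Yield, {:Yield, :HCP, :Aggregates} );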

Click OK.

It runs through the switcher, generates the report.

There I have it. All my responses are plotted.

That was easy.

I go down and there's the script that recreates that.

I can drop it here, get rid of the previous one.

Done.

Descriptive statistics. Here we go.

It's already done.

I have the subsetting applied, so I have the tables I need.

Look at this.

It's already formatted to the number of decimals I needed

because it's taking advantage of those properties that we had assigned,

those unique properties based on this table standardization.

So that one is done.

And then the full model.

The full model is set up, ready to go. For what, would you think?

It's ready to go for residual assessment.

We can go through each of the models one at a time

and take a look to see how the points are behaving, the lack of fit.

Does it look okay?

Here, we have one point that may be a little bit off that we might want to explore.

Auto recalc is already turned on,

so I can do a row exclude and it will automatically update this.

Or we have a tool that will exclude the data point in a new column

so that I can analyze it side by side.

And then since I've already specified my responses,

in order to include that side by side,

I would have to go back and modify my workflow.

And we have that workflow editor to do that.

I'm just going to skip ahead to save some time where I've already done that.

This is the same file, same analysis,

but I've added an additional response and it's right here.

Yield without run 21.

Now, scientists can look at this side by side and say,

you know what, that point, yeah,  it's a little bit unusual statistically,

but practically there's really no impact.

All right, let's take this further.

This is our routine process.

We do take it all the way through the reduced model

because we want to see if it impacts model selection.

We have automated the model selection,

and it takes advantage of the existing stepwise platform for forward AIC,

or the existing effects table, where you can click to remove terms manually

by backward selection if you want; this automates the backward selection,

which we typically use for split-plot designs.

We also have a forward selection for mixed models, which is not currently

a JMP feature and which we find highly useful.
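(For the fixed-effects case, the forward AIC piece rides on JMP's Stepwise personality; a minimal sketch with an assumed effect list.)

    step = Fit Model(
        Y( :Yield ),
        Effects( :Temp, :pH, :DO, :Temp * :pH ),
        Personality( "Stepwise" ),
        Run
    );
    step << Direction( "Forward" );
    step << Stopping Rule( "Minimum AICc" );
    step << Go;        // run the selection
    step << Run Model; // fit the selected model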

Since it's a fixed model, I'm going to go ahead

and do that, and it gets the workflow information.

I know I need to do this on the full model.

It goes ahead and does the selection.

What it's doing in the background here is it's running each model,

it's constructing a report of the selection that it's done

in case we want to report that.

And it's going to save those scripts back to the data table.

There's that report right there that contains all the selection process.

Those scripts were just generated and dumped back here.

Now, I can move those scripts back into my workflow section.

I know the reduced model goes in there

and this is my model selection step history.

I can put that information in there.

Okay, so this is great.

Now, when I look at my reduced model, it has gone through the selection,

and I can see the impact of the selection on removing this extra point here.

And again, we see there's basically no difference;

the scientists would likely conclude there's just no practical difference here.

And they could, and should,

go look at the interaction profilers as well and compare them side by side.

This is great.

We want to keep this script because we want to keep track

of the decisions that we made, so there's a record of that.

But we also want to report the final model.

So we want a nice, clean report.

We don't want that without-run-21 response in there

because we've decided that it's not relevant,

but we need to keep all the data.

Another favorite tool that we developed is the Split Fit Group,

which allows us to break up these fit groups.

We have the reduced model here; take the reduced model.

It allows us to break it up into as many groups as we want.

In this case, we're only going to group it into one group

because we're going to eliminate one response.

We want one group.

When we're done, we're just using this to eliminate this response

we no longer want in there.

Click Okay.

There's some feedback from the model fitting, and boom, we have it.

The fit group is now there, and the without-run-21 response analysis has been removed.
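(A fit group is just several Fit Model launches wrapped in one Fit Group call, so splitting one amounts to re-emitting a subset; the responses here are assumed.)

    Fit Group(
        Fit Model( Y( :Yield ), Effects( :Temp, :pH ),
            Personality( "Standard Least Squares" ), Run ),
        Fit Model( Y( :HCP ), Effects( :Temp, :pH ),
            Personality( "Standard Least Squares" ), Run )
    ); // the dropped response's model is simply omitted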

Now we have this ready to report.

Notice the settings for the profiler: they're all the settings we specified.

It's not at the average point or the center point,

it's at the process set point,

which is where we need it to be for comparison purposes.

It's all ready to go.

Okay, so that generates that script there when I did that split.

I can put it up here and I can just rename that.

That's final models.

Okay, very good.

Now, for some real fun.

Remember we had talked about how tedious it is

to set up this simulation script.

Now watch this. Watch how easy this is for the scientists.

And before I do this, I want to point out that this was,

of course, created by a script,  obviously JSL,

but this is a script that creates a script that creates a script.

So this was quite a challenge for Adsurgo to develop.

But when I run this, I can pick my model here,

final models, and then in just a matter of seconds,

it generates the simulation script that I need.

I run that, and boom. There it is, all done.

It set up the ranges that I need for the process parameters.

It's set them up to the correct intervals. It set the Add Random Noise.

But there's even more going on here than what appears.

Notice that the process scale has been added,

we didn't have that in the model before.

That was something that was added so that we could take these lab-scale

DOE models and extend them to the large scale.

Now we're predicting at large scale. That's important.

That was a modification to the model.

Previously, very tedious editing of the script was required to do that.

Notice that we also have this PC block random effect in here, which we had specified

because we don't want to simulate specific blocks;

now it's an additional random effect.

And the total variance is being plugged into the standard deviation

for the Add Random Noise, not the default residual noise.

We also added this little Set Seed here so we can reproduce our analysis exactly.

So this is really great.

And again, notice that we're at the process set point where it should be.

Okay, last thing I want to show here is the reporting.

We've essentially completed the entire analysis;

you can see it's very fast.

We want to report these final models out into a statistics report.

And so we have a tool to do that.

And this report starts with the descriptive statistics.

I'm going to run that first,

and then we're going to go and build the report, export stats to Word.

And then I have to tell it which models I want to export.

It's asking about verification plots.

We didn't have any in this case for confirmation points.

So we're going to skip that.

And then it defaults to the default output directory that we set.

I'm going to open the report when I'm done.

And this is important.

We're leaving the journal open for saving and modification.

Because as everybody knows, when you copy stuff,

you generate your profilers,

you dump them in Word, and there's some clipping going on.

We may have to resize things,

we may have to put something on a log scale.

We can do all that in the journal and then just resave it back to Word.
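(That journal-then-Word flow is native JMP; a minimal sketch, with a hypothetical path.)

    fm = Fit Model( Y( :Yield ), Effects( :Temp, :pH ),
        Personality( "Standard Least Squares" ), Run );
    (fm << Report) << Journal;   // copy the report into the journal
    jrn = Current Journal();
    // touch up sizes, axes, log scales, etc. in the journal, then:
    jrn << Save MSWord( "$TEMP/pc_stats_report.doc" );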

That saves a step. So we generate that.

I click okay here.

It's reading the different tables and the different profilers,

and it's generating this journal up here.

That's actually the report that it's going to render in Word.

And it will be done in just a second here.

Okay.

And then just opening up Word.

And boom, there's our report.

So look at what it did.

It put captions here, it put our table in.

It's already formatted to the reporting precision that we need.

It has this footnote that it added, meeting our standards.

And then for each response, it has a section.

And then the section has the tables with their captions, and the profilers,

and interaction profiler, footnotes, et cetera.

And it repeats like that for each attribute.

It also generates some initial text

that the scientists can update with some summary statistics.

And so it's pretty much ready to go and highly standardized.

That completes the demo of the system.

Now, I just have one concluding slide that I want to go back to here.

So, in conclusion, BPC-Stat, it's added value to our business.

It's enabled our process teams. It's parallelized the work.

It's accelerated our timelines.

We've implemented a standardized yet flexible, systematic approach

with higher, faster acceleration and much more engagement.

Thank you very much.

Published on ‎05-20-2024 07:52 AM by | Updated on ‎07-23-2025 11:14 AM

Process characterization (PC) is the evaluation of process parameter effects on quality attributes for a pharmaceutical manufacturing process that follows a Quality by Design (QbD) framework. It is a critical activity in the product development life cycle for the development of the control strategy. We have developed a fully integrated JMP add-in (BPC-Stat) designed to automate the long and tedious JMP implementation of tasks regularly seen in PC modeling, including automated set up of mixed models, model selection, automated workflows, and automated report generation. The development of BPC-Stat followed an agile approach by first introducing an impressive prototype PC classification dashboard that convinced sponsors of further investment. BPC-Stat modules were introduced and applied to actual project work to both improve and modify as needed for real world use. BPC-Stat puts statistically defensible, flexible, and standardized practices in the hands of engineers with statistical oversight. It dramatically reduces the time to design, analyze, and report PC results, and the subsequent development of in-process limits, impact ratios, and acceptable ranges, thus delivering accelerated PC results.

Welcome to our presentation on BPC- Stat:

Accelerating process characterization

with agile development of an automation tool set.

We'll show you our collaborative journey

of the Merck and Adsurgo development teams.

This was made possible by expert insights, management sponsorship,

and the many contributions from our process colleagues.

We can't overstate that enough.

I'm Seth Clark.

And I'm Melissa Matzke.

We're in the Research CMC Statistics group at Merck Research Laboratories.

And today, we'll take you through the problem we were challenged with,

the choice we had to make,

and the consequences of our choices.

And hopefully, that won't take too long and we'll quickly get to the best part,

a demonstration of the solution BPC- Stat.

So let's go ahead and get started.

The monoclonal antibody or mAb fed- batch process

consists of approximately 20 steps or 20 unit operations

across the upstream cell growth process and downstream purification process.

Each step can have up to 50 assays

for the evaluation of process and product quality.

Process characterization is the evaluation of process parameter effects on quality.

Our focus is the statistics workflow

associated with the mAb process characterization.

The statistics workflow includes study design using Sound DOE principles,

robust data management, statistical modeling and simulation.

This is all to support parameter classification

and the development of the control strategy.

The goal for the control strategy is to make statements

about how the quality is to be controlled,

to maintain safety and efficacy through consistent performance and capability.

To do this, we use the statistical models developed from the design studies.

Parameter classification is not a statistical analysis,

but it can be thought of as an exercise

of translating statistically meaningful to practically meaningful.

The practically meaningful effects

will be used to guide and inform SME  (subject matter expert) decisions

to be made during the development of the control strategy.

And the translation from statistically to practically meaningful

is done through a simple calculation.

It's the change of the attribute mean, that is the parameter effect,

relative to the difference in the process mean

and the attribute acceptance criteria.

And depending on how much  of the acceptance criteria range

gets used by the parameter effect

determines whether that process has a practically meaningful effect on quality.

So we have a defined problem -

the monoclonal antibody  PC statistics workflow

and study design through control strategy.

How are we going to implement the statistics workflow in a way

that the process scientists and engineers will actively participate in

and own these components of PC, allowing us, the statisticians,

to provide oversight and guidance and allow us to extend our resources.

We had to choose a statistical software that included data management,

DOE, plotting, linear mix model capabilities.

Of course, it was extendable through automation,

intuitive, interactive and fit- for- purpose without a lot of customization.

And JMP was an obvious clear choice for that.

Why?

Because it has extensive customization and automation through JSL;

many of our engineers,

it's already their current go- to software for statistical analysis;

we have existing experience training in our organization;

and it's an industry- leading DOE with data interactivity.

The profiler and simulator in particular is very well suited for PC studies.

Also, the scripts that are produced are standard, reproducible, portable analysis.

We're using JMP to execute study design,

data management, statistical modeling and simulation

for one unit operation or one step that could have up to 50 responses.

The results of the analysis are moved to a study report

and used for parameter classification.

And we have to do all these 20 times.

And we have to do this for multiple projects.

So you can imagine it's a huge amount of work.

When we started doing this initially, we were finding that we were doing

a lot of editing of scripts before the automation.

We had to edit scripts to extend a copy existing analysis to other responses,

we had to edit scripts to add where conditions,

we had to edit scripts to extend the models to large scale.

We want to stop editing scripts.

Many of you may be familiar with the simulator set up can be very tedious.

You have to set the distribution for each individual process parameter.

You have to set the add random noise,

and you have to set the number of simulation runs and so on and so on.

There are many different steps,

including the possibility of editing scripts if we want to change

from an internal transform to the explicit transform.

The simulator is doing add random noise

on a log- normal type basis, for example, the log transform.

We want to stop that manual set up.

Our process colleagues and us we're spending enormous time

compiling information for the impact ratio calculations

that we use to make parameter classification decisions.

We are using profiles to verify those

and assembling all this information into a heat map

that then would be very tedious exercise

to decompose the information back to the original files.

We want to stop this manual parameter classification exercise.

Of course, last but not least, we have to report our results.

And the reporting involves copying,

or at least in the past involved copying and pasting from other projects.

And then, of course, you have copy and paste errors,

copy and pasting from JMP,

you might put the wrong profile or the wrong attribute and so on.

We want to stop all this copying and pasting.

We clearly had to deal with the consequences of our choice to use JMP.

The analysis process was labor- intensive, taking weeks, sometimes months

to finish an analysis for one unit operation.

It was prone to human error and usually required extensive rework.

It posed to be an exceptional challenge to train colleagues to own the analysis.

We developed a vision with acceleration in mind

to enable our colleagues

with a standardized yet flexible platform approach

to the process characterization and statistics workflow.

So we had along the way some guiding principles.

As we mentioned before, we wanted to stop editing JMP scripts

so any routine analysis, no editing is required.

And our JMP analysis, they need to stand on their own,

they need to be portable without having to install BPC-Stats

so that they live on only requiring the JMP version they were used.

We collected constant feedback, we constantly updated,

we constantly tracked issues relentlessly, updating sometimes daily,

meeting weekly with Adsurgo our development team.

Our interfaces, we made sure that they were understandable.

We use stoplight coloring such as green for good,

yellow for caution, and issues flagged with red.

We had two external approaches or inputs into the system,

an external standardization, which I'll show a bit later,

where the process teams

define the standard requirements for the analysis file.

And our help files, we decided to move them all externally

so that they can continue to evolve as the users in the system evolved.

We broke our development into two major development cycles.

Early development cycle,

where we develop proof of concepts so we would have a problem,

we would develop a proof of concept working prototype to address that problem.

We would iterate on it until it was solving the problem

and it was reasonably stable.

And then we moved it into the late development cycle

and continued the agile development, where in this case,

we applied it to actual project work and did very careful second- person reviews

to make sure all the results were correct and continue to refine the modules

based on feedback over and over again to get to a more stable and better state.

One of these proof of concept wow factors that we did at the very beginning

was the Heat Map tool,

which brought together all kinds of information in the dashboard

and it saved the team's enormous amounts of time.

I'll show you an example of this later.

But you can see the quotes on the right- hand side,

they were very excited about this. They actually helped design it.

And so we got early engagement, we got motivation for additional funding

and a lot of excitement generated by this initial proof of concept.

In summary, we had a problem to solve, the PC statistics workflow.

We had a choice to make and we chose JMP.

And our consequences copying and paste, manual, mistakes, extensive reworking.

We had to develop a solution, and that was BPC-Stat.

I'm extremely happy to end this portion of our talk

and let Seth take you through a demonstration of BPC-Stat.

We have a core demo that we've worked out

and we're actually going to begin at the end.

In the end, which is that proof of concept Heat Map tool

that we had briefly shown a picture of is when the scientists have completed

all of their analysis and the files are all complete,

the information for impact ratio is all available

and collected into a nice summary file,

which I'm showing here.

Where for each attribute in each process factor,

we have information about the profiler,

it's predicted means and ranges as the process parameter changes.

So we can compute changes across that process factor in the attribute

and we can compute that impact ratio that we mentioned earlier.

Now, I'm going to run this tool and we'll see what it does.

So first, it pulls up one of our early innovations by the scientists.

We organized all this information that we had across multiple studies.

Now, this is showing three different process steps here

and you can see on the x- axis, we have the process steps laid out.

In each process step, we have different studies that are contributing to that,

we have multiple process parameters,

and they are all crossed with these attributes that we use

to assess the quality of the product that's being produced.

And we get this Heat Map here.

The white spaces indicate places

where the process parameter either dropped out of the model or was not assessed.

And then the colored ones are where we actually have models built.

And of course, the intensity of the heat

is depending on this practical impact ratio.

This was great solution for the scientists,

but it still wasn't enough because we had disruptions to the discussion.

We could look at this and say, okay, there's something going on here,

we have this high impact ratio,

then they would have to track down where is that coming from.

Oh, it's in this study. It's this process parameter.

I have to go to this file.

I look up that file, I find the script among the many scripts.

I run the script,

I have to find the response and then finally I get to the model.

We enhance this so that now it's just a simple click and the model is pulled up.

The relevant model is pulled up below.

You can see where that impact ratio of one is coming from.

Here, the gap between the predicted process mean

and the attribute limit is the space right here.

And the process parameter trend is taking up essentially the entire space.

That's why we get that impact ratio of one.

Then the scientists can also run their simulators

that have been built  or are already here, ready to go.

They can run simulation scenarios to see what the defect rates are.

They can play around with the process parameters

to establish proven acceptable ranges that have lower defect rates.

They can look at interactions, the contributions from that.

They can also put in notes over here on the right

and save that in a journal file

to keep track of the decisions  that they're making.

And notice that all of this is designed to support them,

maintaining their scientific dialogue, and prevent the interruption to that.

They can focus their efforts on particular steps.

So if I click on a step, it narrows down to that.

Also, because our limits tend to be in flux,

we have the opportunity to update those

and we can update them on the fly to see what the result is.

And you can see here how this impact changed

and now we have this low- impact ratio

and say, how does that actually look on the model?

The limit's been updated now, you can see there's much more room

and that's why we get that lower impact ratio and we'll get lower failure rates.

That was the heat map tool and it was a huge win

and highly motivated additional investment into this automation.

I started at the end,

now I'm going to move to the beginning  of the process statistics workflow,

which is in design.

When we work with the upstream team,

they have a lot of bio reactors that they run in each of these runs.

This is essentially a central composite design.

Each of these runs is a bio reactor and that bio reactor sometimes goes down

because of contamination or other issues that essentially are missing at random.

So we built a simulation

to evaluate this to potential losses called evaluate loss runs.

And we can specify how many runs are lost.

I'm just going to put something small here for demonstration

and show this little graphic.

What it's doing is it's going through randomly selecting points

to remove from the design in calculating variance inflation factors,

which can be used to assess multi linearity

and how well we can estimate model parameters.

And when it's done, it generates a summary report.

This one's not very useful because I had very few simulations,

but I have another example here.

This is 500 simulations on a bigger design.

And you can see we get this nice summary here.

If we lose one bio reactor, we have essentially a zero probability of

getting extreme variance inflation factors or non- estimable parameter estimates.

And so that's not an issue.

If we lose two bio reactors up to about 4%, that's starting to become an issue.

So we might say for this design, two bio reactors is a capability of loss.

And if we really wanted to get into the details,

we can see how each model parameters impacted on variance inflation

for given number of bioreactors lost,

or we can rank order all the combinations of bioreactors that are lost,

which specific design points are impacting the design the most,

and do further assessments like that.

That's our simulation and that's now a routine part of our process that we use.

I'm going to move on here to standardization.

We talked about the beginning  of the process statistics workflow,

the end of the process statistics workflow,

now I'm going to go back to what is the beginning of BPC-Sta t itself.

When people install this, they have to do a setup procedure.

And the setup is basically specifying the preferences.

It's specifying those input parameter,

input standardizations that we had talked about earlier,

as well as the help file, what process area they're working with,

and default directory that they're working with.

And then that information is saved into the system and they can move forward.

Let me just show some examples here of the standardization files and the Help file.

Help file can also be pulled up under the Help menu.

And of course, the Help file is the JMP file itself.

But notice that it has in this location column,

these are all videos that we've created that explain and they're all timestamped

and so users can just figure out what they're looking for, what feature.

Click on it, immediately pull up the video.

But what's even more exciting about this  is in all the dialogues of BTC-Stat,

when we pull up a dialogue and there's a Help button there,

it knows which row in this table to go to get the help,

and it will automatically pull up.

If I click that Help button, it will automatically pull up

the associated link and training to give immediate help.

That's our Help file.

The standardization.

We have standardizations that we work with the teams

to standardize either across projects or within a specific project,

depending on the needs and for process areas.

We had this problem early on that we weren't getting consistent naming

and it was causing problems and rework.

Now, we have this standardization put in place.

Also, the reporting decimals that we need to use,

the minimum recorded decimals, what names we use when we write this

in a report, our unit, and then a default transform to apply.

That's our attribute standardization.

And then for our group standardization, it's very similar identifying columns,

except we have this additional feature here that we can require

only specific levels be present and otherwise will be flagged.

We can also require that they have a specific value ordering.

So let, for example, the process steps are always listed in process step order,

which is what we need in all our output.

Okay, so I'm going to show an example of this.

Let me see if I can close

some of the previous stuff that we have open here.

Okay, so let me go to this example.

So here's the file. The data has been compiled.

We think it's ready to go, but we're going to make sure it's ready to go.

So we're going to do the standardization.

First thing is looking at attribute

that recognizes the attributes that are erred the standard names.

And then what's left is up here.

And we can see immediately these are process parameters,

which we don't have the standardization set up for.

But we see immediately that something's wrong with this,

and we see in the list of possible attributes that the units are missing.

We can correct that very easily.

It will tell us what it's doing, we're going to make that change.

And then it generates a report,

and then that stoplight coloring and says, oh, we found these.

This is a change we made, pay attention to this caution.

These are ones we didn't find.

And this report is saved back  to the data table

so it can be reproduced on demand.

And I'll go through the group standardization

just to take a quick look at that.

Here, it's telling me red, stop light coloring.

We have a problem here, you're missing these columns.

The team has required that this information be included.

It's going to force those columns onto the file.

We have the option with the yellow to add additional columns.

And so we'll go ahead and run that, and it's telling us what it's going to do.

And then it does the same thing, creates a report.

And we look through the report and we notice something's going on here.

Process scale.

Our process scale can only have large lab micro.

Guess what, we have a labbb. We have an extra B in there.

So that's an error. If we find that value, correct it.

Rerun the standardization and everything is good there.

I did want to point out one more thing here.

You'll see that these are our attributes,

there are these little stars indicating properties.

The properties that are assigned when we did the standardization

is this custom property table deck.

And that's going to pass information  to the system

about what the reporting precision is when it generates tables.

Also, our default transformation for HCP was logged,

so it automatically created the log transform for us.

So we don't have to do that.
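
Here's a hedged JSL sketch of that property mechanism; the property name Table Dec and the column names are assumptions for illustration, not the add-in's actual internals.

```
// Assign a hypothetical custom property for reporting precision and a
// default log transform ("Table Dec" is an assumed name).
dt = Current Data Table();
Column( dt, "HCP" ) << Set Property( "Table Dec", 2 );

// Downstream table builders can read the property back.
dec = Column( dt, "HCP" ) << Get Property( "Table Dec" );

// Create the default log-transform column automatically.
dt << New Column( "Log HCP", Numeric, Continuous, Formula( Log( :HCP ) ) );
```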

Okay.

That's the standardization, let's move on to much more interesting things now.

The PC analysis.

Before I get to that, I just want to mention that

we have a module for scaled-down model qualification.

And essentially, it's using JMP's built-in equivalence testing.

But it enhances it by generating some custom plots

and summarizing all that information in a table that's report ready.

It's beautiful.

Unfortunately, we don't have time to cover that.
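
We can't demo the module, but the underlying equivalence test is scriptable; here's a minimal sketch, assuming hypothetical column names and a hypothetical practical-difference bound of 0.5.

```
// Equivalence test for scale comparison in Oneway
// (columns and the 0.5 practical-difference bound are hypothetical).
dt = Current Data Table();
dt << Oneway(
	Y( :Titer ),
	X( :Process Scale ),
	Equivalence Test( 0.5 )  // two one-sided tests against a +/-0.5 difference
);
```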

I'm going to go now into the PC analysis, which I'm really excited about here.

I have this.

Standardization has already been done.

We have this file that contains lab results,

an experimental design that's been executed at lab scale.

We have large-scale data in here as well. We can't execute...

It's not feasible to execute DOEs at large scale,

but we have this data at the control point.

We want to be able to project our models to that large scale.

And because we have different subsets and we have potentially different models,

this one only has a single model,

but we can have different models and different setups,

we decided to create a BPC-Stat workflow and we have a workflow setup tool

that helps us build that based on the particular model we're working with.

I can name each of these workflows and I provide this information

that it's going to track throughout the whole analysis.

What is our large-scale setting, what are our responses?

Notice this is already populated for me.

It looked at the file and said, oh, I know these responses,

they're in your standardized file, they're in this file, they exist.

I assume you want to analyze these, and they get pre-populated.

It also recognized this as a transform

because it knows that for HCP, we want that on a log transform.

And it's going to do it as an internal transform,

which means JMP will automatically back-transform it

so that scientists can interpret it on the original scale.

There are some additional options here. This PC block right now is fixed.

In some cases, the scientists want to look at the actual PC Block means.

But for the simulation, we're interested in a population-type simulation.

We don't want to look at specific blocks, we want to see what the variability is.

So we're going to change that PC Block factor

into a random effect when we get to the simulation.

And we're going to add a process scale to our models

so we can extend our model to a large scale.
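
To make that concrete, a minimal Fit Model sketch with the internal transform, the block as a random effect, and scale as a fixed effect might look like this; all column and factor names are hypothetical stand-ins, not our actual model.

```
// DOE model with the block as a random effect and scale as a fixed effect
// (column names hypothetical).
Fit Model(
	Y( Transform Column( "Log[HCP]", Formula( Log( :HCP ) ) ) ),
	Effects( :Temp, :pH, :Temp * :pH, :Process Scale ),
	Random Effects( :PC Block ),
	Personality( "Standard Least Squares" ),
	Method( "REML" ),
	Run
);
```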

The system will review the different process parameters and check the coding.

If there's some issues here or a coding missing,

it will automatically flag that with the stoplight coloring.

We have here the set point. Very tedious exercise, annoying.

We constantly want to show everything at the set point in the profilers

because that's our control,

not the default, where JMP calculates the mathematical center.

So we built this information in so that it could be automatically added.

And then we can define the subsets for an analysis.

And for that, we use a data filter.

I'll show that here for this data filter,

and there's an explanation of this in the dialog.

But we want to do summary statistics on the small scale.

So I go ahead and select that.

It gives feedback on how many rows are in that subset

and what the subset is so we can double- check that that's correct.
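
A minimal sketch of that subsetting in plain JSL, with hypothetical column names: Get Rows Where gives the row count for the double-check, and a data filter restricts downstream analyses.

```
// Feedback on a would-be subset, then a filter to enforce it
// (column name and level hypothetical).
dt = Current Data Table();
rows = dt << Get Rows Where( :Process Scale == "Lab" );
Show( N Rows( rows ) );    // how many rows fall in the subset

dt << Data Filter(
	Add Filter( Columns( :Process Scale ), Where( :Process Scale == "Lab" ) )
);
```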

And then for the PC analysis, in this case,

I have the model set up so that, of course,

it's going to analyze the DOE with the center point,

but it's also going to do this single-factor study,

or what they call the OFAT and SDM block,

separate center points that were done as another block.

And that's all built in as another block in the study for that PC block.

Lastly, I can specify subsets for confirmation points,

which they like to call verification points,

to check and to see how well the model is predicting.

We don't have those in this case.

And for what is our subset for large scale,

that would include both the lab and the large scale data.

Since it's all the data in this case, I don't have to specify any subset.

Now, I have defined my workflow.

I click okay, and it saves all that information right here as a script.

If I right-click on that and edit it, you can see what it's doing.

It's creating a new namespace.

It's got the model in there, it's got all my responses,

and everything I could need for this analysis.
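
The saved script itself isn't reproduced here, but the general pattern, stashing the settings in a namespace and saving a script back to the table, looks roughly like this; every name and value is hypothetical.

```
// Rough shape of a saved workflow script (all names hypothetical).
dt = Current Data Table();
dt << New Script( "PC Workflow: Purification Step",
	wf = New Namespace( "pcWorkflow" );
	wf:responses  = {"Yield", "HCP"};
	wf:transforms = ["HCP" => "Log"];              // response => default transform
	wf:setPoints  = ["Temp" => 36.5, "pH" => 7.1]; // process set points
	wf:largeScale = "Large";                       // level used to project to large scale
);
```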

As soon as you see this, you start thinking, well,

if I have to add another response, I can stick another response in here.

But that violates the principle of no script editing.

Well, sometimes we do it, but don't tell anybody.

What we did is we built a tool that has a workflow editor

that allows us to go back into that workflow

through point and click, change some of its properties, and change the workflow.

I'm going to go ahead now and do the analysis.

And this is where the magic really comes in.

When I do the PC analysis set up,

it's going to go take that workflow information and apply it

across the entire set of scripts that we need for our analysis.

And you see what it just did there.

It dropped a whole bunch of scripts. It grouped them for us.

Everything is ready to go.

It's a step-by-step process, the scientists can follow it through.

If there are scripts that are not applicable,

they can remove those scripts and they're just gone.

We don't worry about them.

And then for the scripts that are present, we have additional customization.

These are essentially generator scripts.

And you can see it generates a dialog

that's already pre-populated with what we should need,

but we have additional flexibility if we need it.

And then we can get our report and we can enhance it as we need to,

in this case, subsets I may want to include.

And then resave the script back to the table and replace the generator script.

Now, I have a rendered script here that I can use that's portable.

Then for the PC analysis, we have data plots.

Of course, we want to show our data.

Always look at your data, generate the plots.

There's a default plot that's built.

And now, for the user, we only did one plot

because we wanted the user to have the option to change things.

So they might go in here and, say, get rid of that title.

I just change the size and add a legend, whatever.

They can change the entire plot if they want to.

And then one of their all-time favorite features of BPC-Stat

seems to be this repeat analysis.

Once we have an analysis we like, we can repeat it.

And what this is doing is it's hacking the column switcher

and adding some extra features onto it.

It'll take the output, dump it in a vertical list box or tab box,

and allow us to apply our filtering either globally or to each analysis.

Now, I'm in the column switcher

and I can tell it what columns I want it to do.

This works for any analysis, not just plotting.

Click OK.

It runs through the switcher, generates the report.

There I have it. All my responses are plotted.

That was easy.

I go down and there's the script that recreates that.

I can drop it here, get rid of the previous one.

Done.
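
A minimal sketch of scripting a column switcher onto one finished plot; the extra BPC-Stat features (vertical list box, per-analysis filtering) aren't shown, and the column names are hypothetical.

```
// One plot, built the way we like it (column names hypothetical)...
dt = Current Data Table();
gb = dt << Graph Builder(
	Variables( X( :Temp ), Y( :Yield ) ),
	Elements( Points( X, Y ) )
);

// ...then repeated across responses with a column switcher.
gb << Column Switcher( :Yield, {:Yield, :HCP, :Titer} );
```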

Descriptive statistics. Here we go.

It's already done.

I have the subsetting applied, so I have the tables I need.

Look at this.

It's already formatted to the number of decimals I needed

because it's taking advantage of those properties that we had assigned,

those unique properties based on this table standardization.

So that one is done.
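
The same idea in plain JSL, with hypothetical columns: summarize by subset, then apply the precision that the standardization property dictates.

```
// Descriptive statistics by subset, then reporting precision applied
// (column names hypothetical).
dt = Current Data Table();
sumDt = dt << Summary(
	Group( :Process Scale ),
	Mean( :Yield ),
	Std Dev( :Yield )
);
Column( sumDt, "Mean(Yield)" ) << Format( "Fixed Dec", 10, 2 );
```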

And then the full model.

Full model is set up, ready to go for, what would you think?

It's ready to go for residual assessment.

We can go through each of the models one at a time

and take a look to see how the points are behaving, the lack of fit.

Does it look okay?

Here, we have one point that may be a little bit off that we might want to explore.

Auto recalc is already turned on,

so I can do a row exclude and it will automatically update this.

Or we have a tool that will exclude the data point in a new column

so that I can analyze it side by side.
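
That exclude-in-a-new-column trick is easy to sketch in JSL; the run number and column names here are hypothetical.

```
// Copy the response with one suspect run blanked out so the two fits can be
// compared side by side (run number and column names hypothetical).
dt = Current Data Table();
dt << New Column( "Yield w/o Run 21",
	Numeric, Continuous,
	Formula( If( :Run == 21, Empty(), :Yield ) )
);
```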

And then since I've already specified my responses,

in order to include that side by side,

I would have to go back and modify my workflow.

And we have that workflow editor to do that.

I'm just going to skip ahead to save some time where I've already done that.

This is the same file, same analysis,

but I've added an additional response and it's right here.

Yield without run 21.

Now, scientists can look at this side by side and say,

you know what, that point, yeah, it's a little bit unusual statistically,

but practically there's really no impact.

All right, let's take this further.

This is our routine process.

We do take it all the way through the reduced model

because we want to see if it impacts model selection.

We have automated the model selection,

and it takes advantage of the existing stepwise for forward AIC,

or the existing effects table, where you can click to remove terms

by backward selection manually if you want; this automates the backward selection,

which we typically use for split-plot designs.

We also have a forward selection for mixed models, which is not currently

a JMP feature and which we find highly useful.
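
A minimal forward-selection sketch through the Stepwise personality; the effects are hypothetical, the exact stepwise messages may vary by JMP version, and this is the shape of the idea rather than the add-in's code.

```
// Shape of a scripted forward selection (effects hypothetical; confirm
// message names in your JMP version's Scripting Index).
sw = Fit Model(
	Y( :Yield ),
	Effects( :Temp, :pH, :Feed, :Temp * :pH ),
	Personality( Stepwise ),
	Run
);
sw << Direction( "Forward" );
sw << Go;          // run the selection to completion
sw << Run Model;   // fit the selected model in Fit Least Squares
```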

I'm going to go ahead, since it's a fixed-effects model,

I'm going to do that, and it gets the workflow information.

I know I need to do this on the full model.

It goes ahead and does the selection.

What it's doing in the background here is it's running each model,

it's constructing a report of the selection that it's done

in case we want to report that.

And it's going to save those scripts back to the data table.

There's that report right there that contains all the selection process.

Those scripts were just generated and dumped back here.

Now, I can move those scripts back into my workflow section.

I know the reduced model goes in there

and this is my model selection step history.

I can put that information in there.

Okay, so this is great.

Now, when I look at my reduced model, it has gone through the selection.

Now I can see the impact of removing this extra point on the selection.

And again, we see that, basically,

the scientists would likely conclude there's just no practical difference here.

And they could even go, and should go,

and look at the interaction profilers as well and compare them side by side.

This is great.

We want to keep this script because we want to keep track

of the decisions that we made, so there's a record of that.

But we also want to report the final model.

So we want a nice, clean report.

We don't want that "without run 21" response in there

because we've decided that it's not relevant,

and we need to keep all the data.

Another favorite tool that we developed is the Split Fit Group,

which allows us to break up these fit groups.

We have the reduced model here. Take the reduced model.

It allows us to break them up into as many groups as we want.

In this case, we're only going to group it into one group

because we're going to eliminate one response.

We want one group.

When we're done, we're just using this to eliminate this response

we no longer want in there.

Click Okay.

That's some feedback from the model fitting, and boom, we have it.

The fit group is now there, and the "without run 21" response analysis has been removed.

Now we have this ready to report.

Notice that the settings for the profiler, they're all the settings we specified.

It's not at the average point or the center point,

it's at the process set point,

which is where we need it to be for comparison purposes.

It's all ready to go.
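
Saved profiler scripts carry the factor settings, which is how a set point, rather than the data midpoint, gets pinned; here's a rough sketch with hypothetical factors and values.

```
// Rough shape of a saved profiler script pinned at the process set point
// (factor names and set-point values hypothetical).
Profiler(
	Y( :Pred Formula Yield ),
	Term Value(
		Temp( 36.5, Lock( 0 ), Show( 1 ) ),  // set point, not the data midpoint
		pH( 7.1, Lock( 0 ), Show( 1 ) )
	)
);
```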

Okay, so that generated that script there when I did the split.

I can put it up here and I can just rename that.

That's final models.

Okay, very good.

Now, for some real fun.

Remember we had talked about how tedious it is

to set up this simulation script.

Now watch this. Watch how easy this is for the scientists.

And before I do this, I want to point out that this was,

of course, created by a script, obviously JSL,

but this is a script that creates a script that creates a script.

So this was quite a challenge for Adsurgo to develop.

But when I run this, I can pick my model here,

final models, and then in just a matter of seconds,

it generates the simulation script that I need.

I run that, and boom. There it is, all done.

It set up the ranges that I need for the process parameters.

It set them up to the correct intervals. It set the Add Random Noise.

But there's even more going on here than what appears.

Notice that the process scale has been added,

we didn't have that in the model before.

That was something that was added so that we could take these lab-scale

DOE models and extend them to the large scale.

Now we're predicting at large scale. That's important.

That was a modification to the model.

Previously, very tedious editing of the script was required to do that.

Notice that we also have this PC block random effect in here that we had specified;

because we don't want to simulate specific blocks,

it's now an additional random effect.

And the total variance is being plugged into the standard deviation

for the Add Random Noise, not the default residual random noise.

We also added this little set seed here so we can reproduce our analysis exactly.

So this is really great.

And again, notice that we're at the process set point where it should be.
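
The seed and the noise plumbing are simple to sketch in JSL; the variance numbers here are hypothetical, and the point is that the simulator's noise uses the total variance, not the residual alone.

```
// Reproducibility: fix JMP's random number stream before simulating
// (the seed value is arbitrary).
Random Reset( 20230615 );

// Plug the total variance (block + residual) into the simulator's
// Add Random Noise standard deviation (variance values hypothetical).
blockVar = 0.8;
residVar = 1.5;
noiseSD = Sqrt( blockVar + residVar );
Show( noiseSD );
```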

Okay, last thing I want to show here is the reporting.

We essentially completed the entire analysis,

you can see it's very fast.

We want to report these final models out into a statistics report.

And so we have a tool to do that.

And this report starts with the descriptive statistics.

I'm going to run that first,

and then we're going to go and build the report, export stats to Word.

And then I have to tell it which models I want to export.

It's asking about verification plots.

We didn't have any in this case for confirmation points.

So we're going to skip that.

And then it defaults to the default directory that we set for the output.

I'm going to open the report when I'm done.

And this is important.

We're leaving the journal open for saving and modification.

Because as everybody knows, when you copy stuff,

you generate your profilers,

you dump them in Word, and there's some clipping going on.

We may have to resize things,

we may have to put something on a log scale.

We can do all that in the journal and then just resave it back to Word.

That saves a step. So we generate that.
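
A minimal sketch of that journal-to-Word step, assuming a JMP version that supports the Save MSWord message; the path is hypothetical.

```
// Render the open journal to Word so profilers can be resized or put on a
// log scale in the journal and then resaved (path hypothetical).
jrn = Current Journal();
jrn << Save MSWord( "$DESKTOP/PC_statistics_report.docx" );
```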

I click okay here.

It's reading the different tables and the different profilers,

and it's generating this journal up here.

That's actually the report that it's going to render in Word.

And it will be done in just a second here.

Okay.

And then just opening up Word.

And boom, there's our report.

So look at what it did.

It put captions here, and it put our table in.

It's already formatted to the reporting precision that we need.

It has this footnote that it added, meeting our standards.

And then for each response, it has a section.

And then the section has the tables with their captions, and the profilers,

and interaction profiler, footnotes, et cetera.

And it repeats like that for each attribute.

It also generates some initial text

that the scientists can update with some summary statistics.

And so it's pretty much ready to go and highly standardized.

That completes the demo of the system.

Now, I just have one concluding slide that I want to go back to here.

So, in conclusion, BPC-Stat, it's added value to our business.

It's enabled our process teams. It's parallelized the work.

It's accelerated our timelines.

We've implemented a standardized yet flexible, systematic approach

with higher, faster acceleration and much more engagement.

Thank you very much.


