Hello. My name is Laura Lancaster. I am a statistical developer in the JMP group. Today, I'm here to talk to you about disallowed combinations and operating region optimization for critical quality attributes with the JMP 17 profiler.
Everything we're going to talk about today has to do with the Prediction Profiler, and I hope that everyone is familiar with it. It's a wonderful tool. But if you're not, the Prediction Profiler is a tool in JMP that's great for interactively exploring, visualizing, and optimizing the models that you create in JMP. Specifically, we're going to talk about two recent new features that were added to the Prediction Profiler. The first is the ability to explore and optimize the models that you've created in DOE in JMP that have known disallowed combination constraints. The second is the ability to determine an optimal operating region for your manufacturing processes that ensures both quality and maximum production flexibility.
Let's go ahead and get started talking about exploring and optimizing models from designed experiments with disallowed combination constraints. It often happens that when you're designing experiments, it's not possible or it's not desirable for various reasons to experiment over the usual entire rectangular design region. When that happens, you need to be able to apply constraints to your design region before you create the design and certainly before you run the design. Thankfully, ever since JMP 6, which has been a long time, the custom design platform has been able to create designed experiments with constrained design regions. Since then, constraint support has also been added to fast flexible filling designs and covering array designs.
Now, what types of constraints are available in JMP's DOE platforms? The first type of constraint is the simpler of the two. It's linear constraints on continuous and mixture factors. Here's a picture where we have two linear inequality constraints that are shown as the gray shaded region. You can see that the design stays out of the disallowed, linearly constrained region.
The next type of constraint is called a disallowed combination constraint. It's a more general, and potentially more complicated, type of constraint. It can consist of continuous, discrete numeric, and/or categorical factors. It's a constraint written as a JSL Boolean expression that evaluates to true for factor combinations that are not in your design region. Here's an example.
We have a two-factor design where X1 L1 and X2 L3 cannot be in the design. They're disallowed and they're written as a JSL Boolean expression, which you can see right here. Notice that this design is created and stays out of this disallowed region. Now, originally, all of these disallowed combination constraints had to be entered as JSL like this. But then in JMP 12, a disallowed combination filter was added that made it easier to create these JSL expressions if you have fairly easy disallowed combinations, such as individual factor ranges combined with and/or expressions. We'll look at an example of this shortly.
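As a rough illustration of what "a Boolean expression that evaluates to true for disallowed combinations" means, here is a minimal Python sketch (not JSL, and not how JMP builds designs). The three levels per factor and the function name `is_disallowed` are assumptions made for the example; only the disallowed pair X1 = L1 with X2 = L3 comes from the slide.

```python
from itertools import product

# Hypothetical level sets; the talk only names the disallowed pair.
X1_LEVELS = ["L1", "L2", "L3"]
X2_LEVELS = ["L1", "L2", "L3"]


def is_disallowed(x1: str, x2: str) -> bool:
    """Return True for factor combinations that are NOT in the design region."""
    return x1 == "L1" and x2 == "L3"


# Candidate runs are every combination of levels; keep only those for
# which the disallowed-combination expression evaluates to False.
allowed_runs = [
    (x1, x2)
    for x1, x2 in product(X1_LEVELS, X2_LEVELS)
    if not is_disallowed(x1, x2)
]

print(len(allowed_runs), "of", len(X1_LEVELS) * len(X2_LEVELS), "combinations are allowed")
```

The point of the sketch is the direction of the test: the expression describes what is excluded, so a design generator or optimizer keeps a point only when the expression is false.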
Now, what about the Prediction Profiler with constrained regions? Why is it important for the Prediction Profiler to be able to obey constraints when you have models with constraints? Well, if the profiler ignores constraints, then it's possible that the user could navigate to predictions that are not feasible and they may not realize it. So you could end up in an area that's not possible, not desirable, and you certainly haven't tested there. It's an extrapolation, so this is bad. Then probably even worse, if you want to optimize your model, you could end up with an infeasible optimal solution. If that happens, the user would have to either try to manually find a feasible optimal solution, which could be really hard or even impossible, or they would have to use another tool.
What were the challenges with getting the Prediction Profiler to obey constraints? Why did it take so much longer to get these constraints in the profiler versus DOE? Well, the main reason had to do with the constrained optimization. The desirability function is a nonlinear function. That means that our optimization has a nonlinear objective function and possibly both continuous and categorical factor variables that could be involved in constraints. This is known as a mixed integer nonlinear programming problem, and it's an extremely difficult type of optimization problem unless you know something favorable about your objective function or your constrained region. It's just very, very hard.
But good news, the Prediction Profiler now works with all the same constraints as the DOE platforms. It turns out that the Prediction Profiler has actually obeyed linear constraints on continuous variables all the way back to JMP 8, just a couple of releases after they were added in JMP 6. We were able to do this sooner because these linear constraints on continuous variables have really nice properties. Because of that, we were able to implement a variant of the Wolfe reduced-gradient algorithm. That algorithm does a really, really good job of finding the global optimum, especially if you don't have categorical variables. In that case, you should find the global optimum.
Since JMP 16, the Prediction Profiler also obeys disallowed combination constraints on both continuous and categorical variables. Now, this was a lot harder because these constraints are very general. You could put absolutely anything inside that JSL Boolean expression. So we cannot assume anything favorable about our constrained region in these cases. Thus, we had to implement a genetic heuristic algorithm, which is a very general type of algorithm for the constrained optimization. Because of this, we can't guarantee a global optimum solution. But you should find a solution that's very close to the global optimum, if not the global optimum.
Let's go ahead and start looking at some examples. First, we're going to look at a chemical reaction experiment. This experiment has one response, and the goal is to maximize yield. We have three factors. Two of them are continuous, time and temperature, and catalyst is categorical.
We have two constraints. When catalyst B is used, temperature must be above 400. When catalyst C is used, temperature must be below 650. We used Custom DOE to create a response surface design with disallowed combinations. Because these are fairly simple constraints, we were able to use the disallowed combinations filter.
You can see here that when I set catalyst to B, temperature cannot be below 400. This is my first disallowed region. Also, if catalyst is C, the temperature cannot be above 650. Then once we created the design, you can see that the design points stay out of the constrained regions that are gray here. These are the disallowed combination regions. Then we ran the experiment and we used Fit Least Squares to fit a response surface model to the data.
Now, I want to show you how you would use the Prediction Profiler to explore the model and find the maximum yield. I'm going to get out of PowerPoint and go to JMP really quickly. Here is the data table from the chemical reaction experiment that we created with JMP's custom design platform. We've already run it and entered all the results. The important thing I want to point out is that Custom DOE added this data table script called Disallowed Combinations. When you open it up, you see it's got the JSL Boolean expression of my disallowed combinations. This is what the Prediction Profiler reads in, and that's how it knows about my disallowed combination constraints.
I've already saved my response surface model, and I'm going to run it and go down to the profiler. Because I have that disallowed combinations constraint saved in the table, the profiler is able to read it in and obey the constraints. If I set catalyst to B, notice that I cannot get to a temperature of 400 or below. If I set catalyst to C, I cannot get to a temperature of 650 or above because those are disallowed regions. Also, when I maximize yield, I end up with a solution that is feasible. It's not in a disallowed region.
Now, what would have happened in a version of JMP prior to JMP 16? Well, we can see what would have happened by looking at the exact same data table without the disallowed combinations script. I'm going to run the same exact model and go to the profiler. Now this time, the profiler doesn't know about my constraints. So when I set catalyst to B, I can go down into a disallowed region, down to 350. With catalyst C, I can wander up into another disallowed region, temperatures above 650. When I do the optimization, I do end up with an infeasible solution. I'm in the disallowed region where catalyst is C and temperature is 750.
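The difference between the two runs can be mimicked with a small Python sketch. Everything numeric here is an assumption made for illustration: the yield model is invented, time is left out, the temperature grid of 350 to 750 is a guess based on the values mentioned in the demo, and a plain grid search stands in for JMP's genetic heuristic. Only the two catalyst and temperature constraints come from the talk.

```python
def is_disallowed(catalyst: str, temperature: float) -> bool:
    """True for combinations outside the design region (constraints from the talk)."""
    return (catalyst == "B" and temperature < 400) or (
        catalyst == "C" and temperature > 650
    )


def predicted_yield(catalyst: str, temperature: float) -> float:
    """Made-up stand-in for the fitted response surface model."""
    base = {"A": 60.0, "B": 62.0, "C": 65.0}[catalyst]
    return base + 0.02 * (temperature - 350)


# Hypothetical candidate grid over catalyst and temperature.
grid = [(c, t) for c in ("A", "B", "C") for t in range(350, 751, 10)]

# Ignoring the constraints: the best point may be infeasible.
unconstrained_best = max(grid, key=lambda p: predicted_yield(*p))

# Obeying the constraints: search only over feasible points.
feasible = [p for p in grid if not is_disallowed(*p)]
constrained_best = max(feasible, key=lambda p: predicted_yield(*p))

print("ignoring constraints:", unconstrained_best, is_disallowed(*unconstrained_best))
print("obeying constraints: ", constrained_best, is_disallowed(*constrained_best))
```

With this toy model the unconstrained optimum lands at catalyst C and 750, inside the disallowed region, while the constrained optimum sits on the constraint boundary.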
I would be forced to try to manually find a feasible solution that's not in a disallowed region. But thankfully, that's been solved since JMP 16. Let's go clean up and go to another example.
Okay, the next example we're going to look at is a tablet production experiment. The goal of this experiment is to maximize dissolution. We have five factors. Four are continuous and one is categorical. We have two constraints. The first constraint is that when screen size is 3, mill time has to be below 16. The second is that my spray rate and coating viscosity follow this nonlinear constraint.
Next, I used Custom DOE to create a response surface design with disallowed combinations using these two constraints. Because this is a complicated constraint, we could not use the disallowed combinations filter, so we had to enter it as a script, which is not hard to do. Here's where I've entered that nonlinear constraint as a script. Notice I've flipped the inequality to show what's disallowed instead of what should be allowed. Then I've also added screen size equals 3 and mill time greater than 16 as the other disallowed region.
Now we can look at two different slices of my design. This first graph is spray rate versus coating viscosity. I can see that all the design points stay out of the disallowed region set by this nonlinear constraint. When I look at screen size versus mill time, when screen size is 3, mill time cannot be above 16. Then we ran the experiment, and we used Fit Least Squares to fit a response surface model to the data. Now we're going to use the Prediction Profiler to explore the model and find the maximum dissolution.
I'm going to go back to JMP. This is the tablet production experiment that was produced by JMP's Custom DOE platform. Notice that once again, it has saved the Disallowed Combinations data table script to the table. I'm going to look at that. You see that it's the JSL Boolean expression of my disallowed combinations, and this is what the profiler will read in. I've saved the response surface model to the table. When we go to the profiler to explore the model, you can see that it obeys my disallowed combination constraints. When screen size is 3, mill time cannot be above 16.
Also, spray rate and coating viscosity obey that nonlinear inequality constraint. When I maximize dissolution, I end up with an optimal solution that's feasible, and notice that it's actually on the constraint boundary. That tells me that if the profiler had not been recognizing the constraints, I almost certainly would have ended up with an optimal solution that wasn't feasible, and I would have had to try to manually find a feasible one, which would have been very difficult, if not impossible.
All right. Let's move on to the next topic. Here we go. Back to PowerPoint. Okay. Our next topic is operating region optimization for critical quality attributes. This is where I'm going to introduce the Design Space Profiler that's new to JMP 17.
What do we mean by design space when we're talking about the Design Space Profiler? Well, this is an important concept that's used in pharmaceutical development that identifies the optimal operating region that gives maximal flexibility of your production while still assuring quality. This concept was introduced by the FDA and the International Conference on Harmonization when those agencies decided to adopt Quality by Design principles for development, manufacturing, and regulation of drugs. When they did that, they put out some really important guideline documents, ICH Q8-Q12, that most drug companies follow.
Specifically, we want to look at ICH Q8(R2), which covers design space. It defines design space as the multidimensional combination and interaction of material attributes and process parameters that have been demonstrated to provide assurance of quality.
Now, there are a number of steps that need to be taken to determine design space for a product, and several of them need to be done before you can get to the Design Space Profiler in JMP. One of the first things that you need to do is determine what your critical quality attributes are and what the appropriate spec limits are to maintain quality. We'll refer to these critical quality attributes as CQAs. The ICH document defines a critical quality attribute as a physical, chemical, biological, or microbiological property or characteristic that should be within an appropriate limit, range, or distribution to ensure the desired product quality. This is the important first step.
Next, we want to use designed experiments to determine which critical manufacturing process parameters affect those critical quality attributes. We'll refer to these as CPPs, critical process parameters. ICH Q8 defines a critical process parameter as a process parameter whose variability has an impact on a critical quality attribute and therefore should be monitored or controlled to ensure the process produces the desired quality. Once you've determined your CQAs and your CPPs, you want to find a really good prediction model for your CQAs in terms of your critical process parameters. Once you've done all of that, you can use the Design Space Profiler to determine a good design space for your product.
Let's talk a little more specifically about the Design Space Profiler in JMP. The goal of the Design Space Profiler is to determine a good design space by trying to find the largest hyperrectangle that fits into the acceptable region that's defined by your critical quality attribute specifications applied to that prediction model that you found. Once you've found that hyperrectangle, it will give the lower and upper limits of your critical process parameters that determine a good design space.
The problem is that the acceptable region is usually nonlinear, and finding the largest hyperrectangle in a nonlinear region is a very, very difficult mathematical problem. Because of that, you might wonder how the Design Space Profiler actually determines design space. Well, instead of trying to find the largest hyperrectangle mathematically, we use a simulation approach. It generates thousands of uniformly distributed points throughout the space defined by your initial CPP limits. Then it uses that prediction model that you found to simulate responses for your CQAs. Note that because your prediction model is not without error, you should always add response error to your simulations.
Once you've got your simulated set, it calculates an in-spec portion by counting the number of points that are in spec for all your CQAs out of all the points that are within the current CPP factor limits. This is easiest to see by actually looking at an example and going to JMP and looking at the Design Space Profiler. That's what we're going to do next.
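The simulation idea just described can be sketched in a few lines of Python. This is a toy under stated assumptions, not JMP's implementation: the two parameters `x1` and `x2`, the additive prediction model, the single upper spec limit of 1.5, and the error standard deviation of 0.05 are all invented for illustration.

```python
import random

# Hypothetical toy setup: two process parameters and one quality attribute.
INITIAL_LIMITS = {"x1": (0.0, 1.0), "x2": (0.0, 1.0)}
SPEC_LIMITS = {"y": (float("-inf"), 1.5)}  # (lower spec limit, upper spec limit)
ERROR_SD = {"y": 0.05}                     # response error added to each prediction


def predict(point):
    """Made-up prediction model for the quality attribute."""
    return {"y": point["x1"] + point["x2"]}


def simulate(n_points, seed=17):
    """Uniform points over the initial limits, each flagged in or out of spec."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_points):
        point = {k: rng.uniform(lo, hi) for k, (lo, hi) in INITIAL_LIMITS.items()}
        responses = {
            name: value + rng.gauss(0.0, ERROR_SD[name])
            for name, value in predict(point).items()
        }
        in_spec = all(
            SPEC_LIMITS[name][0] <= value <= SPEC_LIMITS[name][1]
            for name, value in responses.items()
        )
        rows.append((point, in_spec))
    return rows


def portions(rows, limits):
    """In-spec portion among points inside `limits`, plus the share of points kept."""
    inside = [
        in_spec
        for point, in_spec in rows
        if all(limits[k][0] <= point[k] <= limits[k][1] for k in limits)
    ]
    volume_portion = len(inside) / len(rows)
    in_spec_portion = sum(inside) / len(inside) if inside else 0.0
    return in_spec_portion, volume_portion


rows = simulate(5000)
full_in_spec, full_volume = portions(rows, INITIAL_LIMITS)
narrow_in_spec, narrow_volume = portions(rows, {"x1": (0.0, 0.5), "x2": (0.0, 1.0)})
print("full limits:  ", full_in_spec, full_volume)
print("narrow limits:", narrow_in_spec, narrow_volume)
```

Note that the points are simulated once over the initial limits. Tightening the limits only changes which of those points are counted, which is why the volume portion drops while the in-spec portion rises.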
We're going to look at an example of a pain cream study. The goal of this study was to repurpose a habit-forming oral opioid drug into a cream that provides the same relief as the oral drug. The first thing that we needed to do was determine our critical quality attributes for this drug. We determined that there were three of them: entrapment efficiency, vesicle size, and in-vitro release. We also needed to determine what the spec limits are that assure quality. That's what these numbers are.
Next, we ran experiments to determine which of our manufacturing process factors affect these critical quality attributes. It turns out there were three of them. They are emulsifier, lipid, and lecithin, and these are the initial factor limits for these CPPs.
Next, we used custom design and Fit Least Squares to find response surface models for our three critical quality attributes in terms of our three critical process parameters. Once we did all of that, we were able to go to the Design Space Profiler in JMP to determine a design space for this pain cream. Let's go back to JMP.
I'm going to open up my pain cream study. This was my response surface design created in JMP's DOE platform. I've got my design in terms of my three critical process parameters here, and these are my three critical quality attribute responses here. The important thing I want to point out is that for each of these critical quality attribute responses, I've saved spec limits as column properties. That is because the Design Space Profiler has to know what the spec limits are for your critical quality attributes. If you don't enter them as column properties, you'll be prompted to enter them once you launch the Design Space Profiler.
I've already saved my response surface models as a script. I'm going to run that script. It launches Fit Least Squares, and I have it set up to automatically show the Prediction Profiler. This is the same Prediction Profiler that you're probably used to seeing. I have my three responses, my critical quality attributes here, my three critical process parameters, my factors here, and I can explore the model as usual. But now I want to try to figure out a design space for my manufacturing process.
Now I can easily do that by going to the Prediction Profiler's little red triangle menu. Several items down, I see there's a new option for Design Space Profiler. If I select that, the Design Space Profiler appears right below the Prediction Profiler.
As I noted, if I hadn't already had spec limits attached to my responses, it would prompt me for them. But now I can see that it's brought them in from my column properties. You can see them right down here. It's also brought in an error standard deviation. These values are coming from the root mean squared error of my least squares models. You can see that the RMSE here is the same value for in-vitro release as the error standard deviation here. Of course, you can change these, you can even delete them. But we highly recommend that you have some error for your predictions since your predictive models are not perfect, not without error.
Okay. The first thing you might notice about this profiler is that it looks a little different in that each factor cell has two curves instead of the usual one curve. That's because we're trying to find factor limits. We're trying to find an interval, the operating region, the design space. The blue curve (we have a legend to help us) represents the in-spec portion as the lower limit changes, and the red curve represents the in-spec portion as the upper limit changes.
You can see how if I were to change the upper limit of emulsifier, it would increase my in-spec portion. That would be a good thing. That's how that works. Also, you don't see the in-spec portion value over here on the left like you usually do, but it's right over here to the right of the cells. Initially, 79.21% of my points are in spec, and that's in spec for all of the responses, all of the CQAs. If you want to see the individual in-spec portions, you can find them down here next to the specific response.
Also, notice that this volume portion is telling me that I am currently using all of my simulated data, and that's because the factor limits are set at their full range initially. To change the factor limits, to change the operating region, you can either move the markers as usual, or you can enter different factor limit values here in this table or right here below the cells, or you can use these buttons. I really like these buttons. If I click on Move Inward, it's going to find the move that gives me the biggest increase in in-spec portion. It's going to find the steepest upward path. Move Outward would do the opposite. It would find the steepest path downward.
If I click Move Inward, notice that my emulsifier lower limit has increased from 700 to 705, and my in-spec portion has increased to 81.95. If I click it again, now my lecithin lower limit has increased from 30 to 31, and my in-spec portion has gone up to 84.5. I can keep doing this.
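One way to picture a single Move Inward click is as a greedy step: try tightening each lower and each upper limit by a small amount, and keep whichever single change gives the highest in-spec portion. The Python sketch below illustrates that idea on invented data. It is not JMP's algorithm, and the step sizes, the toy model, and the names `move_inward` and `make_points` are assumptions made for the example.

```python
import random


def make_points(n, limits, seed=17):
    """Uniform points with a made-up response; True means the point is in spec."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = {k: rng.uniform(lo, hi) for k, (lo, hi) in limits.items()}
        y = x["x1"] + x["x2"] + rng.gauss(0.0, 0.05)  # prediction plus response error
        rows.append((x, y <= 1.5))                    # hypothetical upper spec limit
    return rows


def in_spec_portion(rows, limits):
    """Share of in-spec points among those inside the current factor limits."""
    kept = [ok for x, ok in rows if all(lo <= x[k] <= hi for k, (lo, hi) in limits.items())]
    return sum(kept) / len(kept) if kept else 0.0


def move_inward(rows, limits, step):
    """Try tightening each limit by one step; return the best single move."""
    best_limits, best_portion = None, -1.0
    for name, (lo, hi) in limits.items():
        for candidate in ((lo + step[name], hi), (lo, hi - step[name])):
            if candidate[0] >= candidate[1]:
                continue  # skip moves that would collapse the interval
            trial = dict(limits)
            trial[name] = candidate
            portion = in_spec_portion(rows, trial)
            if portion > best_portion:
                best_limits, best_portion = trial, portion
    return best_limits, best_portion


initial = {"x1": (0.0, 1.0), "x2": (0.0, 1.0)}
step = {"x1": 0.05, "x2": 0.05}  # assumed step size per factor
rows = make_points(5000, initial)

before = in_spec_portion(rows, initial)
new_limits, after = move_inward(rows, initial, step)
print(before, "->", after, new_limits)
```

Repeating the call with `new_limits` as the starting point mirrors clicking the button again: the region shrinks one limit at a time along the direction that improves the in-spec portion most.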
I could keep doing this until I find an in-spec portion that I like and I'm happy with the factor limits, meaning I think it's a reasonable operating region. But before I do, there are several options in the Design Space Profiler menu that I like to look at. The first one is Make and Connect Random Table. What this does is create a new random table of uniformly distributed points. You always want to add random noise. It's going to use the same random errors we used before. I'm going to click OK. Now, I get this table of 10,000 new random points, and they are color-coded. The ones that are marked as green are in spec, the ones that are red are out of spec, and the ones that are selected are within my current factor limits, my current operating region.
It's useful to look at the table, but I really like to look at these graphs that are produced by some of these saved scripts. If I run Scatterplot Matrix Y, it will give me a response view of all my data. The shaded region that's green here is the spec limits. Then I also like to look at the Scatterplot Matrix X, which gives me the factor space view. It's nice if I can look at them both at the same time. While I'm altering my factor limits, if I click on Move Inward again, you can see how the points change. You also see how the factor space changes. I find it even more useful to hide all the points that are not in my current operating region. Then I don't even have to look at them.
Now, as I keep clicking on Move Inward, you can see how that operating region is shrinking. If you only want to be concerned with the out-of-spec points, you can click on Y Out of Spec, and that will only show the out-of-spec points that are occurring. Notice that my in-spec portion, as I keep moving my factor limits in, is increasing.
I'm going to keep going until I either hit 100% or my operating region looks like something that isn't feasible, that I just won't be able to attain. I'm going to keep clicking Move Inward. Things still look good. Move Inward, just going to keep clicking it. Okay, I hit 100, and I still think that these factor limits represent an operating region that I should be able to attain.
To look at that further, I can send the midpoints of these factor limits to the original profiler and see what that looks like. I think that looks pretty good. I can also send the limits to the simulator in the Prediction Profiler, and I can decide to use different distributions. I actually think that my critical process parameters follow normal distributions. I'm going to select this Normal with Limits at 3 Sigma. It turns on the simulator, it sets my distributions to normal, and it figures out the means and standard deviations that put these limits at 3 sigma.
Of course, you can change all these values as you see fit for your own situation, for your own manufacturing process. You can change the distribution, you can change the means and standard deviations. I'm just going to leave it, and I'm going to see what simulating with the normal distributions looks like. It looks really good. You can see my defect rate. When I keep hitting Simulate, it's often 0.
I also like to simulate to a table to be able to just get a view of what my capability analysis would look like, just as a sanity check. If you come down here, you can simulate to table, and it's going to use these normal distributions for the critical process parameters. It's going to use the same errors for your predictions as we used before.
I'm going to click Make Table, and when I do that, it automatically creates some scripts. One of them is Distribution. If I run that, I can very easily look at my capability reports because I saved my spec limits as column properties. I see that the capability, at least for the simulated data, looks really quite good. So I'm pretty happy with this, even though this is just on the simulated data. Of course, I need to check the real data, but I'm really happy with what I'm seeing so far. I think I'm going to use these limits as my design space.
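A natural reading of "Normal with Limits at 3 Sigma" is that the mean is the midpoint of the factor limits and the limits sit three standard deviations either side of it. The Python sketch below works through that arithmetic. The formula is an assumption about what the option computes, and the limits of 700 and 760 are invented values, not the ones from the demo.

```python
import random


def normal_from_limits(lower, upper, n_sigma=3.0):
    """Mean at the midpoint; limits placed n_sigma standard deviations from it."""
    mean = (lower + upper) / 2.0
    sd = (upper - lower) / (2.0 * n_sigma)
    return mean, sd


# Hypothetical factor limits for one process parameter.
lower, upper = 700.0, 760.0
mean, sd = normal_from_limits(lower, upper)
print("mean:", mean, "sd:", sd)

# Draw from that normal distribution and check how often a draw stays
# inside the limits (roughly 99.7% is expected for +/- 3 sigma).
rng = random.Random(17)
draws = [rng.gauss(mean, sd) for _ in range(10000)]
inside = sum(lower <= d <= upper for d in draws) / len(draws)
print("share of draws inside the limits:", inside)
```

Under this reading, almost all simulated factor settings fall inside the chosen operating region, with more weight near the center than the uniform points used by the Design Space Profiler itself.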
Now, just to note before I go further, I have a good situation here, but let's say that you didn't have a good situation where your in-spec portion wasn't where you wanted it to be, and you really can't adjust your factor limits anymore. You could do what- if scenarios by changing your spec limits or your errors if you think that is something that could reasonably happen. But I have a good situation, and I'm happy.
I am going to use this option, Save X Spec Limits, and that's going to save these factor limits back to my original data table, to my critical process parameters. When I do that and go back to my original table, you can see that those factor limit settings have been saved as spec limits on my critical process parameters.
I find it really helpful to be able to look at this design space in terms of the contour responses and the acceptable region. I've already saved my predictions as formulas and I have a script saved to run the Contour Profiler. I'm going to run that. This is going to give me my contour responses for all combinations of my factors, my critical process parameters. I don't know if you can see the faint rectangles, but that is my design space as defined by those factor limits that got saved as spec limits on my critical process parameters. The shaded colored areas are my spec limit response contours.
You can see that my design space is nicely within the acceptable region for all these contours. It's even further in. It's not touching them. That's because we added that error to our predictions. I'm really happy with this.
Okay, let's get back to PowerPoint. I just want to give you a few takeaways about the Design Space Profiler before we wrap up.
First of all, that in-spec portion that we saw in the Design Space Profiler shouldn't be taken as a probability statement unless you believe that your factors, your critical process parameter factors, actually follow a uniform distribution because that's what was used to distribute them. Also, the Design Space Profiler is not meant for models that have a large number of factors or very small factor ranges because of the simulated approach that it takes.
It's also recommended, as I've mentioned several times, to always add random error to your responses because your prediction models are not without error. And finally, I just wanted to say that even though this was motivated by the pharmaceutical industry, it really is applicable much more broadly than that. In any case where you want to find an optimal operating region and you want to maintain flexibility and quality, this can be helpful.
There were many things about the Design Space Profiler I didn't have time to show. I really hope that you will check it out. Any questions?