CMC, SVEM, Neural Networks, DOE, and Complexity: It’s All About Prediction - (20...

I want to thank the JMP steering committee and the JMP organizers

for inviting Phil and me to come and present our exciting talk

on CMC , SVEM , DOE , and Complexity : It 's All About Prediction .

I want to start by thanking Dr. Tiffany Rao; she's been involved with the planning

and numerous conversations for the work that we 're going to present today .

I'm going to give an overview, tell you about Lundbeck, the company I work for,

and then provide the background

for the DOE that we 're going to talk about ,

which is process development for a biologic drug .

Our case study shows what I've started to do for development:

start with a first step of running a DSD for mid- to late-stage development,

then follow that with a second step

of augmenting with a space-filling design.

Then we are hoping to prove to you today that, for analysis, SVEM

allows us to have better prediction for all of this work and allows us to have

better timelines for our work that we 're doing .

Lundbeck is located …

We 're headquartered in Copenhagen ,

we 're over 6 ,000 employees in over 50 countries ,

and we are striving to be the number one in brain health .

The part of the company that I work with is the CMC biologics

and we 're basically located in the Copenhagen area

and in the Seattle area where I 'm located .

Let 's talk about the background for the DOE that we 're going to present today .

The process that we want to develop for drug substance , for these biologics ,

we start with a vial of cells, we take those out of the freezer,

we then expand in shake flasks , go bigger into culture bags ,

maybe a seed bioreactor , then to a production bioreactor .

That production bioreactor goes approximately two weeks .

We have complex nutrient feeds ,

we have pH control, temperature control, there's the base that we're adding.

Once we finish that 14 -day production , we need to figure out a way

to get the cells that are secreting our molecule into the supernatant .

How do we separate the cells from the product ?

That harvest can be a centrifuge , it can be depth filtration .

Then we pass it on to our downstream colleagues .

They first usually do a capture step where they 're getting rid

of most of the host cell proteins , the host cell DNA .

But then we need to do two polishing steps where we're then saying,

"Okay , what are the product -related impurities ?

Maybe there 's not the full molecule there , so we have to get rid of those ."

Then finally, we have to make sure, through ultrafiltration and diafiltration,

that we can transfer into the buffer

that it 's going to be when it is transferred for the patient 's use

and it 's also at the right concentration .

You can imagine , every step along this way ,

there are many factors ,

there are many knobs that we can turn to control this process ,

make sure that it 's robust

and we 're making the same product every time .

When we 're focused on treating the patient ,

we also want to focus on the business .

We can 't put all of our development resources for every molecule .

We want to right -size the research that we 're doing

at the right stage of the product .

There 's many things that could kill a product ,

but if we can develop this in the right time and the right space

using these tools from JMP ,

we can shift this development timeline to the left

and we can also reduce the amount of resources

and the cost to the company .

If we 're first getting a molecule ,

that 's when you 're going to start looking at your categorical factors .

We might be doing the cell line screening .

We want to make sure that we have the right cell line

that 's going to last all the way through commercialization .

For the downstream group , they may be looking at resins

for both upstream and downstream ,

looking at medias and buffer components and the formulations of those .

That 's when you 're making sure that you have the right thing ,

that 's going to keep you going through your development pathway .

But then once you 're in the clinic ,

now you want to really start to gain understanding of the process parameters .

Our strategy is to start with a Definitive Screening Design

and we want to be bold in our level settings at this stage

and I 'll talk a little bit more about that later ,

for the late stage development .

Then we can build on what we learned from the Definitive Screening Designs

by augmenting those designs with space -filling or other designs

so that we really understand that design space .

What 's different that we 're hoping to show now

than traditional walks through this pathway

is that in the past , we 've been throwing out

the factors that we 've said aren 't important .

But with modern designs and modern ways of doing analysis ,

we can keep all of the factors and all of the work that we 've done so far

and gain better understanding of the whole process ,

especially with biologics that are quite complex .

Before I pass the baton to Phil , I just wanted to talk one more about …

Let 's see if I can …

I 'm going to minimize this screen just for a minute so I can show you this .

This is an experiment that I did to prove the power of DOE for my boss .

The full data set was an OFAT for pH, and the response was titer.

We wanted to do very many different levels

in a wide range because he wasn 't sure at the time

that we were going to be able to pick what the optimized level was .

But what I wanted to show him was that ,

"Okay , we did this experiment , we have all of this data .

We were able to model where the optimized condition was ,"

and that 's shown in blue ,

and that turned out to be the correct case .

When we tested the model , that was the optimized condition .

Let 's pretend now that we 're starting , we don 't know that data .

If we had picked a conservative range setting for our experiment ,

our noise to signal would be quite high

and so we would have missed finding the optimized spot .

But if we had picked a wider range in our settings

and still with only three points ,

the model still would have chosen the optimized spot .
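The point about range setting can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the response curve (a concave titer curve peaking at pH 7.0), the noise values, and the two ranges are invented and are not the Lundbeck data. It fits the parabola through three equally spaced runs and reports where the fitted curve is stationary.

```python
def true_titer(ph):
    # Hypothetical response: a concave curve that peaks at pH 7.0.
    return 100.0 - 40.0 * (ph - 7.0) ** 2


def three_point_fit(center, half_width, noise):
    """Fit the parabola through three equally spaced runs.

    Returns (curvature, stationary_point). Curvature is the second
    difference of the responses; a negative value means the fitted
    curve has a maximum, a positive value means it has a minimum.
    """
    xs = [center - half_width, center, center + half_width]
    y1, y2, y3 = (true_titer(x) + e for x, e in zip(xs, noise))
    curvature = y1 - 2.0 * y2 + y3
    stationary = center - half_width * (y3 - y1) / (2.0 * curvature)
    return curvature, stationary


# The same made-up measurement error is applied to both designs.
noise = (1.0, -1.5, 0.5)
narrow = three_point_fit(center=6.8, half_width=0.1, noise=noise)
wide = three_point_fit(center=6.8, half_width=0.6, noise=noise)

print("conservative range:", narrow)
print("bold range:        ", wide)
```

With the conservative range the noise swamps the real curvature, so the fitted curve bends the wrong way and has no maximum at all. With the bold range the same noise barely matters and the fitted maximum lands close to the assumed optimum of 7.0.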

What I 'm going to challenge the subject matter experts

when you 're designing your DSDs is really be bold in your range setting .

You will still find the optimized spot

and you have to have some knowledge of your process so that you can complete

the design of experiment

and have all of the runs at least have enough signal

that you can measure and then subsequently model .

Once you learn from your Definitive Screening Designs

more about your design space , you can come back

and then you can be internal to that space .

That 's when you augment with a space -filling design .

Now I 'm going to pass the baton to Phil

and he 's going to take you through the analysis .

Okay , thank you . Thank you , Patty .

We 're going to talk about a very nice

and somewhat complicated experiment that Patty and her team ran.

They do a lot of great work and they 're big advocates of DOE and JMP

and I 'm very happy they let me get to play with them sometimes .

It 's fascinating work .

But before I get into the actual analysis ,

I wanted to talk about a few relevant concepts

that members of the audience may

or may not be familiar with , and that includes complexity .

It 's a really hot topic out there .

Talk about what is actually prediction .

That is a muddled concept to many people .

Then from there , I 'll launch into talking about

how we analyze prediction and how we did with Patty 's experiment .

On complexity, a fellow named Daniele Fanelli

from the London School of Economics has written much about this,

and he calls it "the elephant in the room " that statistics and many ,

what he calls "metasciences ," are ignoring and they 're ignoring it at their peril .

I won 't get into a lot of detail .

You can look him up on the internet , he has a lot of videos and papers .

But complexity is a huge problem .

It is staring science and statistics

and data science and machine learning in the face and it needs to be dealt with .

At present , we 're not really dealing with it directly in statistics .

By the way , there are now whole applied math programs

based on studying complex systems .

My bottom line is , complexity is real .

Complexity requires new thinking .

We really have to rethink DOE and analysis .

You 're going to see that for complex systems,

and we also have to understand something else , systems theory 101 is

complex systems are defined by their interactive behavior .

In point of fact , main effects are actually even misleading .

You have to somehow be experimenting

in a manner that you can capture this interactive behavior ,

and you 're going to see current strategies fall short of that goal .

Patty 's already mentioned the CMC pathway .

Nowhere is this problem of complexity more obvious than in bioprocesses .

You have complex combinations of biology and chemistry ,

and interactions are everywhere .

When I talk to scientists in biotechnology ,

they know right up front we 're dealing with really complex interactive systems .

But first , I need to point out prediction .

If you 're working in CMC development work , it 's all about prediction .

The ICH guidelines that are used by scientists in the CMC development work

don 't specifically say prediction ,

but if you read what they say , it 's all about prediction .

Basically , you 're building processes to manufacture biologics ,

and with the new cell and gene therapies ,

these processes are becoming hopelessly complicated .

I personally rely heavily on the scientists to explain it to me ,

and they 're the people who really make all the decisions .

I 'm the helper , and I 'm very happy to be there as part of it .

But it 's all about prediction .

That is not how many scientists and even statisticians ,

have viewed CMC work .

By the way , this applies to all areas of science .

I 'm focused with Patty on the CMC development pathway ,

but prediction is important .

What is prediction ?

It 's muddled . It 's not clearly defined in disciplines .

Here 's what it really is and how I define it .

It 's a measure of how well

models that you develop interpolate over a design region .

In other words , we 're going to fit a model to what we call a training set ,

and then we need some way of knowing how that model would apply

over the whole design region .

In CMC work , especially late stage , that is very important .

You need to be able to do that, as many of you know.

You really have a training set to fit the model .

That training set in no way can evaluate prediction .

I know there 's a common belief

you can evaluate prediction on training sets .

You simply can not .

You must have a test set .
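A small sketch makes the training-set point concrete. It is only an illustration under stated assumptions: the response function, the factor settings, and the choice of an interpolating polynomial as the "model" are all hypothetical. The model reproduces every training run exactly, so its training error says nothing about how it interpolates between them.

```python
import math


def lagrange_predict(train_x, train_y, x):
    """Evaluate the polynomial that passes through every training point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(train_x, train_y)):
        weight = 1.0
        for j, xj in enumerate(train_x):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def rase(actual, predicted):
    """Root average squared error of the predictions."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def response(x):
    # Hypothetical true response over one coded factor.
    return 1.0 / (1.0 + 25.0 * x * x)


train_x = [-1.0, -0.5, 0.0, 0.5, 1.0]   # settings used to fit the model
test_x = [-0.75, -0.25, 0.25, 0.75]     # new settings, never used in fitting
train_y = [response(x) for x in train_x]
test_y = [response(x) for x in test_x]

train_rase = rase(train_y, [lagrange_predict(train_x, train_y, x) for x in train_x])
test_rase = rase(test_y, [lagrange_predict(train_x, train_y, x) for x in test_x])
print("training RASE:", train_rase)
print("test RASE:    ", test_rase)
```

The training error is zero by construction, while the error at the new settings is far from zero. Only the test set reveals that.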

Also, I'll talk a little bit about what I see in dealing with scientists,

much of it in chemistry and biologics.

Again , I do a lot of it in biotechnology ,

but also in other areas like battery technology , material science .

It is becoming very obvious .

The kinetics are complicated .

They 're constantly changing over design regions .

The kinetic behavior that you see around the boundaries

is often very different from what 's happening on the interior .

Why does this matter ?

Well , the classic approach to response surface ,

even including optimal designs , relies upon what I call boundary designs .

Almost all of your observations are around the boundaries of the design region .

In point of fact , whether people want to hear it or not ,

the central composite design ,

commonly used in response surface ,

is about the worst design you could think of for prediction .

The interior of the space is empty .

If you fit these models on the boundary ,

and then you predict what 's happening on the interior ,

it 's not prediction , it 's speculation .

You don 't know . You have no data .
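To see how little of a classic response surface design sits inside the region, here is a minimal sketch that builds a central composite design in coded units and counts the runs on the boundary. It assumes the face-centered variant with a single center point and three factors; those choices, and the function names, are illustrative and are not taken from the talk.

```python
from itertools import product


def face_centered_ccd(k):
    """Coded runs of a face-centered central composite design in k factors."""
    corners = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for level in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = level
            axial.append(point)
    center = [[0.0] * k]
    return corners + axial + center


def on_boundary(point):
    # A run is on the boundary if any factor sits at its low or high limit.
    return any(abs(v) == 1.0 for v in point)


design = face_centered_ccd(3)
n_boundary = sum(on_boundary(p) for p in design)
print(f"{n_boundary} of {len(design)} runs are on the boundary")
```

In this three-factor case every run except the center point lies on a face, edge, or corner of the region, so a model fit to it has almost no data from the interior.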

I 'm going to show you in the case study ,

you 're probably going to reach some wrong conclusions .

The boundary regions , indeed , often behave very differently ,

and we have a need to reconsider our approach to designs .

Another issue

in response surface and statistics

is this ubiquitous use of full quadratic models .

They are not sufficient to model complex response surfaces .

In fact , they 're far from it .

Unfortunately , I get a lot of pushback

from statisticians who claim it is good enough .

My answer is , "Well , if you actually use designs

that had sufficient interior points ,

you'd quickly discover they don't fit well at all."

Again , trying to measure prediction

on the interior of a design region using boundary designs is futile .

By the way, my good friend, the late John Cornell, and Doug Montgomery

published a paper on this in 1998 , and I 'll be polite , they were ignored .

It was actually somewhat nastier than ignored by the statistics community .

They showed in the paper that full quadratic models

are just not sufficient to cover a design region .

Patty mentioned SVEM , self -validating ensemble modeling .

It 's an algorithm .

I'm one of the co-developers with Dr. Chris Gotwalt of JMP,

a person I hold in very high regard .

I won 't get into the algorithm by the way ,

there are references at the end where you can go and learn more about it .

It has been talked about at Discovery conferences, actually,

going all the way back to Frankfurt in 2017 .

But SVEM is an algorithm that allows you to apply machine learning methods .

Machine learning methods are all about predictive modeling .

Believe me, people in that field know a lot more than you may think

about prediction. SVEM lets you apply those methods to data from small sets like DOEs.
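The talk does not go into the algorithm, so the following is for orientation only. It sketches one ingredient that the SVEM references usually describe: each run receives a pair of anti-correlated fractional weights, so a run that counts heavily for training counts lightly for validation. The exponential weight formula and every name below are assumptions drawn from that literature, not from this talk, and the model fitting and ensemble averaging steps are left out.

```python
import math
import random


def svem_weight_pairs(n_runs, rng):
    """Return a (training_weight, validation_weight) pair for each run.

    Assumed scheme: draw u uniformly on (0, 1) and use -ln(u) for
    training and -ln(1 - u) for validation, which makes the two
    weights move in opposite directions.
    """
    pairs = []
    for _ in range(n_runs):
        u = rng.random()
        while not 0.0 < u < 1.0:  # guard against log(0)
            u = rng.random()
        pairs.append((-math.log(u), -math.log(1.0 - u)))
    return pairs


rng = random.Random(2023)
weights = svem_weight_pairs(35, rng)  # 35 runs, as in the hybrid design
for train_w, valid_w in weights[:5]:
    print(f"train {train_w:.3f}   validation {valid_w:.3f}")
```

In a full implementation this draw would be repeated many times, with a model fit and tuned on each set of weights and the resulting predictions averaged. No run is ever held out completely, which is what makes the idea usable on small designed experiments.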

I won 't get into SVEM .

It 's a whole new way of thinking about building predictive models ,

and I think it 's in its infancy ,

but it 's already proving very powerful and useful in biotechnology .

Let 's get to the experiment .

This is actually a hybrid experiment that Patty and her team created .

There are seven factors and there are 13 responses .

But due to time constraints , I 'm only going to focus on four ,

and even that 's going to be hard to get it all in .

The data and the experiment are highly proprietary .

I do thank Lundbeck and Patty for actually allowing us to use

an anonymized version of this design .

I have a lot of case studies , some of them similar to this ,

and the people who own the data

wouldn 't even let me discuss it if I anonymized it .

That was very nice of them .

I think we have a really important story to tell here .

This is a hybrid design .

It 's comprised of a 19 -run Definitive Screening Design

around the boundaries .

Then it has 16 space-filling design points on the interior.

There are center points in both parts of the design .
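For readers who want a feel for the interior part of such a hybrid design, here is a rough sketch that places 16 points inside a seven-factor region in coded units. It uses a plain Latin hypercube, which is a swapped-in technique chosen for brevity; the talk does not say which space-filling method was used. The 0.1 margin that keeps points off the boundary is also an assumption.

```python
import random


def interior_latin_hypercube(n_runs, n_factors, margin, rng):
    """Latin hypercube in coded units, kept inside [-1 + margin, 1 - margin]."""
    span = 2.0 * (1.0 - margin)
    columns = []
    for _ in range(n_factors):
        # One point per equal-width stratum, at a random spot inside it.
        strata = list(range(n_runs))
        rng.shuffle(strata)
        columns.append(
            [-1.0 + margin + span * (s + rng.random()) / n_runs for s in strata]
        )
    # Transpose so that each row is one run.
    return [list(row) for row in zip(*columns)]


rng = random.Random(7)
interior = interior_latin_hypercube(n_runs=16, n_factors=7, margin=0.1, rng=rng)
for run in interior[:3]:
    print([round(v, 2) for v in run])
```

Stacking these rows under the 19 boundary runs of a screening design gives a 35-run table with data both on the edges and inside the region.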

How would we analyze this ?

Well , what I want to do is discuss the strategies of analysis that are used ,

the algorithms that are used , and make comparisons to SVEM .

I 'll tell you in advance , SVEM is going to do very well .

Then we 'll talk about some of the issues with the models themselves

and how we use them .

I 'm going to do what most people currently do .

I 'm going to take the boundary points , the DSDs ,

fit models , and then apply them to the space -filling designs as a test set

and see how well my model interpolates .

Step two , I 'll reverse the process .

I 'll fit models to the space -filling points ,

and then I 'll use the DSD as a test set and see how well my model

actually extrapolates a little bit to the boundaries .

Three is a common strategy used in machine learning .

I 'm going to use a holdback test set .

I 'm going to take the 35 runs and break them up .

I did this in a way to make them both equivalent as much as I could

into a training set containing both SFD and DSD points ,

and then also a holdback test set that has a representation of both.
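One way to build a holdback split of this kind is to sample within each design type, so both sets contain DSD and space-filling runs. The sketch below is a guess at the mechanics, not the split used in the talk: the 30% test fraction, the random seed, and the function name are all assumptions.

```python
import random


def stratified_holdback(labels, test_fraction, rng):
    """Split run indices so each design type appears in both sets."""
    train, test = [], []
    for kind in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == kind]
        rng.shuffle(idx)
        n_test = max(1, round(test_fraction * len(idx)))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


# 19 DSD runs followed by 16 space-filling (SFD) runs, 35 in total.
labels = ["DSD"] * 19 + ["SFD"] * 16
train_idx, test_idx = stratified_holdback(
    labels, test_fraction=0.3, rng=random.Random(1)
)
print("training runs:", len(train_idx), " holdback runs:", len(test_idx))
```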

Then finally , step four , what many people would automatically do .

I 'll just fit models to the whole data set .

In general , I don 't recommend this because there 's no way to test the model .

I will say up front ,

because we do have a lot of space -filling points on the interior ,

I'm more comfortable with this approach than I usually am in practice.

But these , I find , are the four basic strategies that would be used .

How do I analyze it ?

Well , if you have a DSD ,

people like to use Fit Definitive Screening ,

I 'll look at it , it only applies to DSDs .

Honestly , it 's not really a predictive- modeling strategy ,

nor do they claim it is .

But I find people seem to use it that way .

I 'll use Forward Selection .

If you know what the AICc statistic is , we 'll do that in GenReg , in JMP 17 .

Then we 'll look at something they have in GenReg that 's very nice .

That is the SVEM algorithm .

I 'm going to use that with Forward Selection .

Then I 'm going to look at something people may not know .

It 's a hidden gem in JMP .

Something called Model Averaging in the Stepwise platform.

John Sall put it there many years ago.

I think he was being very insightful .

Then we 're going to talk about SVEM and Neural Networks .

Basically , no software does this .

I have worked with Predictum,

some of you know Wayne Levin and Predictum, to develop an add-in to do this.

It 's currently the only software available that does this .

The SVEM add -in was used to do the Neural Networks .

I won 't get into the add -in particularly ,

I 'll just quickly show people where these things are .

Then finally, I said the fourth strategy was to use the whole data set.

Because I get asked about this all the time,

I just threw in some K-fold cross-validation to use

with the SVEM methods and some of the other methods .
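For anyone unfamiliar with K-fold cross-validation, the bookkeeping looks roughly like this. The sketch only generates the index splits; the choice of five folds, the seed, and the function name are assumptions, and the model fitting inside each fold is omitted.

```python
import random


def k_fold_indices(n_runs, k, rng):
    """Yield (training_indices, validation_indices) for each of k folds."""
    order = list(range(n_runs))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i, fold in enumerate(folds):
        train = [j for m, f in enumerate(folds) if m != i for j in f]
        yield sorted(train), sorted(fold)


splits = list(k_fold_indices(n_runs=35, k=5, rng=random.Random(0)))
for train, valid in splits:
    print(len(train), "training runs,", len(valid), "validation runs")
```

Every run is used for validation exactly once, which is why the method is attractive when there is no separate test set.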

Those are the methods we 'll use

and for methods like Fit Definitive Screening,

Forward Selection, and Model Averaging,

we 'll assume a full quadratic model as that is the tradition .
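It helps to see how large a full quadratic model is for seven factors. The sketch below simply enumerates the terms; the factor names X1 to X7 are placeholders, since the real factors are proprietary.

```python
from itertools import combinations


def full_quadratic_terms(factors):
    """List the terms of a full quadratic (response surface) model."""
    terms = ["Intercept"]
    terms += list(factors)                                       # main effects
    terms += [f"{a}*{b}" for a, b in combinations(factors, 2)]   # two-way interactions
    terms += [f"{f}^2" for f in factors]                         # pure quadratics
    return terms


factors = [f"X{i}" for i in range(1, 8)]
terms = full_quadratic_terms(factors)
print(len(terms), "terms")
```

That is 36 terms, which is more than the 35 runs in the whole hybrid design and almost double the 19 runs of the DSD. The full model cannot be estimated directly, which is why a selection method such as Forward Selection is needed at all.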

The other methods , again , we 're going to use a Neural Network

which is more flexible .

There are four responses, and this is really important .

I didn 't randomly select them .

There are four of them and they vary in complexity .

Again , I 'll admit this is subjective .

There is no internationally approved measure of complexity

and this is based upon the ability to model the responses .

Again , there are 13 responses .

Typically , in CMC pathway work , there are 10 -20 , maybe more ,

most of them critical quality attributes .

They are important

and they vary within the experiment from some are fairly low in complexity ,

some are very high , very difficult to model .

Frankly , in those cases ,

Neural Networks are basically your only option .

So pay attention to this because this complexity

turns out to be very important in how you would go about modeling .

Then the question is if I 'm going to evaluate prediction ,

well , how do I do that ?

Remember , I prefer prediction be on an independent test set

with new settings of the factors .

That 's how we judge interpolation .

Well, something called the Root Average Squared Error,

or RASE, score is very common.

This is the standard deviation of prediction error .

Again , it 's commonly used to judge how well you predict .
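As a quick reference, the RASE calculation is just the square root of the mean squared prediction error on the test runs. The numbers in this sketch are made up for illustration.

```python
import math


def rase(actual, predicted):
    """Root average squared error between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical test-set values: errors of -1, 0, 1 and -2.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 18.0]
score = rase(actual, predicted)
print(round(score, 4))
```

The squared errors sum to 6, so the score is the square root of 1.5, about 1.22 in the units of the response.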

Smaller is better , obviously ,

but there is a problem with it that we 've particularly uncovered ,

especially in simulations .

Models with low RASE scores often have substantial prediction bias in them .

In prediction , there really is still a bias -variance trade -off .

So how do we evaluate bias ?

Well , there 's no agreed upon approach to that either .

But the easiest way and the most visual way

is actual by predicted plots on a test set .

Ideally, if you were to fit a slope

to the actual by predicted plot, and I'll show an example,

the ideal prediction equation would have a slope of one and an intercept of zero.

The farther the slope is from one , the greater the bias .

For purposes of demonstration , I 'm going to set

a specification of 0 .85 -1 .15 with a target of 1 for the slope .

If you can stay within that range ,

then I 'd say you probably have acceptable amounts of bias .

In reality that happens to be more of a subject matter issue .

Then finally, I said you can fit a slope to the actual by predicted plot.

There's an additional problem.

The predictor is the predicted values .

They have a lot of error in them .

So this is actually an errors-in-variables problem,

which is not commonly recognized .

But JMP 17 has a really nice solution .

It 's called the Passing -Bablok modeling algorithm

and it's been well-established, especially in biopharma.

This fits a slope , taking into account errors in X , the predictor .

So how does it work ?

Well , it fits a slope .

If you look on the left ,

you 'll see the slope is about 0 .5 . We have strong bias .

There 's a lot of prediction bias .

What I really like in the application in JMP , they give you the reference line .

The dashed blue line is the ideal line slope of one , intercept of zero .

On the left , our predictive model is showing a lot of bias .

It 's systematically not predicting the response .

To the right , is a case where there 's actually a small amount of bias

in general , that would be acceptable .

By the way, these were picked as models

that had relatively low overall RASE scores .

These are called the Passing -Bablok slopes

and they are integral to how I evaluate prediction ,

the overall RASE and the slopes .
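The slope check described above can be sketched in a few lines. This is a simplified version of the Passing-Bablok estimator as it is commonly described (a shifted median of all pairwise slopes), not JMP's implementation. It omits confidence intervals and assumes the two variables are positively related. The function names, the example data, and the `within_spec` helper are illustrative assumptions; the 0.85 to 1.15 limits are the ones chosen in the talk.

```python
import math
from statistics import median


def passing_bablok(x, y):
    """Return (slope, intercept) of the Passing-Bablok line of y on x."""
    slopes = []
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            s = dy / dx if dx != 0 else math.copysign(math.inf, dy)
            if s != -1:  # slopes of exactly -1 are discarded
                slopes.append(s)
    slopes.sort()
    k = sum(1 for s in slopes if s < -1)  # shift that offsets slopes below -1
    m = len(slopes)
    if m % 2:
        slope = slopes[(m - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[m // 2 - 1 + k] + slopes[m // 2 + k])
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


def within_spec(slope, low=0.85, high=1.15):
    """Bias check against the slope limits used in the talk."""
    return low <= slope <= high


# Hypothetical test set: the model's predictions are twice as spread out
# as the actual responses, so the actual-by-predicted slope is 0.5.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
actual = [0.5 * p + 2.0 for p in predicted]
slope, intercept = passing_bablok(predicted, actual)
print("slope:", slope, "intercept:", intercept, "acceptable:", within_spec(slope))
```

The predicted values go on the x-axis because they are the quantity measured with error, which is the situation the method is designed for. A slope this far from one fails the check even if the RASE looks reasonable.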

What I 'm going to do at this point ,

I 'm going to actually go over to JMP , if you don 't mind .

I 'll make a quick change in the screen here

and I 'll make this as big as I can for everybody .

Overall in this exercise ,

I fit close to 140 models and I did them all individually and evaluated them .

Yes , it took quite a while

and I 'm going to show a graphic to try to summarize the results

for the different methods .

I 'm going to open a Graph Builder script .

I 'll make this as big as I possibly can for everyone .

I 'm using some local data filters , to define the display .

Notice we have four training scenarios .

I 'll start with where the DSD is the training set .

We fit models to the boundary

and then we evaluate them and how they predicted

the space -filling design points .

Y2 is the easy response .

I expected all approaches to do well , they did .

Notice I set these spec limits at 0.85-1.15, and

all fell within that allowable region.

Of the methods that did well,

I particularly liked Model Averaging;

it did pretty well.

None of them had a slope of exactly one .

The DSD points don 't exactly predict what 's going on

in the space -filling design points , but they all did relatively well .

Now we 'll go to moderate complexity .

Now you start to see some separation .

It 's getting harder to model the surface .

Again, I'm using this interval of 0.85-1.15.

I'm looking on the y-axis at the RASE score, the standard deviation of prediction error.

On the x -axis , I 'm looking at slope .

For Y1 , using the DSDs to predict

the space -filling design points as the test set .

The only models that really performed well were the Neural Networks with SVEM .

By the way, in the model codes, NN is Neural Network and

H is the number of hidden nodes.

We have models with varying levels of hidden nodes

and I simply evaluated RASE scores and slope .

We go to more complexity .

Now Y3 has high complexity .

It is hard to model .

The lowest RASE scores were the methods you see on the lower right ,

but you can see there 's substantial prediction bias .

I felt overall the best combination of low -bias

and RASE score were Neural Networks , particularly one with 27 hidden nodes .

Then finally number four is high complexity .

We fit the model to the DSDs and applied it to the space -filling points .

I didn 't think any of the models did great .

All of them showed some prediction bias .

Maybe the best performance was a Neural Network with 12 hidden nodes .

It had the lowest RASE score , but still , there were some issues with bias .

So that 's one strategy .

Well , what if I were to do the opposite ?

I fit the model to the space -filling points

and then apply them to the boundary DSD points .

Again , let 's start with the easiest case .

Y2 really is a pretty simple response.

Actually , the SVEM method in GenReg using SVEM and Forward did very well .

The next best I thought was a Neural Network with 10 .

Remember, there 's a little bit of extrapolation going on here .

Finally , Y1 with moderate complexity .

Again , only the Neural Networks did well .

As we go up in complexity , increasingly just the Neural Networks are working.

You 'll find similar results for the other approaches .

I won 't show all of them , they 're covered in the notes .

But the general conclusion by the way , is

that when you use the boundary points as a test set

or you use the space -filling designs as a test set and try to predict the other ,

they 're just not doing as well as they should .

In other words , as I said earlier , the boundary points ,

the DSD points and the space -filling design points ,

there are differences in their kinetic behavior that we 're not picking up .

The only way we 're going to pick it up

is to actually fit models over the whole design space .

We did do that by the way .

I should just quickly show you .

I used the whole data and we fit models and we actually did pretty well .

I didn 't show the Passing -Bablok slopes .

I will just quickly do a little more work with JMP for those who are interested .

The Passing -Bablok slopes can be done in Fit Y by X .

I will admit we wrote a script

and added it to the predictive add -in to do this in Fit Y by X ,

but you can easily do it yourself .

Here , and I 'll pick one of the cases , is the DSD data and I 'll pick Y1 .

How did we do fitting models ?

If you look in the menu , there 's the Passing -Bablok .

I strongly suggest you look at it .

A lot of regression problems are errors-in-variables problems.

How did the methods do overall?

I want to explain something else .

The orange points are the DSDs , the boundaries .

The blue points are the space -filling design points .

Here I fit models to the DSD

and the Passing -Bablok slopes are being fit to the space -filling design points .

Overall , the best performance was turned in by the DSDs .

There 's one of them here .

It 's Saywood 6 .

Another one that had … I forgot what it was .

Let me widen this out for you .

Nineteen .

Notice the slope is close to one ,

but you can clearly see there is some bias .

In other words , you can see an offset between the fitted slope

and the ideal slope , the dashed blue line .

This is pretty typical overall .

I 'll just very quickly show you .

If you have JMP Pro and you want to do SVEM using linear models ,

just go to Fit Model , Recall .

This is a full quadratic model .

You could do others .

Go to GenReg and then under estimation methods .

There 's SVEM Forward .

There 's SVEM Lasso .

These work very well .

From a lot of work in these methods ,

I still find SVEM Forward gives you the best results .

The Lasso tends to give you a lot of biased results

on test sets in particular .

If you 're interested in model averaging , if you have JMP standard ,

just going to hit recall again , just go to the Stepwise platform .

Didn 't do it . Stepwise .

I won 't run it .

It will take too long because model averaging uses best subsets regression .

It 's time -consuming , but it 's there .

Again , Neural Networks with SVEM ,

you have to have the Predictum add- in to do that .

There 's a link to it if you 're interested .

At this point ,

I 'm going to not do too much more analysis .

Again , you can go through and look at the various slopes

for the various responses

and you can see many of these methods resulted in highly biased slopes .

In other words , the DSD points and the space -filling designs are too different .

We 've really got to understand we need to fit models

over the entire design region .

At this point , I 'm going to just finish up .

By the way, there is a lot of material here;

I basically have many talks combined into this one.

I apologize , but I think there 's an important message here .

By the way , I 'm just showing slides with the Passing -Bablok slopes .

Then finally , I want to just give you some final thoughts .

I think we really need some new thinking in statistics .

We don 't have to throw out everything we 've been doing .

I 'm not saying that .

The most important is we are in the era of digital science .

Digital chemistry , digital biology , digital biotechnology are here .

They 're not tomorrow . We 've got far more automation .

We have lots of great pilot- and bench-scale devices,

especially in biotechnology,

that scale nicely, where we can do lots of experiments.

The problem is complexity .

We need to think differently .

Machine learning methods via SVEM

are very important for fitting these complex systems .

We need to get away from the response surface approaches

that really haven 't changed .

Maybe we 've got computers and some new designs .

I think DSDs are really very clever .

We have optimal designs , but they suffer from the fact they 're boundary designs

and people keep insisting on full quadratic models .

That 's a mistake , as I 've tried to show briefly in the talk ,

and you will be able to download the talk ,

you can see how poorly these methods generally did with the complex responses .

As far as I 'm concerned , we need new types of optimal designs .

At a minimum , these need to accommodate a lot of factors .

Patty , by the way , without getting into details , has run a DSD …

Not a DSD . You did space -filling design with 18 runs .

Given they have Ambr technology available,

if you know what that is , they can do it .

Why do we need that ?

Because these systems are interactive .

We need to stop thinking they 're a minor part of the equation .

Main effects do not describe the behavior of a complex system .

Its interactivity is what drives the behavior .

We need to cover the interior of the design region .

Yes , we would like to cover the boundaries too .

We don 't want to be specifying a model .

Optimal designs require you specify what is usually a full quadratic model .

We need to get away from that .

Space -filling designs , by the way ,

are optimal designs that do not require a model be specified .

But they 're not the total answer .

We need to cover the design space .

We need to give the users, that would be the scientists, a lot of input

on how they distribute the points.

The work of Lu Lu and Anderson-Cook points the way.

I won 't have time to get into that .

That 's another topic .

We need to be able to easily combine our design with other data .

That includes engineering runs , GMP runs ,

even models from partial differential equations and simulations .

Especially if you want to get into digital twins ,

you 've got to be able to do that using what I call meta models .

Then finally , Patty mentioned this , so I wanted to bring it up .

The standard practice in design of experiments , assuming

that somehow you 've got to screen out factors

is actually a really high -risk , no -reward strategy in complex systems .

You will regret it .

You will someday , at a later stage , come back and have to redo experimental work .

I 've seen this time and again .

In complex systems ,

this idea that there are active and inactive factors is simply wrong .

They all matter at some level somewhere in the design space .

Frankly , with our modern tools , you don 't need to do it anyway .

Also, something else people do reflexively is reduce linear models.

We've shown in our research on SVEM,

and there is also a nice paper by Smucker and Edwards,

that reducing models degrades prediction.

Why ? Because you 're making your model stiffer and stiffer ,

it 's not going to interpolate well .

I will stop at this point and there are some references at the end .