
DoE when each run is costly and time consuming ( low throughput process optimization in perfusion bioreactors)

Ben2
Level II

Hi,

 

I am looking to perform a DoE to optimize process conditions inside a perfusion bioreactor (with yeast). The throughput of the system I am using is essentially 1, and each run takes 1-2 weeks. Because of these constraints, I was thinking of performing one run and varying my factors during the run. I can monitor my responses (product titer and cell-specific productivity) by sampling, obtaining instantaneous metrics or even daily volumetric yields. I am looking to optimize conditions like dissolved oxygen (DO), temperature, and pH, which are pretty easy to change mid-run, but other factors like media composition are harder because the change doesn't happen instantaneously.

 

The problem I think I might run into with this type of design is that a change in the cells' state at one step could impact the steps down the road. I was wondering if anyone has used JMP to design this kind of experiment before and has any tips on the best way to go about planning it. Alternatively, if anyone has other ideas on how to perform this type of low-throughput DoE in a bioreactor, I am open to new ideas.

 

Thanks for the help! 

7 Replies
Victor_G
Super User


Re: DoE when each run is costly and time consuming ( low throughput process optimization in perfusion bioreactors)

Hi @Ben2,

 

I'm not familiar with the low-throughput bioreactor setting you described, but here are some questions, remarks, and suggestions that may help:

  • What is your objective? System understanding/exploration and/or optimization? How accurate is your measurement system? How "noisy" are your experiments?
    Depending on the emphasis you place on these objectives (understanding vs. optimization), you could use a DoE approach (to understand which factors contribute to the response) or a Bayesian Optimization approach using model-driven sequential experimentation if your primary objective is to optimize the system. Note, however, that depending on the dimensionality of your experimental system (number of factors and levels) and how noisy your experiments are, Bayesian Optimization may not be suited to highly noisy and/or high-dimensional systems, as the model may have a hard time figuring out the "directions" in which to optimize.

  • If you intend to vary factor levels during the experiment and fear that the order of changes may affect the outcomes ("a change in the cells state from one step could impact the steps down the road"), there are two options available:
    • Save the order of the steps as a new uncontrolled factor in the data table, to be used in the analysis. This may not be optimal for robustness against a time trend: your time trend and your factors could be correlated, leaving you unable to tell which of the two is responsible for the response(s).
    • Include time/order as a covariate (with both linear and quadratic effects) in the design. This is a special case of DoE with covariates ("time-trend robust designs") that you can find in "Optimal Design of Experiments: A Case Study Approach" (chapter 9) by Peter Goos and Bradley Jones. More info in these discussions:
      Covariates in defined order in custom design 
      Incorporate Time lag in DoE 

  • Also, depending on how easily each factor can be changed (you mention that "dissolved oxygen (DO), Temperature, pH ... are pretty easy to change mid run, but other factors like media composition is harder"), you could use a split-plot design structure: create one whole-plot/"macro" factor, like media composition, within which you run several trials and record the responses. This idea can be combined with the previous time-trend robust design, with whole plots in which you do several experiments and the change order is taken into account as a covariate.
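One quick check before committing to a within-run design is how strongly each planned factor profile correlates with run order: a profile that ramps monotonically over the steps is nearly confounded with any time trend. A minimal sketch of that check (the factor name and setpoints are invented for illustration):

```python
import numpy as np

# Suppose 8 sequential "steps" in one bioreactor run, each with a planned
# temperature setpoint. A monotone profile confounds temperature with time.
order = np.arange(8)  # step index (the time/order covariate)
temp_monotone = np.array([28, 28, 29, 29, 30, 30, 31, 31.0])
temp_shuffled = np.array([30, 28, 31, 29, 28, 31, 29, 30.0])

def time_correlation(setpoints, order):
    """Pearson correlation between a factor profile and run order."""
    return float(np.corrcoef(order, setpoints)[0, 1])

r_mono = time_correlation(temp_monotone, order)
r_shuf = time_correlation(temp_shuffled, order)
print(round(r_mono, 2), round(r_shuf, 2))
```

A near-1 correlation means a time trend and the factor effect cannot be separated in the analysis; shuffling the step order drives the correlation toward zero, which is what a time-trend robust design achieves by construction.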

 

I hope these questions, remarks, and suggestions make sense and help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
Ben2
Level II


Re: DoE when each run is costly and time consuming ( low throughput process optimization in perfusion bioreactors)

Hi Victor,

 

Thank you for your response! First, to answer your questions:

1) The objective is not to optimize the system itself, but to optimize the conditions of the system for a particular recombinant protein I am expressing.

2) The measurement system is fairly robust for things like DO, pH, and temperature, as they are controlled by feedback loops, but things like media composition change over time depending on how the cells consume nutrients (media is constantly being fed throughout). The product response is a more limited measurement because it needs to be physically sampled by a person.

 

I think I will probably use time as a covariate factor. Would it make sense to also include other process conditions that I am not studying as covariate factors? I am not entirely sure how relevant Bayesian Optimization would be, considering I really only have one experiment and am trying to use different steps within it (varying factors) to optimize (but you can tell me if I am wrong). For a split-plot design, is that something I mainly apply in the analysis once the experiment is completed? And just to confirm, are you saying to look at something like media as a whole, as opposed to the individual components that make it up?

 

Thank you for your questions and suggestions, this was very helpful!

Victor_G
Super User


Re: DoE when each run is costly and time consuming ( low throughput process optimization in perfusion bioreactors)

Hi @Ben2,

 

Yes, in my opinion it would make sense to use other process conditions, either as covariate factors or as uncontrolled factors (to enable analysis with these factors, even if they are not included in the design generation).

I am unsure about Bayesian Optimization here; it might be helpful if you can account for the order/time dependence of the factor changes (and not only the factor values) when optimizing.

 

Split-plot designs arise when randomization is constrained by experimental conditions (see Designs with Randomization Restrictions), which is reflected in both the design and the analysis. I don't know how relevant this setup is if you only have one experiment (with factors varying within that single experiment), but if you were to consider several experiments, you could build a split-plot design with the whole plot being the run (with a specific yeast type, or any factor that cannot change during the run), and the "easy-to-change" factors/subplots being the covariate factors that can change during the run, with order/time taken into account.
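To make the structure concrete, here is a hedged sketch (factor names and levels are invented placeholders) of how such a layout could be enumerated, with media composition as the hard-to-change whole-plot factor and DO/temperature varied within each whole plot:

```python
import itertools

# Whole-plot factor: media composition (hard to change mid-run)
media = ["A", "B"]
# Subplot factors: easy to change mid-run (levels are invented placeholders)
do_pct = [30, 50]
temp_c = [28, 32]

runs = []
for wp, m in enumerate(media, start=1):
    # Every subplot combination is executed within each whole plot
    for d, t in itertools.product(do_pct, temp_c):
        runs.append({"whole_plot": wp, "media": m, "DO_pct": d, "temp_C": t})

print(len(runs))  # 2 whole plots x 4 subplot combinations
```

The key point the layout captures is that subplot comparisons (DO, temperature) happen within a whole plot and so are shielded from whole-plot-to-whole-plot noise, while media comparisons carry that extra variance and need the analysis to treat the whole plot as its own error stratum.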

 

Hope this response makes some sense,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)


Re: DoE when each run is costly and time consuming ( low throughput process optimization in perfusion bioreactors)

Hi @Ben2 ,

 

I have experience with bioreactor optimisation. One of the challenges you are going to come across is that varying stirring, pH control, and aeration mid-run will create confusion when it comes to deciding on the 'best' conditions for the overall batch. Changing the conditions midway through will place different stressors on cell activity and may have a lasting impact on how the fermentation proceeds.

 

One thing to consider is whether you can scale down (i.e. to shake flasks) for the media optimisation, on the assumption that the optimal media composition is the same at small scale and at bioreactor scale. This gives you the experimental breadth and number of runs needed for a DoE approach. Then you can move to controlling the bioreactor settings alone (stirring, pH, aeration) in a DoE setting where each run tests a combination of factors, e.g. set DO% to 40% at 24 h, 50% at 48 h, and so on.
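The "DO% to 40% at 24h, 50% at 48h" idea amounts to treating each run as a time-indexed setpoint schedule. A small sketch (the timepoints and levels are illustrative, not recommendations) enumerating all candidate DO profiles for a run:

```python
import itertools

# Invented schedule grid: DO% setpoints applied at fixed timepoints in a run
timepoints_h = [24, 48, 72]
do_levels = [40, 50]

# Each candidate run is one DO profile over the three timepoints (2^3 = 8)
profiles = list(itertools.product(do_levels, repeat=len(timepoints_h)))
first_schedule = dict(zip(timepoints_h, profiles[0]))
print(len(profiles), first_schedule)
```

Enumerating the profiles this way makes the cost explicit: profiled setpoints multiply the run count quickly (levels^timepoints), which is why a designed subset of profiles, rather than the full grid, is usually what gets run.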

 

Happy to comment more on this if you have questions.

 

Thanks,

Ben

“All models are wrong, but some are useful”
Ben2
Level II


Re: DoE when each run is costly and time consuming ( low throughput process optimization in perfusion bioreactors)

Hi Ben,

 

Thank you for your response!

 

Just to clarify, the reactor I am operating is a perfusion reactor with a constant feed of media coming in and a constant stream of "spent" media going out (containing the secreted protein being expressed). This difference in feeding strategy between the reactor and a shake flask is part of the reason I would like to look at it at bioreactor scale. As well, the cell density obtained in our bioreactor is significantly higher than what is achievable in shake flasks.

 

I very much agree with this statement ("Changing the conditions mid-way through will produce different stressors on the cell activity and may cause a longer impact in the way that the fermentation proceeds."), but I don't have the option to perform the optimization in independent experiments. I was going to allow a transition step to bring the cells into a steady state before actually collecting data for each "run" (a step within the single experiment), which I think can minimize some of the burden. If you have other suggestions on how to minimize the effects of the early steps on the later steps, I am all ears!
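One hedged way to make the "transition step" operational is a simple steady-state gate on the sampled response: only start collecting data for the next condition once recent samples stop drifting. The window and threshold below are invented placeholders, not validated values:

```python
import numpy as np

def reached_steady_state(titers, window=3, tol=0.05):
    """Declare steady state once the relative day-to-day change in the
    sampled response stays below `tol` for `window` consecutive intervals.
    (window and tol are illustrative, to be tuned to the real process.)"""
    if len(titers) < window + 1:
        return False
    recent = np.asarray(titers[-(window + 1):], dtype=float)
    rel_change = np.abs(np.diff(recent)) / recent[:-1]
    return bool(np.all(rel_change < tol))

ramping = [1.0, 1.4, 1.9, 2.5, 3.2]     # still climbing: not steady
settled = [3.0, 3.05, 3.08, 3.1, 3.11]  # drift below threshold: steady
print(reached_steady_state(ramping), reached_steady_state(settled))
```

A rule like this also gives each "step" a defensible, consistent starting condition, which helps when comparing responses across steps within the single run.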

 

Thank you again for your feedback it is very helpful!


Re: DoE when each run is costly and time consuming ( low throughput process optimization in perfusion bioreactors)

Hi @Ben2 ,

 

Apologies, I missed that it was continuous perfusion. Yes, it's a good idea to let the yeast return to steady state and then run your conditions after each 'steadying'. One thing to consider is the length of each 'run': would you run each iteration for a set time period (e.g. 72 h), or to a set goal (e.g. cell density of X, product amount of Y)?

 

To add to your idea, try to find a way to measure the longevity/stress state of the cells when they are brought back to steady state. What tests can you perform at each steady state to say 'this is the same health as the last run'? That could be OD600 values, dead/live ratios, or testing for certain metabolites (if you're lucky enough to have access to something like an HPLC or a Flex monitor). From this you can decide whether to continue with this batch of cells, or whether it's better to scrap them and take a new cryo/seed to continue with.

 

Bayesian optimisation is definitely a good option for the iterative approach, but you could also consider something like a scoping design to get you started. 
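For reference, the Bayesian-optimisation loop being suggested can be sketched with a Gaussian-process surrogate and an expected-improvement criterion. Everything below is a toy stand-in (the `titer` function, the single temperature factor, its range, and the step count are invented), not the poster's actual process:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def titer(temp):
    """Toy stand-in response with a peak near 30 C (invented, not real data)."""
    return -0.1 * (temp - 30.0) ** 2 + 5.0

# Two conditions already run (invented starting points)
X = np.array([[26.0], [33.0]])
y = titer(X.ravel())

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(5):  # five sequential "steps" in the run
    gp.fit(X, y)
    cand = np.linspace(25.0, 35.0, 201).reshape(-1, 1)
    mu, sd = gp.predict(cand, return_std=True)
    best = y.max()
    z = (mu - best) / np.maximum(sd, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sd * norm.pdf(z)  # expected improvement
    x_next = cand[np.argmax(ei)]                       # next condition to try
    X = np.vstack([X, x_next])
    y = np.append(y, titer(x_next[0]))

best_temp = float(X[np.argmax(y), 0])
print(best_temp)
```

The appeal for a 1-2 week-per-run process is that every new condition is chosen where the surrogate expects the largest improvement, rather than spreading runs across a fixed grid; the scoping design mentioned above is one reasonable way to seed the initial points.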

 

You mentioned DO - is that a target to control, or is it better to split it into the controlled rates of stirring and aeration? I typically think of DO as a product of the cells' response to changing bioreactor conditions rather than a parameter you control (although you can argue that a target DO value for the production phase, which you control to keep above a certain level, is a controlled parameter).

 

Hope that helps!

Ben

“All models are wrong, but some are useful”
statman
Super User


Re: DoE when each run is costly and time consuming ( low throughput process optimization in perfusion bioreactors)

Wow, that sounds fun!  I'm NOT an SME for your process, so I'd have to discuss it with you to truly understand the situation and constraints. But here are some of my thoughts:

1. Plan on iterating.  Don't try to assign and optimize everything with one experiment. Think about what knowledge you need to efficiently continue your investigation.  It isn't very effective to have a great model that only worked yesterday.

2. Spend sufficient time identifying noise (factors that you will not be willing to control in the future, either because you don't have the technology, don't have the money, or consider controlling them inconvenient).  Then decide what strategies you will use to maximize the inference space without sacrificing design precision (e.g., blocking, repetition, split-plots).

3. I think you may have an opportunity to use split-plots (not necessarily in the traditional way).  I highly recommend you read:

 

Box, G.E.P., Stephen Jones (1992), “Split-plot designs for robust product experimentation”, Journal of Applied Statistics, Vol. 19, No. 1

 

These can be extremely efficient in sequential-step situations.

4. Design multiple experiments (easy to do in JMP).  For each experiment, compare and contrast what it will give you (e.g., what potential knowledge you will gain: design and noise resolution, linearity, repeatability) against the resources required.  Predict ALL possible outcomes of each experiment and be prepared to handle any of them (e.g., you run the experiment and the performance metrics don't change, or you create a lot of variation in the performance metrics but none of it is assignable to the factor effects, etc.).
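Comparing candidate designs before spending reactor time can be quantified with a design-efficiency metric. A small sketch comparing a 2x2 full factorial against a one-factor-at-a-time plan by D-efficiency, det(X'X)^(1/p)/n, for a main-effects model (the designs are illustrative):

```python
import numpy as np

def d_efficiency(factors):
    """D-efficiency of a two-level design for an intercept + main-effects
    model: det(X'X)^(1/p) / n, where p is the number of model terms."""
    X = np.column_stack([np.ones(len(factors)), factors])
    n, p = X.shape
    return float(np.linalg.det(X.T @ X) ** (1.0 / p) / n)

full_factorial = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
one_at_a_time = np.array([[-1, -1], [1, -1], [-1, 1], [-1, -1]])

print(d_efficiency(full_factorial), d_efficiency(one_at_a_time))
```

The full factorial reaches the maximum D-efficiency of 1.0 for this model, while the one-at-a-time plan wastes a run and correlates the intercept with the factors; JMP's design diagnostics report the same kind of comparison directly.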

5. Lastly, and most important: regardless of the outcome, no one knows the best experiment a priori! Reflect on what you learned about the process of experimentation and draw on those experiences to design the next experiment better.

 

 

 

 

"All models are wrong, some are useful" G.E.P. Box