JMP Add-In Builder for Automating Microbial Growth Characterization (2024-US-30M...

Microbial growth assays using microtiter plate readers are commonly used in the biopharmaceutical industry for screening media components. These assays can generate large amounts of data very quickly; thus, automated approaches to de-bottleneck the data analysis step are critical to enable faster decision making. At Kerry, retrieving, cleaning, and formatting the growth assay data is a manual process that is time-consuming and error-prone.

In this work, a series of scripts was created to improve the data analysis approach by automating data formatting, clean-up, and visualization. JMP’s Workflow Builder and Add-in Builder were used to capture scripts and create an easy-to-use, click-based approach to reduce data analysis time and automate redundant analyses for high-throughput microbial growth assays. The add-ins reduced data analysis time by 80%, from 20 to 4 minutes per raw data file. Add-ins will be made publicly available for other JMP users needing to de-bottleneck their data retrieval, clean-up, and analysis for high-throughput microtiter plate assays.

All right. Hello. My name is Kyle Probst. I'm a Senior Scientist at Kerry, based out of Beloit, Wisconsin, which is our North America headquarters. Today, my presentation will cover the use of JMP Add-in Builder for automating microbial growth characterization. I have about eight years of experience working with JMP, and I'm really excited to do this today because this is my first time presenting at Discovery Summit.

A brief outline here just to give you a better idea of what I'll be talking about today. First, the introduction: why we did this work, why it's important, and the objective, the specific thing we were after here. Then I'll give you a little bit of background information on Kerry, pharma cell nutrition, what microbial growth assays are, our approach to doing those, and a little bit about our existing data analysis approach. Then we'll get into JMP, and I'll do a demo of the add-in itself to give you a clear idea of what we did here. Then we'll go into conclusions.

All right. Screening of raw materials and media nutrients is a critical step for bioprocess development. High-throughput screening assays, or HTS assays, using microtiter plate readers are commonly used in the pharmaceutical industry. These assays can generate large amounts of data very quickly, so data processing and analysis becomes a bottleneck.

Automated approaches to de-bottleneck the data analysis step are critical to enable faster decision-making. We know time equals money, especially in industry. At Kerry, retrieving, cleaning, and formatting the HTS assay data is a manual process that is very time-consuming and error-prone. In this work, we used JMP to de-bottleneck the HTS data analysis, enabling faster decision-making.

I'm just going to go ahead and put it right up front here. This is the starting raw data that we're working with. It comes in as an Excel spreadsheet. Then what we're looking to turn this into, the final product, the processed data, are these two figures here. We have one figure that is representing growth kinetics or growth curves. Then we have area under the curve, which is a metric that is used to represent growth performance. That's what we're looking at getting out of this. This is the final product. This is what we're going to try to achieve today.

The objective here was to develop a series of add-ins to enable a flexible data formatting and cleanup approach, reduce data analysis time, and automate redundant analyses for HTS microbial growth assays. The JMP features we used were Add-in Builder, Workflow Builder, the Recoding Tool, Graph Builder, and Dashboard.

A little bit of information about Kerry now. Kerry is the place where I work. It's an Irish food ingredient company, and we are the world leader in taste, preservation, health, and biopharma ingredients. At Kerry, we focus a lot on our customers, supplying them with ingredients so they can create better tasting, more nutritious food and beverages, along with health-promoting supplements and life-saving drugs.

I work in the pharma area of the business, and our pharma business has been around for over 100 years. It is made up of three pillars: excipients, acetates, and cell nutrition. I work in the cell nutrition area of the business. This includes products like our protein hydrolysates and yeast extracts that we sell as ingredients as food for cells that are used in biopharma and biotechnology.

At Kerry, we provide complex media supplements. Those are the protein hydrolysates, which are also known as peptones for anyone who's taken a microbiology course, and yeast extracts. These are partially digested raw materials that are rich in nitrogen. These are often used as the primary source of nitrogen in media formulations because they provide lots of important nutrients to the cells, things like peptides, free amino acids, vitamins, minerals, et cetera.

When we talk of cell nutrition, we are talking about the food needed to sustain mammalian, insect, and microbial cells that are used in the bioscience industry. These are our ingredients here. Then these are some examples of cell types, as well as the products that they're making that are used to improve our way of living in everyday life, like insulin, like antibodies, like vaccines, probiotics, even antibiotics. Those are just some examples.

We have over 100 different protein hydrolysates and yeast extracts available to create formulations here at Kerry, and we can screen and optimize these ingredients using our in-house strains and cells to create formulation prototypes based on our customers' needs. Hopefully, this gives you a better idea of what I do at Kerry and what cell nutrition is all about.

All right. At Kerry, we do microbial growth assays very routinely, and that's used for screening some of our cell nutrition ingredients for microbial applications. Like I mentioned, we have a lot of ingredients available to us to use. Screening really allows us to test many products quickly and helps us select only a few of the best candidates for formulation optimization. Growth assays use microtiter plate readers, and these are the workhorses for our screening step.

With our protocol, we can test up to 40 different ingredients in a single assay. We can also evaluate multiple microbial strains, and we can also do DOE-based approaches such as definitive screening designs and even two-level screening designs like Plackett-Burman or fractional factorial designs.

Then as an example, we might have a customer who is looking for a plant-based protein hydrolysate that supports the growth of Lactobacillus acidophilus. That's an important probiotic strain. We would then first start screening multiple plant-based protein hydrolysates to find one or a few that support the best growth of this microorganism.

Then this is an example of the growth assay workflow. Our products are formulated into a liquid media broth. Most of our products are dry powders, so they must be hydrated. We can also add other nutrients if needed, which would be specific to the microorganism being grown. Then we have an inoculum. These are the live, active cells. We often prepare this ahead of time to ensure the cells are actively growing and healthy before we use them.

Then the broth and the inoculum are pipetted into the well plate here. This is the well plate. In this case, the well plate contains 100 individual wells where we add around 100 to 200 microliters. We often do this using a multi-channel pipette, which is shown here at the top.

We're dealing with microliter amounts. For our Bioscreen assay, the Bioscreen C, which is the instrument that we use, we can run two of these plates at once. Normally, we run five technical replicates, or five wells, per treatment, so we can run up to 40 treatments at a time. The plate is then placed into the plate reader, the Bioscreen, which can control temperature and agitation. Most bacteria are grown at 37 degrees C for 24 to 48 hours, just as an example.
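The figure of 40 treatments per run follows from the plate format described above. A quick sketch of that arithmetic in Python, using only the numbers quoted in the talk (the variable names are illustrative):

```python
# Capacity of one instrument run, using the numbers quoted in the talk.
WELLS_PER_PLATE = 100   # individual wells on one plate
PLATES_PER_RUN = 2      # the instrument runs two plates at once
REPLICATES = 5          # technical replicates (wells) per treatment

total_wells = WELLS_PER_PLATE * PLATES_PER_RUN
max_treatments = total_wells // REPLICATES

print(total_wells, max_treatments)  # 200 40
```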

This instrument automatically measures optical density over time. We use an optical density of 600 nanometers, which is also known as OD 600. That's a standard metric in the industry for quantifying microbial growth. Then as the cells replicate and grow, the broth becomes turbid, and that's proportional to the OD 600. As the cells grow, the OD 600 increases, which is plotted over time to give a growth curve here at the top. You have cell density or OD 600 on the Y axis, time on the X axis. Different nutrients will lead to different levels of growth, which we measure, and we can visually see from the growth curve.

If you're wondering what a growth curve is, the growth curve is the microbial kinetics that describe how populations of microorganisms, like bacteria or yeast, grow over time. When the conditions are right, like having enough food, the right temperature, or the right environment, these cells can divide and multiply rapidly. We have a few different phases here that we look at.

Number one is the lag phase. When microbes are first introduced into a new environment, they don't start multiplying right away. They need a little bit of time to get used to their surroundings and prepare for growth. That's what the lag phase is. Then, once the microbes are ready, they start multiplying quickly. Each cell divides into two, four, then eight, and then so on, leading to a rapid increase in population. That's the exponential or log phase of the growth curve.

Then eventually, the food and space will start to run out or waste products will build up, and that will slow down the growth. That leads us into stationary phase. The number of new cells being created actually balances out with the number of cells dying, so the population stays constant. The growth curve allows us to visualize growth performance. That way we can compare, and quickly select ones based on these criteria here.

Additionally, another metric that we measure is called the area under the curve. That's used to summarize the curve into a single number based on the region below the curve. Under the same conditions and time, the larger the region or area of the curve, the greater the overall growth. Hopefully, that's pretty clear.

Then this brings me to our existing data processing approach. The approach that we've been using required both Excel and JMP. Excel was used to manually edit and calculate time points and the area under the curve, as well as to zero the OD 600 data, which I'll show you a little bit later. Then we would kick it over into JMP, which would be used for some of the manual recoding and figure editing. This was all done from scratch each time we did it, so it was a bit cumbersome.

Then day to day, this process was done slightly differently because we were doing it from scratch each time, so it wasn't consistent. Due to this, we would often find subtle errors or differences across data sets, so it was error-prone. In addition to this, it was pretty time-intensive. It'd take about 20 minutes to process one raw data set. The improved approach, which is what we did in this effort and which I'll be showing you shortly, took advantage of JMP's Workflow Builder to capture scripts from redundant analyses and to build a series of add-ins.

One of the things that was really nice about the Workflow Builder is that I have very limited experience in scripting. The Workflow Builder allowed me to essentially copy and paste scripting from the established workflow that we had and then just create add-ins from that. Super easy. I should mention, not all the scripting was obtained from the Workflow Builder. I must admit, I received some help from our friend Jerry Fish, who is our JMP systems engineer, for the more advanced scripting in this work. But overall, the improved approach reduced the time of analysis to four minutes, an 80% reduction in time, which is really great for us.

All right. I'm going to back out of here really quick, and I'm going to pull up the raw data file. That way you can see what we're looking at here. I already flashed this up on the screen earlier. The raw data file comes in an Excel format. You have time in your first column, then each of the subsequent columns here is for a well on the well plate.

We also include a well plate map along with the data. This is something that we manually add into the file. But this allows us to understand what was put into each well. This is giving us a nice map or a nice key that we can refer to. For example, strain A and media 1 were added into wells 1, 21, 41, 61, and 81, and the one through five just represent replicates. We use this first row here, and that will be getting strain A and media 1, and then so on and so forth. Hopefully, that gives you a better idea of how we approach doing this.

All right, I'm going to close this out, and then I'm going to get into JMP, and we'll go through the add-ins. If you're not familiar with add-ins, definitely take a look. It's just a way for you to add additional things into your JMP software that you can run. There's a whole bunch of them out there that people have created. I've got quite a few in here, but I'll be focusing on the bioscreen data. We're going to go ahead and get started here.

The first thing we're going to do is we're going to do a general file open. This is the Excel file that I just showed you. I'm going to hit open here and make sure that you have all files selected. That way it can pick up things like Excel. Hit open, and then it opens the Excel Import Wizard. I'm going to import both the well plate map and the raw data, the 600-nanometer data. Everything's set up the way I want it here. I'm just going to hit Import. That's going to give me two different data tables. I'm going to focus on the data kinetics here first.

The first thing that I want to do here is I want to add a time sequence column. Because the way that this time is formatted, I can't plot it over time. I want to have this in a decimal format, and that's what that add-in did. I'm also just going to get rid of this and then delete this column because I don't want it. I want the time in a decimal format with hours.
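To make the time conversion concrete, here is a minimal Python sketch of the same idea. It is not the add-in itself, which runs inside JMP. It assumes the instrument exports elapsed time as an `h:mm:ss` string, which is an assumption about the file format, and the function name `to_decimal_hours` is hypothetical.

```python
def to_decimal_hours(stamp: str) -> float:
    """Convert an elapsed-time string 'h:mm:ss' (assumed format) to decimal hours."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours + minutes / 60 + seconds / 3600

# Hypothetical time stamps, one reading every 30 minutes.
raw_times = ["0:00:00", "0:30:00", "1:00:00", "1:30:00"]
decimal_hours = [to_decimal_hours(stamp) for stamp in raw_times]

print(decimal_hours)  # [0.0, 0.5, 1.0, 1.5]
```

Once time is a plain number of hours, it can go straight onto the X axis of a growth curve.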

The next thing we're going to do, we're going to row subtract for OD correction. We're going to blank out our initial data. When we're running different types of the media, the media can have different starting OD values. You can see here that some of these are different. We want to put these all on the same playing field. We want to essentially zero out the starting OD. We're just going to take these numbers in this first row and subtract them from each row here. Essentially, you're zeroing your detector, putting everything on the same playing field.
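The blank correction described here amounts to subtracting each well's first reading from every reading in that well. A small Python sketch of that step, with made-up OD values and a hypothetical helper name (`subtract_first_row`):

```python
def subtract_first_row(column: list[float]) -> list[float]:
    """Zero a well's OD readings against its own first (time zero) reading."""
    baseline = column[0]
    # Rounding only keeps floating-point noise out of the illustration.
    return [round(value - baseline, 3) for value in column]

# Hypothetical raw OD 600 readings; each well starts at a different value.
wells = {
    "Well 1": [0.112, 0.150, 0.410],
    "Well 2": [0.098, 0.120, 0.300],
}
corrected = {name: subtract_first_row(values) for name, values in wells.items()}

print(corrected["Well 1"])  # [0.0, 0.038, 0.298]
print(corrected["Well 2"])  # [0.0, 0.022, 0.202]
```

After the subtraction every well starts at zero, so differences between curves reflect growth rather than the starting turbidity of the media.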

I'm going to hit row subtract, and that does that. All right. Then the next thing we're going to do is we're going to stack the data. I'm going to select the columns here. I'm going to hold Shift and select all of the columns except for the time column. Then I'm going to stack these, and that creates a new data table. It also recoded the well number. That went on in the background.
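Stacking turns the wide table (one column per well) into a long table (one row per time point and well), which is the shape a plotting tool needs. A rough Python equivalent follows; the column names and the "Well N" naming pattern are assumptions for illustration, and `stack_wells` is a hypothetical helper.

```python
def stack_wells(table, time_column="Time (h)"):
    """Turn one-column-per-well data into one row per (time, well) reading."""
    rows = []
    for name, values in table.items():
        if name == time_column:
            continue  # time is kept as an identifier, not stacked
        well_number = int(name.split()[-1])  # "Well 21" -> 21 (assumed naming)
        for time, od in zip(table[time_column], values):
            rows.append({time_column: time, "Well": well_number, "OD600": od})
    return rows

# Hypothetical blank-corrected data for two wells.
wide = {
    "Time (h)": [0.0, 0.5, 1.0],
    "Well 1": [0.0, 0.038, 0.298],
    "Well 21": [0.0, 0.022, 0.202],
}
stacked = stack_wells(wide)

for row in stacked:
    print(row)
```

Converting the column name to a well number in the same pass mirrors the recoding of the well number that happens in the background.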

This is good to go for now. I need to come back into my well plate map, and I have to do a little bit of formatting in here to get it to where JMP likes it. I need to stack all of these replicates here into one column. I have a stack well plate here. That gives me this updated well map data file. Then I can use this table here as a tool, as a map, to recode my stacked data here that has all my OD data values in it. That's what I'm going to do right now.

I have a recode by well plate. I have strain as one of my recoding factors. I have replicate. Those are pretty much always consistent in our experiments. The other thing I have is the different treatments: treatment one, two, three, four. For this example, we've only got one treatment, but I could have more than one treatment. That's why I have that in there, to give some additional flexibility.
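Recoding by the well plate map is essentially a lookup: each well number is replaced by the strain, treatment, and replicate recorded for it in the map. The Python sketch below uses the one mapping given in the talk (strain A with media 1 in wells 1, 21, 41, 61, and 81, as replicates one through five). The column names and the helper `recode_by_well_map` are hypothetical.

```python
# Well plate map for one treatment, taken from the example in the talk.
REPLICATE_WELLS = [1, 21, 41, 61, 81]
well_map = {
    well: {"Strain": "A", "Treatment 1": "Media 1", "Replicate": replicate}
    for replicate, well in enumerate(REPLICATE_WELLS, start=1)
}

def recode_by_well_map(rows, mapping):
    """Attach strain, treatment, and replicate labels to each stacked row."""
    recoded = []
    for row in rows:
        labels = mapping.get(row["Well"])
        if labels is None:
            # Failing loudly avoids silently plotting unlabeled wells.
            raise KeyError(f"Well {row['Well']} is missing from the well plate map")
        recoded.append({**row, **labels})
    return recoded

# Hypothetical stacked readings for two of the mapped wells.
stacked = [
    {"Time (h)": 0.5, "Well": 1, "OD600": 0.038},
    {"Time (h)": 0.5, "Well": 41, "OD600": 0.041},
]
labeled = recode_by_well_map(stacked, well_map)
print(labeled[1])
```

Keeping treatments as separate label columns is what gives the flexibility to add a second, third, or fourth treatment factor later.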

All right, this is done and ready to go for additional processing. I'm going to come back to my raw data file here, and we're going to calculate the area under the curve next. The area under the curve has its own set of add-ins within it. The first thing I'm going to do is I'm going to calculate a row cumulative sum.

When we calculate the area under the curve, the way we do that is we just sum up each row, just doing a cumulative sum of each row, then we're going to take the final value, and that will be our area under the curve. That just generated a whole 200 additional columns here. I'm going to highlight them, and then I'm just going to select the very bottom row because that is what I'm after. I'm after this data here. The 14.145 is the area under the curve for well one.
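A short Python sketch of this calculation, with made-up readings. The first part follows the description above: a running total of the OD values whose final entry is taken as the area under the curve. With evenly spaced readings, that total is proportional to a rectangle-rule area (multiply by the time step to get units of OD times hours). The trapezoid rule is included only as a common alternative for comparison; it is not what the add-in is described as doing.

```python
from itertools import accumulate

time_h = [0.0, 0.5, 1.0, 1.5, 2.0]   # decimal hours, evenly spaced (hypothetical)
od = [0.0, 0.1, 0.3, 0.6, 0.8]       # blank-corrected OD 600 for one well (hypothetical)

# Approach described in the talk: cumulative sum, then keep the last value.
running_total = list(accumulate(od))
auc_cumulative = running_total[-1]

def trapezoid_auc(times, values):
    """Trapezoid-rule area, shown for comparison only."""
    return sum(
        (t1 - t0) * (y0 + y1) / 2
        for t0, t1, y0, y1 in zip(times, times[1:], values, values[1:])
    )

auc_trapezoid = trapezoid_auc(time_h, od)
print(round(auc_cumulative, 3), round(auc_trapezoid, 3))  # 1.8 0.7
```

The two numbers are on different scales, so values from one method should only be compared with values from the same method, run under the same conditions and duration.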

I've got an add-in here. It'll allow me to subset this as well as do some reformatting. I'm going to close this out here. I got the well number and the area under the curve. The last thing I need to do here for this data set is I just got to recode this as well by the well map, the same thing we did before. Strain, then we've got our replicate. Then the last thing that we got to do is treatment one.

This table is done now. Both my tables are done. The next step is I want to combine the area under the curve table to the stack data table. I want to combine these two tables together. I actually have an add-in for that as well. Here we go.

The reason why I wanted to combine these two together is because I'm going to create some figures, and I want the figures to be dynamically linked with one another when I put them in a Dashboard. In order to do that, I need to have these both in the same data table.

I'm going to create two different figures now. One of them is growth curves, and I've got an add-in for that. I'm just going to go ahead and run that. This is something that I created in Graph Builder, where it's got our OD 600 on the Y axis, time on the X axis, and then you can look at the growth curves for each strain. I'm going to go ahead and minimize this. I'm also going to create an area under the curve Graph Builder here. It looks like this, where we have area under the curve on the Y axis and then the different media components on the X axis. This is a bar chart with standard deviation. Those are the error bars here, across the technical replicates.
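The bar heights and error bars in a chart like this come down to a mean and a standard deviation per media across the technical replicates. A minimal Python sketch with made-up AUC values; it assumes the sample standard deviation (n - 1 in the denominator), which is an assumption about how the error bars are computed.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical AUC values: five technical replicates per media.
auc_rows = [
    ("Media 1", 14.0), ("Media 1", 14.2), ("Media 1", 14.4),
    ("Media 1", 13.8), ("Media 1", 14.6),
    ("Media 2", 10.0), ("Media 2", 10.5), ("Media 2", 9.5),
    ("Media 2", 10.2), ("Media 2", 9.8),
]

by_media = defaultdict(list)
for media, auc in auc_rows:
    by_media[media].append(auc)

# Bar height = mean AUC; error bar = sample standard deviation of the replicates.
summary = {media: (mean(values), stdev(values)) for media, values in by_media.items()}

for media, (avg, sd) in summary.items():
    print(f"{media}: mean AUC = {avg:.2f}, SD = {sd:.2f}")
```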

Then the last thing we're going to do here is we're going to run a Dashboard. The Dashboard is set up to where I've got my growth curve figure, area under the curve figure, and then I've got a data filter here, too, which gives me the ability to interact with my data a little bit more. I'm going to look at strain 1 here. One of the really cool things about this Dashboard set-up here is that I have the ability to interact with my data now. For example, if I wanted to see which growth curve was media 14, it automatically highlights it here, which is really quite slick.

Additionally, I could sit here, I could order my area under the curve from lowest to highest, and I can even select the top three candidates here, which are these right here, which we may want to do some further optimization work with. You can see how quickly we got from our raw data to visualizing our data to being able to make a decision with it.
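The selection step, ordering by area under the curve and keeping the top three, can be sketched in a few lines of Python. The media names and mean AUC values here are invented for illustration.

```python
# Hypothetical mean AUC per media.
mean_auc = {
    "Media 1": 14.2,
    "Media 2": 10.0,
    "Media 3": 12.7,
    "Media 4": 15.1,
    "Media 5": 8.4,
}

# Order from lowest to highest, as on the bar chart.
ranked = sorted(mean_auc.items(), key=lambda item: item[1])

# The top three candidates are the last three entries, listed best first.
top_three = [media for media, _ in ranked[-3:]][::-1]

print(top_three)  # ['Media 4', 'Media 1', 'Media 3']
```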

One last thing here. I've got a little bit of time left, and I did want to show you an example of how to use the Workflow Builder to create an add-in. I'm just going to do it with my data set that I have here. I'm going to close out a couple of these figures here. I'm going to use this table here. I'm going to open a new workflow right here. You go under File, New, Workflow. The Workflow Builder captures the script for what you're actually working on in JMP; it records your script. That way you can actually take that script, and you can use it for whatever you want. In this case, we're going to set up an add-in for it.

I'm going to hit record, I'm going to minimize this. What I'm going to do is I'm just going to build another graph here. I'm going to plot OD on my Y axis, time on my X axis. I'm going to have these be curves. I'm going to put my treatments as my overlay, and then I'm just going to rename this "growth curves". Let's change the font here real quick. Make it big so you can read it. I'll change the font here and make it bold, a little bit bigger. Same thing with time. Let's make it bold, a little bit bigger. Then let's change the legend here. I'll just call it "Media". Okay, that looks good.

In order for the Workflow Builder to capture this script when you're working in Graph Builder, I realized you have to close out the graph. I'm going to go ahead and do that. I'm going to come back to the Workflow Builder, and it has captured my script here. Very convenient, especially for somebody like myself who doesn't have much scripting knowledge.

What I'm going to do, I'm just going to highlight and copy this, and then I'm going to stop this, and then I'm going to actually open... Excuse me, I'm going to open this in a different screen. I'm going to open my add-in here real quick. That way we can put this example in here. In order to open the Add-in Builder, you have to select this little downward triangle here.

This is the Add-in Builder that I've been working with. I'm just going to go under my menu items here. I was practicing this earlier. I forgot to delete it. I'm going to create a new example now. I'm just going to add it at the end here underneath Dashboard, and you can just click Command. I'm just going to paste this. Just as a simple demonstration, okay? I'm going to hit Save.

When I go back to JMP, I'm going to use this combined kinetics and AUC table. If I click on my add-ins, bioscreen data, looks like I didn't name it, but it's okay. I'll just run menu item one. This should generate the figure that I just created. Let's see if it does it. It does. There you go. Perfect. That's just an example of how to use the Workflow Builder to capture scripting. That way you can put it into an add-in. Very useful for things that you are doing routinely that become redundant, like making figures.

Okay. I'm going to finish up here. I've got a couple more slides. Hopefully you were able to get a good feel for what we were using JMP for here, the Add-in Builder. We were able to successfully simplify and reduce the time to process our bioscreen assay data. We had an 80% reduction in time. What does this mean? If you were to perform 120 bioscreen assays, the data analysis time would be reduced from one week, or 40 hours, to one day, or eight hours. Pretty significant.
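The time savings quoted here can be checked directly from the per-file times given earlier (20 minutes before, 4 minutes after):

```python
ASSAYS = 120
OLD_MINUTES_PER_FILE = 20   # manual Excel + JMP approach
NEW_MINUTES_PER_FILE = 4    # add-in based approach

old_hours = ASSAYS * OLD_MINUTES_PER_FILE / 60
new_hours = ASSAYS * NEW_MINUTES_PER_FILE / 60
reduction = (OLD_MINUTES_PER_FILE - NEW_MINUTES_PER_FILE) / OLD_MINUTES_PER_FILE

print(old_hours, new_hours, f"{reduction:.0%}")  # 40.0 8.0 80%
```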

Not every assay we run is the same. The click-based series of scripts gives us the flexibility needed to process data from lots of different experimental designs that we use. The Workflow Builder really is a game-changer. What I showed you all today really didn't take me much time to build at all. I think less than 10 hours total to make all those add-ins. Like I've said before, I have limited knowledge in scripting. This made it very easy and approachable to people like myself who are a bit of a novice in the coding realm.

For future work, we're definitely not done here. I think there certainly is an opportunity to further improve our add-ins and analysis approach. I think it would be cool to set up a fully automated approach for some of the more routine analyses we do to try and target even shorter data processing. Then we're already doing it right now. I'm already developing other add-ins for other routine data analyses beyond just the growth assays I showed you here. Taking full advantage of that saves me a ton of time in the work that I do in the laboratory and for my data processing. Really, really convenient.

Then I also plan to combine the add-ins in this work with some work that my colleague Jerry Fish is presenting on, which is using the JMP Pro Functional Data Explorer to quantify different parameters from the growth curves. Things like the growth rate, the lag time, the time to stationary phase, and the OD or cell density in stationary phase, are all very important parameters to us when we look to try to make decisions with our data.

Very excited to be teaming up with Jerry to be able to combine this work that we've done here with some of the exciting stuff that he's doing as well. More on that later. I included a link here and the title of his presentation, which he'll be presenting as well at the JMP summit. Very cool stuff. Certainly, check it out if you have the opportunity to.

Okay. That brings me to a close here. I wanted to share with you all my contact information in case you had any questions about the work that we had done here, or if you just want to connect with me and talk about microbiology or cell nutrition. I love talking about those things. This is really great to be able to present to you all today. Thank you for your attention, and hopefully, I can connect with some of you all sometime soon. All right. Thank you.

Presented At Discovery Summit 2024

JMP Add-In Builder for Automating Microbial Growth Characterization (2024-US-30MP-1719)
