Cell therapy and analytical development workflows involve complex data that must be processed and visualized to provide meaningful insights. Our team at Takeda Pharmaceuticals U.S. Inc. aims to develop a streamlined, automated solution that enables lab users to visualize and analyze cell therapy data effectively. By automating the generation of processed data tables and dynamic dashboards, our system presents key metrics (such as recovery, fold of expansion, and cell culture parameters) in an intuitive format. This solution allows researchers in Cell Therapy and Analytical Development to quickly assess experimental progress and make data-driven decisions, thereby accelerating both research and product development in the field.

Hello, everyone. I'm Maria Reyna Fernandez, a research investigator in cell therapy at Takeda. I'm excited to be here at the Discovery Summit to share how we built an automated data ecosystem using JMP Live to accelerate cell therapy development.
Across the pharma industry, we've seen an urgent push toward digitalization, automation, data integration, and real-time decision-making. Today, I will show you how our team has applied these principles in cell therapy, building a JMP Live ecosystem that transforms raw data into actionable insights and ultimately speeds up the path from lab to patient.
Before we dive in, a few disclosures. I'm an employee of Takeda, and I own company stock. The views I share today are my own and may not reflect those of Takeda. I also want to note that I have no affiliations with any vendors of the technologies I will describe today.
That said, my perspective is very much grounded in being both a user and a developer of these workflows. I will share some of the challenges we faced and worked through, and what we learned along the way.
Our cell therapy workflows generate complex data sets. Traditionally, this data was processed manually, slowing analysis and increasing the risk of error. The goal of this project was simple but ambitious: to automate the entire pipeline from raw data ingestion to visualization. By integrating relational databases with JMP Desktop and JMP Live, we created a system that delivers clean, validated, and interactive data views, allowing scientists to make decisions faster and with greater confidence. This presentation is really about showing you the journey: the messy reality of manual data, and then the streamlined and collaborative world we built with JMP.
Here's the roadmap for today's talk. I'll walk you through the challenges of manual workflows, the automated data pipeline, how we calculate and visualize key metrics, our approach to quality control, the impact on our teams, and what we see coming next.
Cell therapy and analytical development groups generate huge volumes of data. Each experiment can involve multiple devices, multiple time points, and multiple scientists. Historically, Excel was our workhorse, but Excel wasn't designed for this scale of complexity. It's a fantastic tool for many things, but when you're handling experiments that run for months, with thousands of data points and multiple teams, Excel becomes fragile.
What we really needed was a system that was automated, so scientists don't have to do the repetitive work. Traceable, so every data point can be linked back to its source. Collaborative, so people across the organization can see and use the same version of the truth. That's where JMP Live, combined with the database, became our solution.
Now, about the current challenges: let me pause here and paint a picture of what life was like before this system. Imagine a scientist running an experiment. They have data coming from multiple devices, and at the end of the day, all that information lives in different places and in different formats. What happens next? They open Excel and start copying, pasting, maybe converting units, renaming columns, manually recalculating the fold of expansion. This could take hours, and sometimes days, just to get to the point where they can start thinking about the science.
Now, add the human factor. Scientists often build their own Excel macros and formulas. Two people working on the same experiment might use slightly different logic or update different versions of the same file. There were also situations where three different "final" versions of an Excel file were circulating by email. Which one was the truth? That uncertainty is stressful and risky in cell therapy experiments.
Here's the bigger issue. Cell therapy is a field that moves fast, patients are waiting. Every delay in analysis slows the development of therapies that could make a real difference. This is not unique to Takeda. Across pharma, companies are realizing that manual, Excel-driven workflows are a bottleneck.
The industry has been undergoing a digital transformation, moving away from local files on people's computers and toward integrated, automated systems that ensure data is clean, standardized, and shareable. In many ways, our project wasn't just about solving a local pain point; it was about aligning with this bigger wave of digitalization in pharma: harnessing automation to reduce human error, building systems that are scalable across teams, and, most importantly, creating a culture where scientists can trust the data they see and focus on the science, not on Excel files.
You can think of it this way: the old approach was like trying to build a bridge by hand with mismatched tools. The new approach is like a well-organized factory where every part arrives ready to assemble and every step is tracked. That's the leap we made with this digitalization. I will show you how we did it, starting with our experimental data collection system and the work we did on device connections.
To tackle this, we built an end-to-end experimental data collection system. This system enables scientists to design experiments and record data in real time. It's structured around unit operations, like transduction, expansion, harvest, and formulation. Every experiment has a unique ID that links back to the design, ensuring traceability.
Devices are also connected. For example, cell counters are connected so that the raw data is transformed and analyzed automatically every hour. This is saving about 45 minutes per assay; if you multiply that across dozens of assays per week, that's days of saved time per month. What I really want to highlight here is the traceability: every data point is tied to an experiment ID, so we can always go back and know exactly where it came from and under what conditions.
Capturing the data is only step one. The real challenge comes after: turning raw outputs into something scientists can actually analyze and use. For this, we built a five-step pipeline: ingestion, pulling raw tables directly from our database; cleaning; merging, which combines experiment metadata with device readouts; restructuring, which shapes the tables into wide or long formats depending on the analysis we want to do; and finally validation, which applies rule-based checks to catch missing values or duplicated matches.
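To make the first two steps concrete, here is a minimal JSL sketch of ingestion and merging. The connection string, SQL, table names, and column names are placeholders for illustration, not our actual schema or production scripts.

```jsl
Names Default To Here( 1 );

// Hypothetical ODBC connection string, for illustration only
dsn = "DSN=CellTherapyDB;UID=readonly;PWD=****;";

// Step 1: ingestion - pull raw tables directly from the relational database
readouts = Open Database( dsn, "SELECT * FROM device_readouts", "Device Readouts" );
design   = Open Database( dsn, "SELECT * FROM experiment_design", "Experiment Design" );

// Step 2: cleaning and merging - attach experiment metadata to the device readouts
merged = readouts << Join(
	With( design ),
	By Matching Columns( :Experiment ID = :Experiment ID ),
	Drop Multiples( 0, 0 ),
	Include Nonmatches( 0, 0 )
);
```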
Now, what makes this possible in practice is JMP Scripting. Instead of relying on manual steps or macros hidden in Excel files, we developed reusable JMP scripts that automate each of these processes. For example, one script runs automatically to reshape data into the correct format for the dashboard, so scientists don't have to touch the raw files; another checks every new data set against predefined rules. Once the scripts are written, they can be reused across experiments.
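As one example, a reshaping step like the one just described might look roughly like this in JSL; the per-day column names are placeholders.

```jsl
// Step 3: restructuring - stack per-day readout columns into a long format
// suitable for the dashboards (placeholder column names)
long = merged << Stack(
	Columns( :Day 0 Viable Count, :Day 3 Viable Count, :Day 7 Viable Count ),
	Source Label Column( "Day" ),
	Stacked Data Column( "Viable Cell Count" )
);
```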
Rather than everyone inventing their own way of cleaning data, we now have a standardized, automated pipeline directly in JMP. Now, why is this automated cleaning process in JMP scripts so important? In the old world, every scientist had their own way of cleaning and analyzing data in Excel. That meant every scientist had different formulas, different formats, and sometimes different results for the same experiment. It was very difficult to reconcile all these formats, even for experiments that followed the same steps in the laboratory.
With JMP Scripting, we standardized the process. The scripts do the heavy lifting behind the scenes: checking values, reshaping the tables, and applying the same rules consistently every time. What we got as a result of all this effort is consistency and traceability. Scientists no longer spend hours doing manual cleanup, and they can trust that the output is analysis-ready. Accessibility is also key: scientists don't need to know how to code to benefit from these automated pipelines. Once the scripts are in place, any user can open the JMP output and immediately work with clean and validated data.
We already see this value in our teams. One of our team members put it very nicely: before, she was spending more time preparing data than analyzing it. Now she can skip straight to the science and really think about how to improve the process in the lab. That's the essence of why we wanted to automate this process: it turns data processing from a bottleneck into an enabler.
Once our data has been processed and validated through JMP scripts, the next step is to make it accessible for the organization. We use JMP scripts to query and pull these tables directly from the database into JMP. These scripts handle the connections, define which fields to extract and ensure the correct version of the data is retrieved.
Once inside JMP, the data is automatically pushed to JMP Live, creating updated, interactive tables that anyone with the right access can use without needing to run the queries themselves.
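As an illustration of that push step, a publish script in recent JMP versions can look roughly like the sketch below. The URL, API key, and titles are placeholders, and the publishing functions shown here follow the JMP 16+ style JSL interface; the exact syntax varies by version, so treat this as a sketch rather than our exact production code.

```jsl
// Hedged sketch of publishing a processed table to JMP Live.
// Placeholders throughout; check the scripting guide for your JMP version,
// since the publishing interface has changed between releases.
live    = New JMP Live( URL( "https://jmplive.example.com" ), API Key( "xxxx" ) );
content = New JMP Live Content( Data Table( "Device Readouts" ), Title( "Expansion Run Data" ) );
result  = live << Publish( content );
```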
We also introduced a hierarchy to keep things organized: folders arranged by team, by product, and by program. The key advantage is that everything is automated and organized, so scientists don't have to manually export or upload data. As soon as new results enter the database, JMP Live refreshes with updated tables, and because of the folder structure, people navigate directly to the data that matters most to them. Instead of data being scattered across folders, different systems, emails, and chats, it just lives in JMP Live: from the database, through JMP scripts, to tables that are ready to use and accessible in JMP Live.
Another great capability in JMP Live is the dynamic data tables. This is one of the most powerful aspects of the system, because they refresh automatically as new data enters the database. They include summary calculations and filters, for example by material, sample ID, experiment ID, or the day the experiment was performed. They're also formatted, so they're instantly usable for the teams' reports. This ensures consistency: the structure is identical across experiments, which makes cross-study comparisons easier for the scientists.
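A table-level filter like the ones just described can be added with a couple of lines of JSL; the table and column names below are placeholders.

```jsl
// Add interactive filters to a processed table by sample ID, experiment ID, and day
// (placeholder table and column names)
dt = Data Table( "Long Format" );
dt << Data Filter( Add Filter( Columns( :Sample ID, :Experiment ID, :Day ) ) );
```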
Something we also noticed from the scientists' feedback is improved efficiency. The scientists don't have to spend hours formatting or exporting data manually. And again, everything is traceable: every table links back to the metadata in the database. That's really nice from the data perspective.
Another piece of this project at Takeda was the automated key metrics pipeline, which uses the data tables we just discussed on the previous slide. Initially, we used a JMP add-in as a proof of concept to calculate key metrics like recovery and fold of expansion. Then we tested it across different experiments, with different unit operations and different numbers of days or conditions. We wanted to verify the accuracy and consistency of the calculations coded into the JMP scripts.
These scripts were designed to reflect the experiment design stored in our data collection system, so every calculation aligns with the actual process step. For each experiment, the script automatically calculates the key metrics for each day of the process, from transduction through expansion to harvest. As I mentioned, the metrics include recovery and fold of expansion, as well as cell culture parameters like viability and density.
Because the scripts are tied to the experiment design, the calculations adapt dynamically to the process. If an experiment has extra sampling points or different unit operations, the metrics are still calculated without manual intervention. Every calculation is automated and linked to the relational database, and results are immediately available in the JMP Live dashboard. This was great for the scientists: while they were running the experiment, the tables were updating, and they could already see the results visualized. It also saves time, since there are no manual calculations, scientists can see how the metrics evolve day by day, and the approach scales.
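To give a flavor of what those calculated columns can look like, here is a simplified JSL sketch. The column names are placeholders, and the formulas use common simplified definitions (fold of expansion as viable cells relative to the seeded count, recovery as cells coming out of a step relative to cells going in); the production scripts derive these per unit operation and per day from the experiment design.

```jsl
// Simplified metric columns (placeholder column names)
dt = Data Table( "Long Format" );

// Fold of expansion: viable cells on a given day relative to the seeded count
dt << New Column( "Fold of Expansion", Numeric, Continuous,
	Formula( :Viable Cell Count / :Seeded Cell Count )
);

// Recovery: percentage of cells recovered from a step relative to what went in
dt << New Column( "Recovery %", Numeric, Continuous,
	Formula( 100 * :Cells Recovered / :Cells Into Step )
);
```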
Once the key metrics are automatically calculated by the JMP scripts we developed, the next step is making them actionable for scientists. All of these calculated metrics feed into a JMP Live dashboard. We added visualizations like trend lines and time-course plots, so scientists can track metrics day by day throughout the cell therapy process.
Users can explore trends across the entire process, for example, seeing how cell viability changes during expansion across multiple runs, or how fold of expansion varies between different donors. This has real advantages for the scientists and from a data perspective, because it provides real-time insights: dashboards refresh automatically as new data is added to the database, so teams are always seeing the latest results. It streamlines the process very nicely for the scientists.
Also in this JMP Live dashboard, they can do cross-experiment comparisons, seeing trends and differences in one place and visualizing multiple runs together. This dashboard, in essence, closes the loop from experiment design to automated calculations to visualization, so scientists can explore their data and detect trends.
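A day-by-day trend view like this can be scripted in a few lines of JSL. The sketch below uses placeholder column names and shows one line per experiment, with a local data filter so users can isolate specific runs.

```jsl
// Time-course view: viability over time, one line per experiment,
// with a local data filter for run-by-run exploration (placeholder names)
dt = Data Table( "Long Format" );
gb = dt << Graph Builder(
	Variables( X( :Day ), Y( :Viability ), Overlay( :Experiment ID ) ),
	Elements( Points( X, Y ), Line( X, Y ) )
);
gb << Local Data Filter( Add Filter( Columns( :Experiment ID, :Sample ID ) ) );
```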
The other dashboard is for analytical development, where the data we deal with comes from cytotoxicity assays, flow cytometry, and immunophenotyping. This data can be complex and multidimensional, and our goal with this dashboard is to make sense of all that complexity quickly. With these interactive graphs, scientists can observe cytotoxic potency and immunophenotype profiles immediately. Instead of just static tables, they can see patterns, peaks, and differences across experiments.
Another very cool capability that JMP Live has is advanced filtering. Users can isolate the data they want to see by method, run ID, sample ID, parameter, or ratio. This makes it possible to ask very specific questions like, "How did donor A perform under condition X compared to donor B?" Another advantage we observed is trend visualization across experiments for the analytical development team as well: instead of looking at one run in isolation, they can compare experiments over time.
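To answer a question like that donor comparison, one way is a local data filter with where clauses in JSL; the table, column names, and values below are placeholders.

```jsl
// Isolate two donors under one condition for a side-by-side comparison
// (placeholder table, column, and value names)
dtx = Data Table( "Cytotoxicity Results" );
gb2 = dtx << Graph Builder(
	Variables( X( :Effector Target Ratio ), Y( :Percent Cytotoxicity ), Overlay( :Donor ) ),
	Elements( Points( X, Y ), Line( X, Y ) )
);
gb2 << Local Data Filter(
	Add Filter(
		Columns( :Donor, :Condition ),
		Where( :Donor == {"Donor A", "Donor B"} ),
		Where( :Condition == "Condition X" )
	)
);
```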
As I mentioned, there are multiple benefits here: the dashboard and the scripting behind it turn complex, high-dimensional data into actionable insights, helping the team and the scientists quickly answer critical questions about cell potency or phenotype behavior.
On data quality: high-quality data is essential for making trustworthy decisions in both cell therapy and analytical development. It's not enough to just process and visualize the data; we also want to make sure it's correct. The same tables that scientists use to explore results also include quality checks, which validate key metrics against reference values or device outputs.
For example, we have a data quality check for the sample ID. We want to make sure the scientist typed the sample ID correctly on the device, because that sample ID also goes into the database table. We also check viability, making sure it's within 2% of the cell counter's average. And the live cell count is compared against the output volumes to detect inconsistencies.
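As a sketch of what such rule-based checks can look like in JSL, the formula below flags rows that fail simplified versions of these rules. The column names, the sample ID prefix, and the per-sample averaging are placeholders, not our exact validation logic.

```jsl
// Simplified QC flag: missing viability, viability far from the per-sample
// counter average, or a sample ID that doesn't match the expected prefix
// (placeholder names and rules)
dt = Data Table( "Merged Readouts" );
dt << New Column( "QC Flag", Character,
	Formula(
		If(
			Is Missing( :Viability ), "Missing viability",
			Abs( :Viability - Col Mean( :Viability, :Sample ID ) ) > 2,
				"Viability more than 2% from counter average",
			!Starts With( :Sample ID, "EXP" ), "Check sample ID format",
			""
		)
	)
);
```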
Another thing we really like about these data quality checkpoints is that scientists can immediately see potential typos, mismatches, or missing data. They can identify exactly where corrections are needed and either request a change from the data team or, if it's something they can fix on their end, make the change themselves.
We want to make sure that all the calculations are made correctly and that the script is pulling values from the correct sample ID. Including this in the JMP Live data flow is great because of the transparency: scientists can see where a discrepancy exists instead of just being told there's a problem.
There's also the speed of correction: visible errors can be corrected at the source very quickly, preventing downstream mistakes. And when you have a table with many sample IDs, say 30 of them, it takes much more time and effort to find the small typo that was entered in the lab.
Even a very small typo, just one letter, means the script won't return the most accurate number. With this data quality script, we try to identify and correct all of this before drawing conclusions from the dashboard. In essence, our goal was to make these quality control scripts part of the scientist's workflow, not a separate step. By integrating the QC checks into the JMP workflow, we make it easier to catch, understand, and correct errors while still providing real-time access to clean, actionable data. It's also a culture change: we're relying more on data tables, new systems, and new devices, and it's very important to help scientists navigate this transition to new ways of working.
To wrap up: by integrating JMP Live with our relational database, we created an automated, traceable, scalable data pipeline. It has reduced manual effort, enabled real-time insights, and accelerated both research and product development. Looking ahead, we are exploring predictive analytics and machine learning to make the system even smarter, moving from descriptive dashboards toward predictive decision support.
The ultimate vision is a future where data doesn't only inform scientists but also anticipates their needs and guides them toward the best next steps in their experiments. What's happening here is not just an internal efficiency project; it's part of a larger movement across pharma, embracing digital platforms, automation, and data science to accelerate therapies to patients.
I want to say thank you to my colleagues at Takeda who partnered with me on this journey, thank you to the JMP team for providing the tools that made this possible, and thank you all here at Discovery Summit for your attention and curiosity. It's been an honor to share our story of how automation and digitalization are transforming the way we do science. Thank you.