Automating Statistical Comparison Analyses with a JMP Add-In (2025-US-PO-2477)

As a statistician working at a CDMO, I frequently conduct comparability studies on diverse data from various departments. These studies involve comparing methods, scales, sites, years, and more. Our general approach includes visual comparisons, variance tests, t-tests, and TOSTs. However, analyzing numerous variables and data sets can be time-consuming, particularly as TOSTs require manually entering practical equivalence values for each variable, which significantly adds to the workload.

To streamline this process, we've developed a comparability script/add-in that automates the analysis and adapts to different data sets. Users can input parameters to select study variables, define test and reference groups, and specify practical equivalence values for TOSTs, whether as a multiplier of standard deviation or a specific value. The script generates comparison visuals automatically, conducts variance tests to guide appropriate t-tests, and performs TOSTs. It produces a JMP report containing the selected data, a summary table of test results, and analytical output. This automation and standardization have significantly reduced the time required for comparability studies across various projects.


Hi, my name is Abby Collins, and I am a statistician at FUJIFILM Biotechnologies. Today, I'm going to be presenting on the comparison analysis add-in that I've built for us to use for statistical analysis. Some motivation for this add-in: here at FUJIFILM, we are a contract drug development and manufacturing organization. Part of the development side of our work includes doing a scale-down model analysis, or small-scale model analysis. That analysis determines equivalency between our large manufacturing-scale data and a small scale so that we can further develop and study the process and critical quality attributes at the large scale.

One way that we determine equivalency between those two scales is by statistical analysis. The purpose of this add-in and the motivation behind it come from that analysis. This add-in allows us to do all the statistical analysis that is done for the scale-down model.

Moving on to the background. You can see here I have an image of a 2,000-liter bioreactor; that's where the commercial manufacturing large-scale data comes from. You can see how much bigger it is in proportion to this tiny little 250-milliliter bioreactor, which is what we would use to develop our processes in the lab. So we need to determine equivalency between these two scales.

In order to do that statistically, we do have a requirement of at least three large-scale data points to run the analysis. We are still early in the development stages when we do this analysis, so we often don't have large data sets, but we do have to have at least three points. We always recommend reevaluating and rerunning this add-in as more data becomes available and we move along the process of development and manufacturing.

Moving on to the actual add-in. In the data structure image here, I have a screenshot of a data set that would work well with this add-in. The only requirements for running the add-in are a run identification or row identification column, which you can see highlighted in yellow, and a group identification column. Your data needs to be grouped by those two levels, but other than that, the add-in is robust to any column names and group level names. Your run IDs, group names, and column names don't matter; the script is robust to all of that. It's just required that your data is stacked properly and that you have at least those two columns of information in addition to your actual data.
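To make that layout concrete, here is a minimal JSL sketch of a table in the stacked format the add-in expects. The names and values are hypothetical, not the study data:

    // Hypothetical stacked data set: one row per run, with a run ID column,
    // a group ID column, and one or more numeric parameter columns.
    dt = New Table( "Example Data",
        Add Rows( 5 ),
        New Column( "Run Number", Character, "Nominal",
            Set Values( {"R1", "R2", "R3", "R4", "R5"} ) ),
        New Column( "Scale", Character, "Nominal",
            Set Values( {"Large", "Large", "Large", "Small", "Small"} ) ),
        New Column( "Column 3", Numeric, "Continuous",
            Set Values( [1.20, 1.31, 1.25, 1.28, 1.22] ) )
    );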

Moving on to running the add-in. The first prompted window you'll get is this SSM equivalence testing window. This is just a place for you to describe how your data looks so that the script can recognize and analyze it. For your row or run identification column, in this example, that would be Run Number. What column contains your run type or grouping variable? That would be Scale here. It then asks for your reference group within that grouping variable, which is going to be our large-scale data, and for the test group we want to compare against it, which is our scale-down, small-scale data, so that would be Small. Finally, it asks for the column that contains the first parameter for equivalence testing, that is, your first numeric column, which here is Column 3.
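For illustration, a prompt like this can be built with a modal New Window in JSL. The sketch below is an assumption about the general pattern, not the add-in's actual code; the box names and default values are hypothetical:

    // A minimal modal input window in the spirit of the one described.
    result = New Window( "SSM Equivalence Testing",
        << Modal,
        << Return Result,          // return the box values when OK is clicked
        Text Box( "Run/row identification column:" ),
        runBox = Text Edit Box( "Run Number" ),
        Text Box( "Grouping (run type) column:" ),
        grpBox = Text Edit Box( "Scale" ),
        Text Box( "Reference group level:" ),
        refBox = Text Edit Box( "Large" ),
        Text Box( "Test group level:" ),
        tstBox = Text Edit Box( "Small" ),
        Text Box( "First numeric (parameter) column:" ),
        parBox = Text Edit Box( "Column 3" ),
        H List Box( Button Box( "OK" ), Button Box( "Cancel" ) )
    );

With << Return Result, the modal window returns an associative array keyed by the box variable names, which a script can then use to set up the analysis.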

Moving on to the next prompt. The next window that pops up asks you to identify which numeric columns in your data set you would like to run the analysis on. It lists all numeric columns in your data set, and you can select anywhere from one to all of them. This is nice because the add-in will produce a subset specific to just those selected columns, so if you start off with a really big data set, you don't have to worry about subsetting it down first. You'll be able to do that here.
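A minimal sketch of how that step could work with standard data table messages; the keep list and output table name here are hypothetical:

    // List every numeric column so the user can pick which ones to analyze.
    dt = Current Data Table();
    numCols = dt << Get Column Names( Numeric, String );
    // Suppose the user kept only these; the ID columns ride along.
    keep = {"Run Number", "Scale", "Parameter 1", "Parameter 2"};
    dtSubset = dt << Subset(
        Selected Rows( 0 ),              // use all rows, not just selected ones
        Columns( keep ),
        Output Table( "DT Subset" )
    );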

Actually running the add-in: what is the add-in doing in the background? This little flowchart shows it. The top box here, after the user input section, shows that we get summary statistics by scale for your data. I won't talk too much about that; I'll show it when we get to the output. I'll also talk more about the bottom box, the Six Sigma scatter plot, when we reach the output. But the box in the middle of the flowchart is what I'd like to talk through here.

In the background, the add-in is testing for equal variance between your two groups. It does that by looking at the Brown-Forsythe test, the Levene test, and an F-test. If two out of three of those tests have P-values greater than 0.05, the add-in moves forward assuming that you have equal variance between groups, and it therefore produces a Pooled t-test, Welch's test, and a TOST (two one-sided t-tests) with pooled variance. Likewise, if you fail two out of three of those tests, meaning P-values less than 0.05, the add-in moves forward assuming unequal variance between your two groups, and it produces the results for a standard (unpooled) t-test, Welch's test, and a TOST assuming unequal variance.
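Here is a minimal JSL sketch of that decision rule. Extracting the three p-values from the Oneway report is version-dependent, so placeholder values stand in for them, and the column names are hypothetical:

    // Hypothetical p-values from the Brown-Forsythe, Levene, and F tests,
    // as they would be read out of the Oneway report.
    varPvals = {0.42, 0.61, 0.03};
    nPass = 0;
    For( i = 1, i <= N Items( varPvals ), i++,
        If( varPvals[i] > 0.05, nPass++ )
    );
    equalVar = nPass >= 2;   // pass at least 2 of 3 tests -> assume equal variance
    ow = Oneway(
        Y( :Column 3 ),
        X( :Scale ),
        UnEqual Variances( 1 )             // variance tests plus Welch's test
    );
    If( equalVar,
        ow << Means/Anova/Pooled t( 1 ),   // pooled t-test when variances look equal
        ow << t Test( 1 )                  // unpooled (standard) t-test otherwise
    );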

Now, if you've ever run a TOST in JMP, you know that you need to have your practical equivalence decided and put that into the equivalence testing platform. However, for this add-in, it's automatically producing TOST results, and it's using a practical equivalence of three times the standard deviation of your reference group. For us, that's going to be three times the standard deviation of our large-scale data for each parameter that's included in the analysis.
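A sketch of how that default could be computed and passed along, assuming hypothetical column and group names; note that the exact signature of the Equivalence Test message can vary by JMP version:

    // Compute the reference-group standard deviation for a parameter,
    // then run a TOST with a practical difference of 3 x that SD.
    dt = Current Data Table();
    Summarize( dt, grp = by( :Scale ), sd = Std Dev( :Column 3 ) );
    refSD = sd[Loc( grp, "Large" )[1]];    // SD of the large-scale (reference) group
    ow = Oneway( Y( :Column 3 ), X( :Scale ) );
    ow << Equivalence Test( 3 * refSD );   // practical equivalence = 3 x reference SD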

Where we're at in early development, we don't usually have a strong, scientifically driven practical equivalence to use yet; sometimes we do, but in most cases we don't. So three times the SD is a good starting point for us, and that's what this add-in uses to automatically produce TOST results. After that, the add-in makes a summary table of all those test results, along with all the actual JMP output that goes with those tests.

Now, I will go ahead and demo the add-in and talk about the results.

Here is the data table that you saw in the poster. I'm going to go to my add-ins, hit Comparison Analysis, and fill in this first window. What is my run ID column? That's going to be Run Number. What is my run type group? That is Scale. My reference group is my large-scale data, and my test group is small-scale. The first column that I'd like to test is Column 3. The text entry is case-sensitive, so you do need to match the capitalization used in the data set. I'm going to hit OK. Maybe I don't want to include Parameters 5 and 6, so I'm going to hold down Control and select everything other than those, then hit OK.

What's nice is that we now have a JMP report of all the output for this comparison analysis. Up top, we have a subset of the data that I selected for this analysis; every parameter is there except Parameters 5 and 6, which are missing because I didn't select them. After that, we have the summary statistics for the selected parameters by scale, so we have our large- and small-scale data. The current default for the add-in is to include the min, max, mean, standard deviation, and sample size per group. After that, you get a nice summary table of all the comparison results produced by this add-in. In the first three columns here, we have test group, fail count, and percent pass. These three columns of information go with the Six Sigma scatter plot that you can see here.

Looking at this plot, the green area is the Six Sigma region. If you're in the industry, you may be familiar with that term; it's just three standard deviations above and below the mean of our large-scale data, our reference group, and that creates this green-shaded area. When I scroll down, you can see a red line; that's the average of your small-scale data, your test group. We have a blue line for the average of our large-scale data, the individual data points plotted, and an overall average plotted as a black line as well. Test group is the number of data points in your test group; that's my small-scale data. The fail count is the number of small-scale data points that fall outside your Six Sigma range, and the percent pass is therefore the proportion of data points within that range. Since we have 7 out of 8 inside this green-shaded area, our percent pass is 88% based on this scatter plot.
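The bookkeeping behind those three columns is straightforward. Here is a self-contained JSL sketch with hypothetical values, where 7 of 8 test points land inside the band as in the example:

    // Hypothetical reference (large-scale) and test (small-scale) values.
    ref  = [1.20, 1.31, 1.25];
    test = [1.18, 1.22, 1.30, 1.27, 1.24, 1.21, 1.29, 2.10];
    refMean = Mean( ref );
    refSD   = Std Dev( ref );
    lower = refMean - 3 * refSD;           // bottom of the Six Sigma band
    upper = refMean + 3 * refSD;           // top of the Six Sigma band
    failCount = Sum( (test < lower) | (test > upper) );  // points outside the band
    pctPass = 100 * (1 - failCount / N Rows( test ));    // here 7/8 = 87.5%, ~88%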

Moving on to the next column of information: our test assumption. This is where the add-in is running those equal variance tests in the background. It says equal variance if we pass two of the three equal variance tests, and if we fail, you'll see it says unequal variance. From there, we have the appropriate t-test P-value that goes with that test assumption: if we have equal variance, we report the P-value that corresponds to a Pooled t-test, and if we have unequal variance, we report the P-value from the standard t-test. We then turn that into a pass/fail column, where a pass is a P-value greater than 0.05, meaning we found no significant difference in means between the two groups, and a P-value less than 0.05 is a fail, because there would be a significant difference in means between those two groups. Similarly, we have the P-value for Welch's test, with a pass/fail based on that P-value as well.

After that, we have our TOST assessment results. We have equivalent and not equivalent; those are the only options produced by default in JMP. If I scroll down past the summary results, we can see that all of the Fit Y by X output behind those test results is still in this journal. Here we have our standard plot with the mean diamonds and the Tukey-Kramer circles plotted as well. We have the t-test information, the variance test information, and Welch's test. If we scroll down further, we have our equivalence test, or TOST, results here too: the confidence interval, the blue-shaded equivalence region, and the red fail region. All that information, for all the parameters included, is within this report as well.

By default, the add-in keeps a few windows open. If I minimize this, you can see the individual Fit Y by X platform window is open, along with the individual platform where these scatter plots were created, so that you can save these to the data table. This DT Subset is also produced, which is the subset of your data with just the parameters you selected. I chose to leave these open so that you can save everything to the data table; then, if you want to come back or your data updates, you won't need to rerun the add-in. You'll have it all saved right here. I'll minimize these and go back to the presentation.

One other thing I wanted to talk about for our specific output: we create this final results table. You can see here, all I did was right-click on the summary results and make that into a combined data table. Then we add a statistical conclusion and a scientific conclusion. We determine the statistical conclusion based on these statistical results: it's green if we have a 100% pass on all of the tests chosen to be included in this analysis.

Now, if any test fails statistically, we turn that red to say we're failing somewhere. The only time we would see yellow is if we pass everything except for an inconclusive TOST result. We define an inconclusive TOST as having your confidence interval overlap both the equivalent and non-equivalent regions, like in this example here. Because we're not 100% in the pass region or 100% in the fail region, we chose to define that as an inconclusive result, and that's where it would be yellow. Otherwise, if we have an inconclusive TOST but also fail anything else, it would be red.
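That three-way call depends only on where the TOST confidence interval sits relative to the practical equivalence bounds. Here is a small sketch with hypothetical values:

    // Classify a TOST result from its confidence interval [ciLow, ciHigh]
    // and the practical equivalence region [eqLow, eqHigh].
    classifyTOST = Function( {ciLow, ciHigh, eqLow, eqHigh},
        If(
            ciLow >= eqLow & ciHigh <= eqHigh, "Equivalent",     // CI entirely inside
            ciHigh < eqLow | ciLow > eqHigh,   "Not Equivalent", // CI entirely outside
            "Inconclusive"                                       // CI straddles a bound
        )
    );
    Show( classifyTOST( -0.4, 1.3, -1, 1 ) );   // -> "Inconclusive"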

You can see we have a red in this example because we don't have a 100% pass based on the scatter plot information, and we have an inconclusive TOST; that's why this one's red. After that, we send this over to our scientists and review all the output with them. They take their scientific knowledge about the attribute or process being studied, and they combine the statistics with that knowledge to make the ultimate conclusion of equivalency for our scale-down model analysis.

Some future work and where we're at with the add-in right now. What I'm currently working on is allowing users to input a practical equivalence for the TOST. As I mentioned, the current default is to use three times the standard deviation of your reference group data as your practical equivalence region. However, if you would like to use a different multiplier of the standard deviation, or if you have a true practical equivalence that you would like to enter, I'm working on making that an option you can set right at the beginning of the add-in, instead of having to go through and run the TOST manually for each parameter. That option is in the works right now.

The other thing I would like to add is recoding the summary table so that the inconclusive TOST result shows up right in our summary table, instead of the current default of equivalent and not equivalent, which we then have to go in and change. That's also in the works right now. I would also like it to be more customizable for the scientists who use the add-in: if there are other tests they want to include, or they want to select a subset or have different visuals, I'd like to be able to take that feedback and incorporate it as well. There's always room for improvement in making the output nice and pretty; that's something that's also always in the works.

In conclusion, this add-in has been really useful for us by reducing the amount of time it takes to do this statistical analysis. If you've ever done a TOST in JMP, you know that you have to manually go in and enter your practical equivalence, select your variance type, and decide whether you want a control group, and you need to do that for each parameter in the analysis.

Having an automatic TOST, automatically produced summary tables, and all these nice plots in one click has really been great for reducing time and improving the efficiency of this analysis. It's also been a great tool for new JMP users and for scientists who want to do statistical analysis on their own. They might not be super strong JMP users, but they get trained on how to use the add-in and interpret the output, and it's very easy to use by just clicking the button. So it's been really great for us, and hopefully we'll get these improvements in place, spread awareness of statistics, and create more add-ins. Thank you.

Skill level

Intermediate





Start:
Thu, Oct 23, 2025 04:00 PM EDT
End:
Thu, Oct 23, 2025 04:45 PM EDT
Ped 06