Simulating Sterility Breaches from Non-Parametric Data - (2023-US-PO-1400)

Sterility breaches of pre-filled syringes of a drug product are not directly measured but are known to be a function of syringe dimensions, plunger movement and fill weight. Fill weight is dynamically controlled, so the JMP Distribution platform was used to fit a non-parametric kernel density to real-world data. JSL was used to simulate 10 million iterations from the non-parametric fit, along with plunger movement simulations based on dimension specifications and measured frictional forces. Process time for the simulations was reduced more than three-fold by using invisible tables, simplifying the output and eliminating saved formulas.



My name is Briana Russo, and I'm a senior statistician at the Center for Mathematical Sciences at Merck. Today I'll be going over simulating sterility breaches with non-parametric data.

At Merck, we often deliver our liquid formulated drugs in prefilled syringes. A group at Merck that specializes in that came to me asking if I could simulate whether there is any risk of sterility breaches in them, depending on historical data and some different scenarios they wanted to look at.

There were two interesting parts of this that I wanted to go over today in my poster and discuss a little further. The first was some of the historical data; specifically, the fill weight was non-normally distributed. When filling the syringes, it's not necessarily processing to a target. It's able to move within a range and even drift outside of that range for a bit before being corrected. That often results in some heavy tailing of the data, which you can see in the bottom left here. That's an example of that. We wanted to make sure that we were capturing that heavy tailing, because obviously that's where the highest risk is going to be.

The other interesting part, which goes specifically into some JSL scripting, is that I was dealing with a large number of iterations asked for by the customer. They were looking for 10 million per scenario, because that's the order of magnitude at which they were expecting to produce the syringes. During the project, I was able to discover some techniques to reduce the processing load on JMP, which significantly reduced the process time when I was running the simulations and prevented any crashing from memory issues. I'll touch on both of those things.

But first, I wanted to go into a little more background on the prefilled syringes and what we were looking at. As I mentioned, we have the fill weight data. That's the amount of liquid that's filled into the syringe. That, again, I wanted to look at non-parametrically using a density function. I found that that was very easy to do in JSL. I'll show how I did that.

Then the other aspect was the plunger insertion depth. How deep is the plunger being inserted, and how close is that to the liquid fill? Then the dimensions of the prefilled syringe. There is some variability from the manufacturer, and I wanted to make sure that was being captured.

There were two key outputs, and they were a yes or no output for each. The first was, we wanted to make sure that we were maintaining a gap between the liquid fill and the plunger. Because if we don't, then we're going to be getting liquid up on the plunger, and that could be a sterility risk. We wanted to make sure that the air gap length was always greater than zero. The other one was that we also don't want that air gap to be too big, because when we're shipping the syringes, say, on an airplane, they might be exposed to lower atmospheric pressures, which can cause the plunger to move up. If it moved up too much, it could go beyond a sterile barrier that was created when the plunger was inserted. So we don't want the air gap to be too small, and we don't want it to be too big.

But there's a lot that goes into the plunger movement: not only the air gap, which is a function of the dimensions of the plunger, how deep the plunger was inserted and how close it is to the fill, but also the different atmospheric pressures and the cross-sectional area, so the dimensions of the syringe. There are a lot of different inputs and potentially different sources of variability in that plunger movement. I wanted to be able to simulate all of those.
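The talk does not give the plunger movement model itself. The sketch below is only an illustration of how the listed inputs (air gap, pressure, cross-sectional area, friction) could combine, written in Python rather than JSL. It assumes the trapped air behaves as an ideal gas at constant temperature (Boyle's law) in a barrel of constant cross-section; the function name, parameter names and units are all hypothetical.

```python
def plunger_movement_mm(air_gap_mm, p_fill_kpa, p_transit_kpa,
                        friction_n=0.0, barrel_area_mm2=1.0):
    """Estimate upward plunger travel after a drop in outside pressure.

    Assumed model (not from the talk): isothermal ideal gas, constant
    barrel cross-section, and a single friction force opposing motion.
    """
    # Friction lets the trapped air settle at a pressure slightly above
    # the outside pressure. N / mm^2 is MPa, so multiply by 1000 for kPa.
    p_hold_kpa = p_transit_kpa + 1000.0 * friction_n / barrel_area_mm2
    if p_hold_kpa >= p_fill_kpa:
        return 0.0  # pressure difference too small to overcome friction
    # Boyle's law with constant area: p_fill * gap = p_hold * new_gap
    new_gap_mm = air_gap_mm * p_fill_kpa / p_hold_kpa
    return new_gap_mm - air_gap_mm


# Hypothetical example: 5 mm air gap, sealed at 100 kPa, shipped at 80 kPa.
movement = plunger_movement_mm(5.0, 100.0, 80.0)
```

In a simulation, each argument would be drawn from its own distribution per iteration, and the resulting movement compared against the distance to the sterile barrier.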

That meant that I knew that the data table in JMP that I wanted to simulate into was going to be very big. The first change that I was able to make to make these simulations a lot more efficient was actually just opening up the historical data, the data table I was going to use, as invisible. This meant JMP didn't have to render the table, this potentially massive table I was going to create, which really reduced process time and also prevented JMP from crashing, as it did at times when it said the memory of my laptop was exceeded.

Once I opened up the historical data as invisible, I would then add enough rows to that, just blank rows, to get me to 10 million, because obviously my historical data wasn't that big. But I wanted to make sure that the data table had 10 million rows, so then I could go ahead and simulate 10 million iterations.

Specifically, what I did for the non-parametric aspect of the data was I fit the data in the Distribution platform in JMP, and then I was able to just very easily use the Fit Smooth Curve function to save simulations from that non-parametric fit to 10 million iterations. It's a super simple and easy way to get simulated values from what is essentially a kernel density function.
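To make the idea of simulating from a kernel density concrete, here is a minimal Python sketch (not the JSL used in the project). Drawing from a Gaussian kernel density estimate amounts to picking an observed value at random and adding normal noise whose standard deviation is the bandwidth. The normal reference bandwidth rule and the sample fill weights are assumptions for illustration; JMP's Fit Smooth Curve may choose its bandwidth differently.

```python
import random
import statistics


def normal_reference_bandwidth(data):
    """Rule-of-thumb bandwidth 1.06 * s * n^(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)


def sample_kde(data, n_draws, bandwidth=None, seed=None):
    """Draw n_draws values from a Gaussian KDE fitted to data."""
    rng = random.Random(seed)
    h = normal_reference_bandwidth(data) if bandwidth is None else bandwidth
    # Each draw: resample one observation, then jitter it by the kernel.
    return [rng.choice(data) + rng.gauss(0.0, h) for _ in range(n_draws)]


# Hypothetical fill weights (grams) with a heavy right tail.
fill_weights = [1.00, 1.01, 0.99, 1.02, 1.00, 0.98, 1.01, 1.08, 1.12, 1.00]
draws = sample_kde(fill_weights, 10_000, seed=42)
```

Because every draw starts from an observed value, tail observations such as the 1.08 and 1.12 above stay represented in the simulated data, which is the point of avoiding a normal fit here.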

The other two things really improved my simulation. As I mentioned, there were a lot of different calculations that I was doing within a data table, and different scenarios: over 20 different plunger depth targets, for example, that we wanted to look at. As part of my JSL script, I wanted to be looping over different scenarios. But if I was just going to create a column that then referenced previous columns in a loop, that could cause reference issues for each iteration of the loop, because I would end up with essentially all of the new columns having the same formula, since they'd all just end up referencing whatever the last iteration of the loop was. To prevent that, if I wanted to use a formula for the column, I would then need to delete the formula. Again, very inefficient.

One very simple and easy way that I could get around this was, instead of saving a formula for a new column, to just use Set Each Value. This means that JMP didn't need to save the formula at all. It eliminated that issue with the looping reference and then also, again, reduced process time.
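The looping reference problem has a close analog in Python, sketched here as an illustration only (it is not the JSL from the project, and all names and numbers are hypothetical). A stored rule that looks up the loop variable later sees only its final value, whereas computing the values immediately captures the value for that iteration and leaves nothing to re-evaluate.

```python
plunger_positions = [45.0, 46.0, 47.0]  # hypothetical scenario targets (mm)
fill_heights = [40.0, 41.5, 39.0]       # hypothetical simulated fills (mm)

# Formula-style: store a rule per scenario that reads `position` later.
formula_columns = {}
for position in plunger_positions:
    formula_columns[position] = lambda fill: position - fill

# Value-style: evaluate once per scenario and keep only the numbers.
value_columns = {}
for position in plunger_positions:
    value_columns[position] = [position - fill for fill in fill_heights]

# Every stored rule now uses the last position (47.0), not its own.
late_bound = formula_columns[45.0](40.0)
# The stored values were computed with the right position (45.0).
eager = value_columns[45.0][0]
```

Here `late_bound` reflects the last scenario while `eager` reflects the intended one, which mirrors why storing values instead of formulas sidesteps the issue.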

The final improvement came from really working with my customer in this case and figuring out what exactly they needed, which let me streamline things a lot. Initially, I was just giving them the kitchen sink: distributions and histograms of every single parameter and output, which they thought was interesting but was not really worth the effort or the process time.

What they really just wanted was the percent failure rate for these two outputs. I was able to make delivering that a lot more efficient by eliminating the need to open up, say, a Distribution platform and try to fit 10 million rows. Instead, for any sterility breach, I just created a column where, if a sterility breach occurred, it was a one, and if it didn't, it was a zero. Then it was very easy to just calculate the column mean to give the percentage of failure for any scenario and directly output that to a journal.
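The indicator-column approach works because the mean of a column of ones and zeros is exactly the fraction of ones. A minimal Python sketch of that calculation follows; the helper name and the air gap values are hypothetical, and the breach rule (air gap not greater than zero) follows the first output described earlier.

```python
def failure_rate(values, is_breach):
    """Return the fraction of values flagged as a breach."""
    # 1 if the breach condition holds for a row, otherwise 0.
    flags = [1 if is_breach(v) else 0 for v in values]
    # The mean of a 0/1 column is the proportion of failures.
    return sum(flags) / len(flags)


# Hypothetical simulated air gaps (mm); a breach is a gap of zero or less.
air_gaps_mm = [1.2, 0.4, -0.1, 2.0, 0.0]
rate_no_gap = failure_rate(air_gaps_mm, lambda gap: gap <= 0)
percent_no_gap = 100 * rate_no_gap
```

Only a single number per scenario and output has to be reported, rather than a fitted distribution over all rows.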

That way, the journal also didn't have to be massive, as it would have been if it were saving so much information from the data table in order to create graphs from it.

Overall, initially in this project, I was able to deliver it, but by using the platform outputs, visible tables, and saved formulas, it was taking at least three hours. Often, I was letting it run overnight, so I don't know the exact timing, but at least three hours. By simplifying the output alone, so going directly to the journal instead of saving from, say, the Distribution platform in JMP, I was able to get this down to an hour and 49 minutes. Then just those two simple changes, making sure that the data table was invisible and saving values instead of saving the formula, got me down to 52 minutes despite the volume of calculations that needed to be made.
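As a quick check of the speedup implied by these timings, the arithmetic can be written out in Python. Because the starting point is only known to be "at least three hours", the ratios below are lower bounds; the variable names are illustrative.

```python
baseline_min = 3 * 60             # at least three hours, so a lower bound
simplified_output_min = 60 + 49   # one hour and 49 minutes
final_min = 52                    # with invisible table and saved values

# Speedup from simplifying the output alone (lower bound).
speedup_output = baseline_min / simplified_output_min
# Further speedup from the invisible table and saving values.
speedup_table_values = simplified_output_min / final_min
# Overall speedup relative to the original run (lower bound).
speedup_total = baseline_min / final_min
```

The overall ratio works out to roughly 3.5, consistent with a reduction of more than three-fold.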

Overall, it can be very simple and easy to simulate non-parametric data within JMP using these data tables and the Fit Smooth Curve function. Then also, if you are simulating really big data sets in JMP, if you are simplifying the output, and if you're making sure that JMP isn't rendering things it doesn't need to or calculating and saving things it doesn't need to, it can actually be very efficient in creating the simulations and giving you the outputs. In this particular case, using those techniques, I was able to reduce my simulation time more than three-fold. That's all I have. Thanks for listening.