JMP Pro 17 Remedies for Practical Struggles with Mixture Experiments (2022-US-45MP-1140)

Andrew Karl, Senior Management Consultant, Adsurgo LLC
James Wisnowski, Principal Consultant, Adsurgo LLC
Heath Rushing, Principal Consultant, Adsurgo LLC

 

In mixture experiments, the factors are constrained to sum to a constant. Whether measured as a proportion of total volume or as a molar ratio, increasing the amount of one factor necessarily leads to a decrease in the total amount of the other factors. Sometimes also considering unconstrained process factors, these experiments require modifications of the typical design and analysis methods. Power is no longer a useful metric to compare designs, and analyzing results is far more challenging. Framed in the setting of lipid nanoparticle formulation optimization for in vivo gene therapy, we use a modular JSL simulation to explore combinations of design and analysis options in JMP Pro, highlighting the ease and performance of the new SVEM options in Generalized Regression. Focusing on the quality of the candidate optima (measured as a percentage of the maximum of the generating function) in addition to the prediction variance at these locations, we quantify the marginal impact of common choices facing scientists, including run size, space-filling vs optimal designs, the inclusion of replicates, and analysis approaches (full model, backward AICc, SVEM-FS, SVEM-Lasso, and a neural network used in the SVEM framework).

 

 

Hello, everyone, and welcome to JMP Discovery. I'm here with Andrew Karl and Heath Rushing.

I  have  a  presentation  today that's  going  to  highlight

some  of  the  features  in  JMP  17   Pro that  will  help  you  out

in  the  world  of  mixture  models.

Many  of  you   are  involved  in  formulations,

and  to  be  honest,   that's  what  we've  been  doing  a  lot  lately.

We're  a  lot  like  ambulance  chasers, where  we'll  just  go  after

the  latest  thing    that customers  are  interested  in.

But that's what we're seeing a lot lately: folks doing mixture experiments that are actually quite complex, more so than we'd ever expected.

We decided we would do some deeper investigation with some of the new techniques that are out and that JMP 17 now performs for us.

Andrew,  would  you  like  to  get  started and  maybe  give  a  little  bit  of  background

on  the  whole  idea   of  what  a  mixture  model  is,

and  some  of   the  other  techniques?

Okay,  so let's  start  out with  a  nice,  easy  graph.

Let's  take  a  look  over  here at  the  plot  on  the  left.

We're  in  an  experimental  setting,

so now  I  suppose   we've  got  two  factors,

Ingredient  A  and  Ingredient  B, and  they  range  from   0-1.

If  there's  no  mixture  constraints, then  everything  in  the  black  square

is  a  feasible  point in  this  factor  space,

and so  our  design  routine  is  going to  give  us  points  somewhere  in  this  space.

However, if  there's  a  mixture  constraint

where  these  have  to  add  up  to  one, then  only  the  red  line  is  feasible.

We  want  to  get   a  conditional  optimum

given  that  constraint, and  we  want  to  end  up

somewhere  in  that  line   for  both  our  design  and  our  analysis.

If  we  move  up   to  three  mixture  ingredients,

A,  B  and  C,   all  able  to  vary  from   0-1,

then  we  get  a  cube  for  that   0-1  constraint  for  each  of  them.

But  with  the  mixture  constraints, that  takes  the  form  of  a  plane

intersecting  a  cube, and  that  gives  us  this  triangle,

so  only  this  red  triangle is  relevant  out  of  that  entire  space.

If  we  have  four  dimensions, if  we  have  four  mixture  factors,

then  that  allowable  factor  space is  actually  a  three- dimensional  subset,

a  pyramid  within  there.

Looking back to the three-mixture setting.

See  this  triangle? That's  the  allowed  region.

Well,  that's  why JMP  gives  us  these  ternary  plots.

For these ternary plots, if you have more than three mixture factors, JMP will show two factors at a time, and the third axis will be the sum of all the other mixture factors.

We can look at these ternary plots rather than having to look throughout a pyramid.
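As a minimal Python sketch of that ternary-plot idea (not JSL, and the four-component blend values below are hypothetical): two components are shown directly and the remaining components are lumped into the third axis, so each plotted point still sums to one.

```python
# Sketch of the ternary-plot coordinates described above for a mixture with
# more than three components. The blend values are hypothetical.
blend = {"ionizable": 0.45, "helper": 0.10, "cholesterol": 0.42, "PEG": 0.03}

x, y = blend["ionizable"], blend["helper"]
others = 1.0 - x - y            # cholesterol + PEG collapsed onto one axis
print(x, y, others, x + y + others)   # the three ternary coordinates sum to 1
```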

We  have  to  decide, do we  want  a   Space Filling Design

or  an  optimal  design?

Now, in a non-mixture setting, we'd normally use an optimal design, and for the most part, we wouldn't consider a Space Filling Design.

There are a few reasons we want to consider a Space Filling Design in mixture settings.

Often in the formulations world, there's a little more sensitivity to going too far in your factor space: make it too wide and your entire process fails.

Suppose that happens over here where X2 is. Suppose everything below 0.1 fails.

You're  going  to  lose  a  good  chunk   of your runs  because  the  optimal  design

tends to  put  most  of  your  runs on  the  boundary  of  the  factor  space,

so  you're  going  to  lose  these.

You're  not  going  to  be  able  to run   your  full  model

with  the  remaining  points, and  you're  not  going  to  have

any  good  information  about   where  that  failure  boundary  is.

For  the   Space Filling Design, if  you  have  some  kind  of  failure

below 0.1,  you're  losing a  smaller  number  of  points.

Your  remaining  points   still  give  you  a   Space Filling Design

in  the  existing  space   that  you  can  use  to  fit  the  model  effects,

and now  we're  going  to  be  able to  model  that  failure  boundary.

Also, in the mixture world we often see more higher-order effects active, interactions or curvature polynomials, than we might see in the non-mixture setting.

These designs are only optimal conditional on the target model we give, so if we don't specify an effect up front for the optimal design, we might not be able to fit it after the fact, because it might be aliased with other effects that we have.

These space-filling runs act as something of a catch-all of possible runs,

so  there's  a  couple  of  reasons

that  we  might  want  to  consider space  filling  runs,

but  we  want  to   take  a  look  analytically.

What's  the  difference in  performance  between  these,

after  we  run  through  model  reduction, not  just  at  the  full  model,

because  we're  not   just going to be  using  the  full  model.

That's  the  design  phase  question.

When  we  get to  the  analysis,

whenever  you're  looking  at these  mixture  experiments  in  JMP,

JMP  automatically turns  off  the  intercept  for  you,

because if you want to fit the full model with all of your mixture main effects, you can't include the intercept.

You'll get this warning because the mixture main effects are constrained to sum to one, so they add up to the intercept column and they're aliased.

Also, if we want to look at higher-order effects, we can't do a response surface model with pure quadratic terms.

We have to look at these Scheffé Cubic terms, because if we try to include the interactions plus the pure quadratic terms, then we get other singularities.

Those  are  a  couple of  wrinkles  in  the  analysis.
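Here is a small Python check of those two wrinkles (the actual work in the talk is done in JMP/JSL; the three-component design points below are made up): under the mixture constraint, the intercept column equals the sum of the mixture main effects, and a pure quadratic term is a linear combination of a main effect and its cross products.

```python
# Why the intercept and pure quadratics are aliased under a mixture constraint.
import numpy as np

X = np.array([  # rows sum to 1 (mixture constraint); hypothetical points
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.4, 0.4, 0.2],
    [0.1, 0.5, 0.4],
])
a, b, c = X.T

# Intercept column equals the sum of the mixture main effects: 1 = a + b + c
print(np.allclose(np.ones(len(X)), a + b + c))   # True

# A pure quadratic, e.g. a^2, equals a*(1 - b - c) = a - a*b - a*c, so adding
# pure quadratics on top of the cross products creates another singularity.
print(np.allclose(a**2, a - a*b - a*c))          # True
```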

However,  going  forward with  the  model  selection  methods,

Forward  Selection  or   Lasso, which  are  the  base  methods

of  the  SVEM  methods that we're  going  to  be  looking  at,

we  want  to  consider  sometimes  turning  off this  default  'no  intercept'  option.

What  we  find  is  for  the   Lasso method  we  actually  have  to  do that

in  order  to  get reasonable  results.

After  we  fit  our  model, now  we  want  to  do  model  reduction

to  kick  out irrelevant  factors.

We've  got  a  couple  of   different  ways  of  doing  that  in  base  JMP .

Probably  what  people  do  most  frequently is  they  use  the  effect  summary

to go backwards on the p-values and kick out the effects with the largest p-values.

But  this  is  pretty  unstable because  of  the  multicollinearity

from  the  mixture  constraint, where  kicking  out  one  effect

can  drastically  change  the  p-values of  the  remaining  effects  in  the  design.

What this plot shows is, if we go backwards on p-values, what the largest p-value kicked out at each step is, and we see some jumping around here, and that's from that multicollinearity.

Given  that  kind  of volatility  in  the  fitting  process,

you  can  imagine if  you  have  small  changes

in  your  observed  responses, maybe  even  from  assay  variability,

or  any  other  variability, just  small  changes  can  lead

to  large  changes in  the  reduced  model

that  is  presented  as a  result  of  this  process.

That high model variability is something it would be nice to average over in some way, in the same way that a Bootstrap Forest averages over the variability of the CART or partition methods, and that's what the SVEM methods are looking to do.

In a loose sense, they're kind of the analog of that for the linear models we're looking at.

Also we can go to Stepwise and look at min AICc,

which  is  maybe  the  preferred  method.

In our  last  slide  of the show  today  we'll  be  taking  a  look  at,

for the base JMP users, AICc versus BIC versus p-value selection

with  our  simulation  diagnostics.

We want to give credit to a lot of the existing work leading up to the SVEM approach. These are some great references that we've used over the years.

We also want to thank Chris Gotwalt for his patience in answering questions and sharing information as they've discovered things along the way.

That's really  helped  set  us  up  to  be  able to  put  this  to  good  effect  in  practice.

Speaking of practice, where we have been using this quite a bit over the years is the setting of lipid nanoparticle formulation optimization.

What is lipid nanoparticle formulation? Well, if you've gotten any of the mRNA COVID vaccines, Pfizer or Moderna, then you've gotten lipid nanoparticles.

What  these  do  is  you  take  a  mixture of  four  different  lipids,  summarized  here,

and  they  form  these   little  bitty  bubbles.

Then  those  are  electrically  charged,

then  they  carry  along  with  them a  genetic  payload,

mRNA,  DNA,  or  other  payloads that  either   act as  vaccines

or  can  target  cancer  cells.

The electric charge on the genetic payload and the opposite charge on the nanoparticle are what bind them together.

Then  we  want it  to  get  through  the  cell and  then  release  the  payload  inside.

The  formulation  changes  depending on  what  the  payload  is.

Also, sometimes we might change the type of ionizable lipid, or the type of helper lipid, to see which one does better,

so  we  have  to  redo  this process  over  and  over.

For the most part, the scientists have found ranges of maybe 10-60% for the lipid settings, and then a narrower range of 1-5% for the PEG.

That's for the most part the feasible range for this process. That's been explored out, and that's the geometry we want to match here in our application.

We  want  to  say, given that  structure that  we're  doing  over and over,

do we  have  an  ideal analysis  and  design  method  for  that?

Also, we want to set up a simulation so that if we're looking at other structures,

other  geometries  for  the  factor  space, maybe  we  can  generalize  to  that,

but  that's  going  to  be our  focus  for  right  now.

Given  that  background,

I'm  going  to  let  Jim  now  summarize the  SVEM  approach  and  talk  about  that.

Yes,  thank  you.

This particular Discovery presentation is going to be a little bit more centered on PowerPoint, unfortunately, because the results of the simulations that we've done are really what this is all about.

But in this session, I will show you some of the new capability that we have in JMP 17.

If  I  go  and  I  want  to set  up  a   Space Filling Design...

Now,  previously  we  weren't  able to do  mixture  models  with   Space Filling Designs

right from  out  of  the  box,  if  you  will.

We  certainly  could  put constraints  in  there,

but  now  what  we  want  to  do is  show  you  how  you  can  do

a  Space F illing  Design with  these  mixture  factors.

What's new is that these now come in as mixture factors, which is good because it carries all of those column properties with it as well.

One thing worth mentioning right now is that the default is 10 runs per factor, so 40 runs here. In a typical DOE that would be good, and we would be happy because our power is well above 90% or whatever our criterion is.

But  that's  not  the  case [inaudible 00:09:54]  these  mixture  models

because  there's  so many  constraints  inherent  in  it.

What  that  is  telling  me,  unfortunately, is  even  if  I  were  to  have  40  runs,

I'd  still  only  have  5%  power from  doing  Scheffé  Cubic

and  even  if  it's  main effects  only, there's  only  20%  power.

Power now is not really a design criterion that we're going to look to when we do these mixture models.

Now,  typically  in  our  applications,

unfortunately,  we  don't  have the  luxury  of  having  40  runs.

In  this  case  we'll  do 12  runs  and  see  how  that  comes  out.

We'll  go  ahead  and  make   that Space  Filling  Design,

and  you can see that it's  maybe evenly  spread  throughout  the  surface.

Of  course,  we  do  know that  we're   bounded  with  some  of  these  guys  here

that we can only go from 1-5% on the polyethylene glycol.

What I want to do now is just fast forward.

Now  let's  say  I've  run  this  design and  I'm  ready  to  do  the  analysis.

This is where SVEM has really made huge headway,

and  if  you  listen  to some  of  Chris  and  Phil  Ramsey's  work

out  there  on   JMP  community, you'll  see  this  is  a  step  change.

This is a game changer in terms of your analytical capability.

How  would  we  do  this  in  16?

In  JMP  16  what  we'd  have  to  do is  we'd  have  to  come  through,

and actually  it's  worth  going   through the  step  just  because  it  gives  you

a  little  bit  of  insight, though  the  primary  mission

of  this  briefing or  this  talk  is  not  SVEM,

it  will  give  us an  idea  of  what's  going  on.

What we can do here is go ahead and make the Autovalidation Table.

This  is  the  JMP  16  methodology.

What  you'll  note  here  is  we've  gone  from  12  runs  to  24.

We  just  doubled  them and  you  see  the  Validation  Set.

The  training  may  be    the first 12, validation  the  next  12.

That's  what's  added and  then  we  have  this  weight.

This  is  the  Bootstrap  weight and  it's   fractionally  weighted.

What  happens  is we  will  go  ahead  and  run  this  model

and  come  up  with  a  predicted  value,

but  then  we  need  to  change  these  weights and  then  keep  doing  this  over  and  over

for  our Bootstrap, much like a  random  forest,  idea  for  the  SVEM.

Now  what  is  useful is  to  kind  of  take  a  quick  look.

What  is  the  geometry   of  these  weights?

We can see they're anti-correlated, meaning if my weight in the training set is low, it's probably going to be high in the validation set.

This  is  kind  of  a  quick little  visual  of  that  relationship.
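Here is a small Python sketch of that anti-correlated fractional weighting idea. This is an illustration only, not the exact formula JMP uses internally; the -log(u) / -log(1-u) construction is an assumption made for the sketch.

```python
# Anti-correlated fractional weights for an autovalidation-style bootstrap.
import numpy as np

rng = np.random.default_rng(1)
n = 12                               # original number of runs
u = rng.uniform(size=n)

w_train = -np.log(u)                 # weight of each run in the training copy
w_valid = -np.log(1.0 - u)           # weight of the same run in the validation copy

# A run that is down-weighted in training is up-weighted in validation,
# which is the anti-correlation visible in the plot of the two weights.
print(np.corrcoef(w_train, w_valid)[0, 1])   # strongly negative
```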

Now  I'm  ready  to  go   do  my  analysis  in  JMP  16.

We'd go to Analyze and just do our Fit Model.

Of  course,  we  want a  generalized  regression

and  we'll  go  through and  do  a   Scheffé  Cubic  here,

because  it's  a  mixture.

But  here's  where we  have  to  add  in  the  step,

we  put  the  validation set as  your  validation  column

and  then  this  validation  weight is  going  to  be  that  frequency.

Now  I  can  run  this.

By the way, in many of our instances we're not normal, we're lognormal, and we could put that in right there.

Here  we  have  our generalized  regression  ability

to  go  ahead  and  run  this  model and   voila,  there  are  the  estimates.

What we would do then is come here and use Save Prediction Formula.

Then  here  is  one  run.

Okay,  so  we  got  one  run.

You  can  see  that  the  top  is  15.17,  and  we  actually  saw  15.23,

so  not  bad  in  this  model,

but  we  would  do  this  over  and  over.

We  used  to  do  it  about  50  times  or  so.

But  with  JMP  17, now  this  whole  process  is  automated  to  us.

We  don't  have  to  do  this 50  times  and  then  take  the  average

of  these  prediction  formulas.

We're  able  to  go  directly  at  it.

If  I  come  back  to  my original  design  here  with  the  response,

I  can  get  right  at  it.

By the way, this is showing that I have another constraint put in here.

A lot of times the chemists and biochemists like to see that, to make sure that the ratios based on molecular weights are within reason.

Not  only  do  we  have the  mixture  constraints,

we  also  have a  lot  of  other  constraints.

I'm  working  with  a  group where  we  have  maybe  15  different  ingredients

and  probably  30  constraints in  addition  to  the  mixture  constraints,

so these methods work and scale up pretty well, is probably the best way to say it.

Now  this  is  17, so  17  I  can  get  right  at  it.

I'm  going  to  go  ahead  into  Fit  Model, and  I'll  go  ahead  and  do  a   Scheffé Cubic.

From  here,  what  we're  able  to  do is come  into  a  generalized  regression.

In  this  case,  we  don't  need to  worry  about  these  guys  in  here.

We  can  change  it  to log normal  if  we  so  desire.

One of my choices for the Estimation Method, instead of Forward, is in fact SVEM Forward, so I choose SVEM Forward and I'm going to do 200 samples.

You'll  see how quickly  they  have  this  tuned.

Really the  only  thing  you  can  do in  advanced  controls

is  check  whether  or  not you  want  to  force  a  term  in.

I  hit  Go  and  instantaneously  I've  done 200  Bootstrap  samples  for  this  problem.

Of  course, I  now  can  look  at  the  profiler

and  that  is  the  average  of  the  200  runs.

That's  kind  of  my  end  model,  if  you  will.

Of  course,   with  Prediction  Profiler,

there  are hundreds  of  things  you  can  do from here

and  Andrew  will touch on  a  couple  more  of  those.

But  two  other  things worth  noting  here,

I'll  save  the  Prediction  Formula as  well  and   take  a  look  at  that  guy.

When  I  look  at  the  Prediction  Formula,

I'll  note  that  it  is  in  fact already  averaged  out  for  me  here.

This  is  the average  of the 200   different  samples  that  are  out  there.
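As a tiny Python illustration of what that saved formula represents (the fitted functions below are hypothetical stand-ins, not the actual bootstrap fits): the SVEM prediction is simply the average of the predictions from the individual bootstrap fits.

```python
# The saved SVEM prediction formula behaves like an average over bootstrap fits.
import numpy as np

def svem_predict(models, x):
    # models: list of fitted prediction functions, one per bootstrap sample
    return np.mean([m(x) for m in models])

models = [lambda x, b=b: b * x for b in (1.9, 2.1, 2.0)]   # 3 toy "fits"
print(svem_predict(models, 5.0))                            # average prediction
```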

With  that,  that  is  the  demo,

and  we'll  go  back  to looking  at  the  charts  there  to   say,

"Well, What  is  it  that  we're  seeing in  terms  of  the  results  of  SVEM?"

Andrew,  if  you  want  to  pull  up  that  slide.

This  is  maybe  a  quick  visual.

You  can  see  that if  I  look  at  those  first  three,

in  this  case,  red  is  bad.

What  we're  looking  at  here is  the  nominal  coverage.

This  is  a  mixture  model

at  the  predicted optimum  spot  of  performance.

We can see that the standard Stepwise approaches are not doing too well. That's the backward and forward AICc.

These are the coverage rates. We'd like a nominal 5% error rate where we don't actually see the true response in the prediction interval, or actually the confidence interval.

In other words, when we looked at the profiler that we just saw, it gives us a prediction or confidence interval. Was the true value in there? We know the true value because we're playing the simulation game, right? We know what the true value is; what percentage of the time was it in that interval?

We  can  see  that  we  don't  do as  well  with  our  classical  methods.

The  full  model, putting  all  the  terms in,

Lasso  does  pretty  well at  a  10%  rate  or  so,

but  it's  not  until  we  get   the SVEM methods  here  that  we  start  seeing

that  we're  truly  capturing and  getting  good  coverage.

That's maybe a good picture to keep in your mind: we are way outperforming some of the other methods out there when it comes to this capability.

Now, in terms of this simulation, what we're focusing on here is a little bit different from what you may think of, a simulation where we're looking at a model and asking, "How well did we do with this particular method?"

We could measure that by comparing the actual versus the predicted values, and then we'd get some sort of mean squared error.

We  do  track  that  value,

but we find in our work we're much more concerned about finding that optimal mixture, if you will, with the optimal settings that achieve a maximum potency, or minimize some side effects, or help us with this [inaudible 00:18:58]

That's  going  to  be  called the "percent  of  max"  that  we're  looking  for.

We're  going  to  use  that as  our  primary  response  here

in  terms  of  being  able  to evaluate  which  methods  outperform  others.

It's  not  really  going  to  be,

how  far  away  am  I from  the  location  of  the  optimal  value?

It's: how far is the response at the settings I predicted as optimum from the actual optimum response?

That's  going  to  be our  measure  of  success.
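A small Python sketch of that "percent of max" metric follows (the quadratic true function, the Monte Carlo search for the true maximum, and the candidate point are all hypothetical stand-ins for what the JSL simulation does):

```python
# Percent of max: true response at the candidate optimum, as a percentage
# of the true global maximum over the mixture simplex.
import numpy as np

def true_f(x):
    # hypothetical true generating function over a 3-component mixture
    a, b, c = x
    return 10*a + 8*b + 6*c + 12*a*b - 5*a*c

# approximate the true maximum by dense Monte Carlo sampling of the simplex
rng = np.random.default_rng(0)
samples = rng.dirichlet(np.ones(3), 20_000)
true_max = max(true_f(x) for x in samples)

candidate = np.array([0.55, 0.40, 0.05])   # optimum suggested by one fitted model
pct_of_max = 100 * true_f(candidate) / true_max
print(f"percent of max: {pct_of_max:.1f}%")
```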

The  way  this  will  work  is I'll  be   asking  Andrew  a  few  ideas  here

in  terms  of  what typically  comes  up  in  practice.

I saw the geometry he showed me early on, and the optimal design always hits the boundaries.

What if I like things that are, as we call it, more mixed, right? You have more mixed stuff in the middle with the space fill; which is better?

If  I  do  use  an  optimal  design,

it  defaults  to  D  maybe, but  what  about  I  and  A?

Then how about the age-old DOE advice of adding center points? Is that smart? Or is one center point enough? Or how about replicates?

We've already discussed how power is not being helpful, so what is a helpful measure of a good design?

That's the design piece, but the analysis piece is: is there a particular method that outperforms all the others, or are there certain areas where we should focus on using Lasso and others where we should just use SVEM selection?

These  are  practical  questions that  come  up  from  all  of  our  customers,

and  we'd  like  to  share  with  you

some  of  the  results that  we  get  from  the  simulation.

Andrew,  you  want  to  give  us

a  little  bit  more  insight into  our  simulation  process?

Yeah,  thanks,  Jim.

Before  I  do  that,

I just want to point out one tool that we've made heavy use of in the analysis of our results. Unfortunately, we don't have time to delve into it in the demo, but it has been so useful: within the profiler, the output random table for these mixture designs.

Looking at the responses especially, we frequently have potencies alongside side effects.

We have multiple responses that we want to balance out with the desirability function, and then we're going to look at the individual responses themselves.

When  we  output  a  random  table, we  get  a  Space  Filling  Design,

basically  not  a  design, but  we  fill  up  the  entire  factor  space,

and  we're  able  to  look at  the  marginal  impact

of  each  of  the  factors over  the  entire  factor  space.

For example, for the ionizable lipid type,

what  we'll  frequently  see  is, we  can  see  that  maybe  one  has

a  lower  marginal  behavior over  the  entire  space.

But  since  we're  wanting  to  optimize,

we  care  about  what the  max  of  each  of  these  is,

and  one  of  these will  clearly  be  better  or  worse.

We're looking at the reduced model. After we fit it, we'll go to the profiler and do this.

We  can  still  get the  analytic  optimum  from  profiler,

but  in  addition  to  that,

this  gives  us  more  information outside  of  just  that  optimum.

What we might do here for candidate runs, because we're always running confirmation runs after these formulation optimization problems, is run the global optimum here for H102, and also pick out the conditional optimum for H101, and see which one does better in practice.

Also,  looking  at  the  ternary  plots,

if  we  color  those  by  desirability or  by  the  response,

we  can  see  the  more or  less  desirable  regions

of  that  mixture  space,

so that can help us as we augment the design to either include additional areas of the factor space or exclude areas.

I  can't  do  much  more with  that  right  now,

but  I  wanted  to  point  that  out

because  that's  a  very  important  part of  our  analysis  process.

How  do  we  evaluate some  of  these  options

within  this  type  of  geometry of  a  factor  space?

We  built  a  simulation  script that  we  have  shared  on  the  JMP  website,

and  it  allows  us  to   plug  and  play for  different  sample  sizes  in  total,

how  many  runs are  in  the  design?

We  have  a  true  form  choice

that  gives  us  the  true  generating  function behind  the  process,

a  design  type, either  space  filling  or  optimal.

The  optimal  design  now  is  going  to  be of  a  certain  minimum  size

based  on  the  number  of  effects that  we're  targeting.

Do  we  have  a  second -order  model, a  third -order  model,

a   Scheffé Cubic  model?

What  do  we  have?

Normally,  whenever  you  build  a  model and  custom  design  in  JMP,

it  writes  a  model  script  out  to  your  table and  then  you  use  that  to  analyze  it.

Well, something we've explored is allowing a richer model than what we target,

and  are  we  able  to  use these  methods  with  SVEM

and  get  additional  improved  results,

even  though  we  didn't  originally  target those  effects  in  the  design?

The  short  answer  there  is  yes.

That's  something  else we  want  to  consider,

so we allow ourselves, with the effect choice in the script, to include additional effects.

We can look at the impact of adding replicates or center points, using that custom DOE dialog to enforce those.

How  does  that  affect  our  response?

Because any of the summaries that you get out of the design diagnostics are targeting the full model, either with respect to prediction variance or, for D-optimality, the standard errors of the parameters.

But  what  we  really  care  about is  how  good  is  the  optimum

that  we're  getting  out  of  this,

so  that's  what  we're  going  to  take a  look  at  with  these  simulations.

For the most part, in these LNP optimization scenarios,

a  lot  of  times, we'll  come  across  two  situations.

The  scientists  will  say, "I've  got  about  12  runs  available,

and  maybe  it's  not that  important  of  a  process,

or  the  material  is  very  expensive,

and  I  just  need  to  do  the  best  I  can with  12  runs.  That's  what  I've  got."

Or  it  might  be  something  where  they've  got a  budget  for  40  runs,

and  they  can  fit a  full  second -order  model

plus  third  order  mixture  effects,

and  we  want  to  try to  characterize  this  entire  factor  space

and  see  what  the  response  surface   looks  like  over  the  whole  thing.

Those  are  the  two  scenarios  we're  going   to  be  targeting  in  our  simulation.

Jim,  I  think  you  had  some  questions

about  performance under  different  scenarios.

What  was  your  first  question  there?

I  did.

I  guess  when  I  think  about a  12- run  scenario  here,

and  if  I  just  go  with  the  default, I'd  get  a  D -optimal

and  it  would  be main  effects  only.

I  recognize  I  could  do the  space  filling  like  I  just  did,

but  my  question  is, if  I  do  the  default,

which  one  of  the  analysis  methods would  be  preferred?

Or  is  there  one?

Okay,  so  taking  a  look  at  that.

For  the  D- optimal  design, as  a  general  rule,

it's  going  to  put  almost  all  of  its  runs along  the  boundary  of  the  factor  space

and  it's  not  going to  have  any  interior  runs

unless  you  have  quadratic  terms or  something  that  requires  that.

With a 12-run design, there are nine degrees of freedom required to fit all the main effects here.

We've  got  a  few  degrees of  freedom  for  error,

but  mostly  we're  only  targeting the  main  effects  here.

How  do  the  analysis  methods  do?

Is  there  any  difference in  the  analysis  methods?

What we do for all of these, and we're going to summarize them this way, is show the percent of max for all of the simulations that we do, so we can see that distribution for each of the analysis methods, all for this 12-run D-optimal design with its targeted effects.

Then we also show any significant differences between these, and we're just using Student's t. We're not making a Tukey adjustment, so keep that in mind whenever you're looking at these significance values.

The winner here is our homemade SVEM neural approach, because it's not restricted to only looking at the main effects; it can allow some additional complexity in the model, and so it wins here.

Now, don't get too excited about that, because this is about the best that we've seen SVEM neural do, in these small settings.

But  if  we  are  running  more  than  one candidate  optimum  going  forward,

then  maybe  we  can  include  a  SVEM  neural, but  in  general,

we wouldn't recommend sticking with only a SVEM neural, just because it tends to be more variable and have heavier lower tails.

What  are  the  other  results?

We see that the losers in this application are anything that's doing single-shot model reduction,

because  all  these  effects are  significant  in  the  model,

and  any  time  we  pull  one  of  them  out, we  are  going  to  get

a  suboptimal  representation of  our  process.

That's  why  in  this  case the  full  model  does  better  than  those.

But  what's  interesting  is the  SVEM  linear  approaches

are  able  to  at  least  match that  full  model  performance.

We're  not  losing  anything by  using  SVEM  in  this  application,

so  that's  a  nice  aspect where  we  don't  have  to  worry

about  the  smaller  setting.

Are  we  hurting  ourselves  at  all by  using  AICC?

Now, something else we tried here: given the same situation, you've only got 12 runs, and you're only targeting the main effects and the D-optimality criterion in the custom DOE. What if we allow the fit model to consider second-order effects plus third-order mixture effects, more than the model the design was targeted for?

What happens, and we see this jump here, is that the SVEM linear methods are able to utilize that information and give us better percent of max for the candidate optima, and these SVEM linear methods are our winners here now.

What we see, interestingly, is that the base methods for these SVEM approaches, forward selection or the Lasso, are not able to make use of that; only the SVEM versions are, so that's a nice property.

They actually beat out the neural approach, which is nice because these are native to JMP 17 and they don't require as much computation time or manual setup as the neural does.

What we start to see here is a theme that we're going to see throughout the rest of the talk: any of these Lasso approaches with no intercept are going to give us suboptimal results, because without the intercept the penalization doesn't work right in the Lasso.

You actually want to turn off the default 'no intercept' option if you're going to try to use SVEM Lasso or even just the single-shot Lasso.

Okay,  so  I  guess  it  looks  like SVEM  Neural  did  well  there.

But  again,  that  is  not  native.

We  can't  do  that  with  JMP  17  Pro, that's  not  in there .

We can, but we have to have it manually [inaudible 00:29:01] scripted.

Yeah, it's not a menu option.

Okay,  this  is  good,

but  I'm  also  a  fan of  the  Space  Filling  Design,

so  how  does  that  play  out in  terms  of  the  analysis  methods?

For  the  Space Filling  Design, you  can  see  rather  than  having

all  the  points  along  the  exterior, along  the  boundary,

now  we  fill  up  the  interior  space

for  both  the  mixture  factors and  the  process  factors,

which  sometimes  in  practice what  we'll  do  is,

we'll  take  the  process  factors

and  round  these to  the  nearest   0.25  or  0.5

or  whatever  granularity  works  best  for  us, but  this  is  what  it  looks  like.

In  terms  of  the  results, how  do  they  perform?

Now what we're going to do is compare the combination of the design approach along with the analysis method, still allowing the richer second- and third-order model for the selection methods, and see which of these do best.

the  winners  are the  SVEM Linear  approaches,

Lasso  only  with  the  intercept, not  without  the  intercept,

and  the  D- optimal.

Again, behind the scenes you have to remember that for this D-optimal approach you're assuming your posited model is true over the entire factor space and you've got constant variance over that factor space.

If  you're  worried  about  failures along  the  boundary,

then  that's  something  else  to  take  into account,  and  it's  not  built  into  this.

You  have  to  consider  that.

But  if  you  are  confident,

maybe  you've  run  this  before and  you're  only  making  minor  changes,

then  the  way  to  go  is  the  D- optimal with  the  SVEM  approaches.

Down  here,  the  losers  are  the  Lasso, with  no  intercept.

We 're going to  avoid  those,

and  you  can  see those  heavy  tails  down  here.

Not  the  SVEM  Lasso, just  the  Lasso.

Actually  here's  the  SVEM  Lasso with  no  intercept  down here.

Yeah.

They  all  get  these  Fs, so  they  all  fail.

-Conveniently,  [crosstalk 00:30:48] . -Yeah

Okay.

What often will come up, where it's designed up front and we've done our 12 runs, is that the boss has some more questions and we have more runs available.

If  we're  going  to  do  five  more  runs,

how  does  that  impact some  of  these  results?

When you say five runs, you don't mean a follow-up study, but building the study as either 12 or 17 runs in a single shot right now is what you're considering, right?

Yeah,  exactly.

Okay,  so  yeah, we  can  look  at  the  marginal  impact

because  there's  a  cost  to  you for  those  extra  five  runs.

What's  the  benefit of  those  five  extra  runs?

Using the design diagnostics, you could look at the FDS plot, and your FDS plot is lower, reflecting smaller prediction variance.

Power  is  not  that  useful for  these  mixture  effects  designs.

We  don't  care  about  the  parameters.

We  want  to  know,  how  well  would  we  do with  optimization?

That's  where  the  simulation's  handy, we  can  take  a  look  at  that.

How  does  your  distribution of  your  percent  of  max  change

as  you  go  from  12  to  17  runs?

Interestingly, there's no benefit for the single-shot forward AICc to having 17 versus 12 runs.

Now,  again,  right  now  we're  looking at  the  percentage  of  max.

If  you  look  at  your  error variance,

your prediction variance is  going  to  be  smaller,

and  there  might  be some  other   [inaudible 00:32:09] ,

but  mainly  your  prediction  variance

is  going  to  be  smaller if  you  look  at  that.

But  really,  we  don't  care  that  much about  prediction  variance.

We  want  to  know, where  is  that  optimum  point?

Because  after  this,  we're  going  to  be  running  confirmation  runs

and maybe replicates at that point

to  get  an  idea  of  the  process and  assay  variance  then.

But  right  now,

we  are  just  trying  to  scout  out the  response  surface

to  find  our  optimal  formulation,

so with that goal in mind, there's no benefit for the forward AICc.

Now  for  the  SVEM  methods,

we  do  see there  is  a  significant  difference

and  we  do  get  a  significant  improvement

in  terms  of  our  average percent  of  max  we  obtain,

and  maybe  not  as  heavy  tails  down  here.

But now you need to decide: is that practically significant? Do you want to move from 90% to 92% mean percent of max in this first shot with five extra runs?

You have to do your marginal cost versus marginal benefit analysis there as a scientist and decide if that's worth it.

Just  looking  at  it  here, what  I  think  might  be  useful

because  you  have  to  run confirmation  runs  anyway

is  if  we  run  the  12 -run  design,

you can then run a candidate optimum or two based on the results we get, plus maybe a couple of additional runs in a high-desirability region that looks good, or even augment out your factor space a little bit,

and  then  you're  still  running a  total  of  17  runs,

but  now  we're  going  to  have  even a  better  sense  of  the  good  region  here,

so  that's  something  to  consider.

Something  else  we  can  see

from  running  the  simulation with  17  runs  is,

let's  look  at  the  performance

of  each  of  the  fitting  methods within  each  iteration,

and  there's  actually a  surprisingly  low  correlation

between  the  performance

of  these  different  methods within  each  iteration.

We  can  use  that  to  our  benefit

because  we're  going  to  be  running confirmation  runs  after  this,

so  rather  than  just  having  to  take one  method  and  one  confirmation  point,

one  candidate  optimal  point,

if  we  were  to,  for  example, look  at  these  four  methods

and  then  take  the  candidate  optimum from  each  of  them,

then we're going to be able to go forward with whichever one does best. We're looking at the maximum of these.

Rather  than  looking at  a  mean  of  92%  to  94%,

now  we're  looking  at  a  mean of  about  97%  with  a  smaller  tail

if  we  consider multiple  of  these  methods  at  once.
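A tiny Python sketch of that idea (the candidate points, the method names chosen here, and the true function are hypothetical): carry a candidate optimum forward from each method, confirm each, and keep the best.

```python
# Take the best of several methods' candidate optima via confirmation runs.
import numpy as np

def true_f(x):
    a, b, c = x
    return 10*a + 8*b + 6*c + 12*a*b - 5*a*c

candidates = {                      # one candidate optimum per analysis method
    "SVEM forward":  np.array([0.55, 0.40, 0.05]),
    "SVEM Lasso":    np.array([0.50, 0.45, 0.05]),
    "full model":    np.array([0.60, 0.30, 0.10]),
    "SVEM neural":   np.array([0.45, 0.50, 0.05]),
}
confirmed = {name: true_f(x) for name, x in candidates.items()}
best = max(confirmed, key=confirmed.get)
print(best, confirmed[best])        # go forward with whichever one does best
```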

Okay,  very  useful.

Let's  now  put  our  eyes toward  the  40 -run  designs.

Very  good  information  in  terms of  my  smaller  run  designs.

Now  with  40,  how  does  it  play  out in  terms  of  these  analysis  methods?

Are  we  going  to  see  consistent  behavior with  what  we  saw  in  the  12 -run  design?

Then  how  about  the  Space  Filling  versus the  optimal  design,  D-optimal?

-I'd  be  interested  in  that. -Okay.

Well,  first  take  a  look at  the  D-optimal  design,   40  runs,

and now we're targeting all of the second-order effects,

the  third -order  effects,  mixture  effects,

and  we're  targeting  all  the  effects

that  are  truly  present in  our  generating  function,

and  we  still  see  that  we're  loaded  up on  the  boundary  of  the  factor  space

with  the  optimal  design,

and  then  if  we  were  going to  see  now  with  the  space  filling  design,

we're  going  to  see  now we're  filling  up  the  interior

of  the  factor  space  for  the  mixtures

and  for  the  other continuous  process  factors.

Let's  see  what  the   performance  difference  is.

First  of  all, focus  on  the  space filling  design,

which  analysis  methods  do  best?

Same as we saw in the 12-run example, the SVEM linear methods, forward selection with the intercept and Lasso with the intercept, do the best.

The worst you can do is keeping the full model, or trying SVEM or single-shot Lasso with no intercept.

In the D-optimal setting, same winners, which is reassuring

because  now  we  don't  have to  be  worried  about,

"Well,  we're changing  our  design  type. Now  we  got  to  change  our  analysis  type."

It's  good  to  see  this  consistency across  the  winners  of  the  analysis  type.

The  full  model  doesn't  do  as  poorly  here with  the  optimal  design,  I  think,

because  the  optimal  design is  targeting  that  model

and  the  losers  here  are  still the  Lasso  with  no  intercept.

Then  Neural  is  really  falling  behind  here, behind  the  other  methods.

Now  let's  compare  the  space  filling to  the  D-optimal  designs,

and  we  can  really  see

the  biggest  difference  here is  within  the  full  model,

the  space  filling  designs  are  much  worse than   the   D-optimal  design.

Anytime  you're  doing design  diagnostics,

that's  all  within  the  context of  the  full  model.

For  your  D -optimality  criteria,

your  average  prediction  variance, that's  all  there.

A  lot  of  times when  you  run  those  comparisons,

you're  going  to  see a  stark  difference  between  those

and  that's  what  you're  seeing  here.

However,  in  real  life,

we're  going  to  be  running a  model  reduction  technique.

Even the single-shot methods improve it, but especially with SVEM here, it really closes the difference between the space filling and the optimal design, and we see pretty close medians and a slightly heavier tail here.

But  now  you  can  look  at  this  and  say.

"Okay.  I  lose  a  little  bit with  space  f illing  design.

But  if  I  have  any  concerns  at all about  the  boundary  of  the  factor  space,

or  if  I'm  somewhat  limited in  how  many  confirmation  points  I  can  run

and  I  want  to  have  something that's  going  to  be  not  too  far  away

from  the  candidate  optimum that  I'm  going  to  carry  forward,

then  those  are  the  benefits of  the  space  filling  design."

Now  we  can  weigh  those  out. We're  not  stuck

with  this  drastic  difference between  the  two.

Again, that's only comparing against the D-optimal design.

I  guess  a  lot  of  times  in  our  DOE  work,

we  like  to  maybe  look at  the  I-optimality  criteria

and  even  the  A has  done  really  well  for  us.

In particular, it spreads it. It's certainly not space filling,

but  at  least  it  spreads  it  out a  bit  more  than  the  D -optimal.

Do  we  have  any  ideas how  those  I  and  A  optimal  work?

Yeah,  we  can  swap  those  out into  simulations.

One  thing  we've  always  noticed,

I  love  the  A -optimal  designs in  the  non- mixture  setting.

It's  almost  my  default  now.

I  really  like  them.

But  in  the  mixture  setting, whenever  we  try  them,

even  before  the  simulations, if  we  look  at  the  design  diagnostics,

the  A -optimal  never  does  as  well as  the  D  or  the  I -optimal,

and that bears out here in the simulations; that's the blue here for the A-optimal, which gives us inferior results.

Rule  of  thumb  here  is,

don't  bother  with  the  A-optimal  designs for  mixture  designs.

Now  for  D  versus  I -optimal, we  don't  see  any...

In  this  application for  this  generating  function,

we  don't  see  any  difference  between  them.

However, a reason to slightly prefer the D-optimal is that there tend to be some convergence issues for these LNP settings, where you've got the PEG over its 1-5% range and you're trying to target a Scheffé Cubic model in JMP, so we've noticed sometimes some convergence problems for the I-optimal designs, and it takes longer.

The  D -optimal, if  there's  not  much  of  a  benefit,

then  it  seems  to  be  the  safer  bet to  stick  with  the   D-optimal.

Now  we  weren't  able to  test  that  with  the  simulations

because  right  now  in  JMP,

you  can't  script  in  Scheffé Cubic  terms  into  the  DOE

to  build  an  optimal  design.

You  have  to  do  that  through  the  GUI.

We  weren't  able  to  test  that, see  how  often  that  happens,

but  that's  why  we've  carried  forward

D-optimal  in  these  simulations and  we  stick  with  those.

If  you  want  to  in  your  applications, you  can  try  both  D  and  I

and  see  what  they  look  like

both  graphically and  with  the  diagnostics,

but  the   D-optimal  seems to  be  performing  well.

Okay,  I  guess  just  keep  pulling  the  thread a  little  bit  further  is,

a  lot  of  times we'll  try  some  type  of  a  hybrid  design .

Why  don't  we  start  out  with, say,  25  space  filling  runs,

and  then  augment  that with  some   D-optimal  criterion

to   make  sure  that  we  can  target the  specific  parameters  of interest?

Does  that  work  out  pretty  well?

Yeah,  we  can  simulate  that and  we  take  a  look.

Either  we've  got...

This  is  the  same  simulated  function,

generating  function  we've  been  looking  at for  you  to run  the  D-optimal,

for you to run the  space  filling,

or  a  hybrid,  where  we  start  out with  25  space  filling  runs

and  then  we  go  to  augment  and  load in  building  15  additional  runs  targeting

the  third  order  model, and  what   we  see  is  that  now,

we  have  no  significant  difference in  terms  of  the  optimization

between  the  40-run D-optimal and  the  hybrid  design,

But  in  the  hybrid  design,

we  get  the  benefit   of  those  25  space filling runs.

We  get  some  interior  runs  protection  to  fit  additional  effects

and  protection  against  failures   along  the  boundary.

It's  a  little  bit  more  work   to  set  this  up.

We'll  do  this  for  high  priority  projects

because  only  for  those because  of that extra  cost  and  time.

But  it  does  appear  to  be   a  promising  method.

Right.

Practically  you  think  about   where  your  optimal  is  going  to  be,

there's  a  good  chance   it  could  be  in  that  interior  space

that's  not  filled  in  the  D-optimal   along  the  boundaries.

I  guess  just  maybe  going  back,

revisiting  the  ideas   of  what  if  I  had  a  center  point,

what  if  I  had  a  point   that  I  could  replicate?

Again,  maybe  on  the  40- run  design,

if  I  had  five  more   things, so  just  any  other  little  nuggets

that  we  learned   along  the  way  with  these?

Well, this comes up a lot because the textbook will tell you

to  add  five  to  seven  replicate  runs.

The  scientists  are  going  to  kick  you  out if  you  try  to  do  that.

A  lot  of  times  we  have  to  make the  argument

to  add  even  a  single  replicate  run

because it has advantages outside of the fitting: now you get a model [inaudible 00:41:09], and just graphically we can use that as a diagnostic; we can look at that error variance relative to the entire variance from the experiment.
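A minimal Python sketch of that diagnostic (the response values and the single replicated pair are hypothetical): compare the pure-error variation from the replicated run to the overall variation across the design.

```python
# Pure error from one replicated run versus total variation across the design.
import numpy as np

y_all = np.array([12.1, 15.3, 9.8, 14.7, 15.0, 11.2])   # all observed responses
y_rep = np.array([14.7, 15.0])                           # the replicated setting

pure_error_sd = np.std(y_rep, ddof=1)   # run-to-run (process + assay) noise
total_sd = np.std(y_all, ddof=1)        # total variation across the experiment
print(pure_error_sd, total_sd, pure_error_sd / total_sd)
```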

It's  very  useful  to  have,

and  so  it's  going  to  be  nice to  have  an  argument  for  you  to  say  that,

"Okay,  we're  not  hurting your  optimization  routine

by  including  even  a  single  replicate  run."

That's  what  we  see  here for  the  40-run  example

by  forcing  one  of  these   to  be a  replicate  within  custom  design.

We  are  not  getting   a  significant  difference  at  all

in  terms  of  optimization.

It's neither helping nor hurting. Let's go ahead and do that, so that we have that extra piece of information going forward.

I  don't  have  the  graphs  here because  it's  boring.

It's  the  same  thing in  this  particular  application,

forcing  one  of  them  to  be  a  centerpoint.

There's  no  difference.

Part  of  that  might  be  in  this  case, the  D-optimal  design  was  giving  us

a  center point  or  something  close   to  a  center point.

That  might  not  have  been  changing  the  design  that  much.

You  might  see  a  bigger  difference

if you go back to the 12-run design and enforce the center point.

But  that's  the  advantage   of  having  a  simulation  framework  built  up

where  you  can  take  a  look  at  that

and  see  what  is  the  practical  impact   going  to  be  for  including  that.

Okay,  now  how  about...

I  mentioned  I  have  this  big  project   with  lots  of  constraints.

Would  a  constraint  maybe change  some  of  the  results?

Well,  we  could  possibly  include   the  constraints

and  it's  going  to  change  the  allowed  region  within  the...

Graphically, you're going to see a change in your allowed region, and we can simulate that.

Actually, I've done that. I don't have the graph up with me right now, but there's not that much of an impact; SVEM still does well.

One difference we did note from running this simulation with the region constrained somewhat is that the space filling improved, because it's got a smaller space to fill and not as much noise space, but the D-optimal performs just as well with or without the constraint.

That  was  pretty  interesting  to  see.

But  all  of  this  applies  just  as  well with  constraints

and  nothing  of  note  in  terms  of  difference for  analysis  methods  with  the  constraint,

at  least the  relatively  simple  ones   that  we  applied.

Right,  okay,  we're  almost  running  short on  time  here,  Andrew,

but  I  do  have  a  concern.

What happens if we have a misspecified model? And then we would like to wrap up and leave the folks with a few key takeaways.

Here's  an  example   where  now  this  functional  form

does  not  match  any  of  the  effects  we're  considering

and  we're  relatively  flat   on  this  perimeter

where  a  lot  of  those  optimal designs are going to be

so I'm going to see how  that  works  out.

Also note the [inaudible 00:43:52] Cholesterol is set to a coefficient of zero in the true generating function.

Now, taking that true function and going to the profiler output random table, you can see how nice it is to be able to plot these things and see the response surface using that output right in the table.

Here's  really  your  true  response  surface, and  this  is  your  response  surface,

but  what's  interesting is  it  looks  like  there's  an  illusion  here.

It  looks  like  Cholesterol  is  impactful for  your  response.

It  looks  like  it  affects  your  response, but  in  reality  the  coefficient  is  zero.

But  the  reason  it  looks  like  that   is  because  of  the  mixture  constraint.

That's why it's hard to parse out which individual mixture effects really affect your response.

We're not as concerned about that as we are about saying, what's a good formulation going forward?

In  this  setting,   we  add  a  little  bit  of  noise, 4% CV,

which  is  used  frequently   in  the  pharma  world.

In  this  case,  the  mean  we're  using   is  the  mean  at  the  maximum,

which  in  this  case  is  one,

and  then  also  a  much  more  extreme  40%  CV.

This looks more like sonar trying to find the Titanic or something.

Hopefully  none  of  your  pharma applications  look  like  this,

but  we  just  want  to  see   in  this  extreme  case  how  things  work  out.
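A minimal Python sketch of that noise model (assuming additive normal noise whose standard deviation equals the stated CV times the mean at the maximum, which is one here):

```python
# Add noise at a given CV relative to the mean at the maximum (here 1.0).
import numpy as np

rng = np.random.default_rng(3)
mean_at_max = 1.0

def noisy(y_true, cv):
    return y_true + rng.normal(0.0, cv * mean_at_max, size=np.shape(y_true))

y_true = np.array([0.6, 0.8, 1.0])
print(noisy(y_true, 0.04))   # mild process/assay noise (4% CV)
print(noisy(y_true, 0.40))   # extreme noise scenario (40% CV)
```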

What we see in the small 12-run example, with relatively small process plus assay variation, is that the baseline designs win out with SVEM, all the same methods as before; and then if we go up to the 40-run case, the space filling isn't able to keep up as well, but the D-optimal really does better now. Even though it's relatively flat out there on the sides where most of the runs are, it's able to pick up the signature of the surface here.

Now,  here's  the  difference   between  the  full  model

and  then  the  space filling and  the D- optimal.

Not  as  big  of  a  difference   for  the  SVEM  methods,

but  you  do  still  have   a  few  tail  points  down  here.

They're still not all performing as well as the SVEM linear, even though the SVEM linear is only approximating the curvature of that response surface.

If  we  go  up  to  the  super  noisy  view, no  one  does  a  really  good  job,

but  still  your  only  chance   is with  the  space  filling  approaches.

But then when we go up to the larger sample size, even in the face of all the process variation, the process noise, now the optimal design is able to rise above that noise better and make better use of those runs than the space filling.

A  couple  of  considerations  there.

What's your run size? How saturated is your model? How much process variation do you have relative to your process mean? All of that goes into the balance of the space filling versus the optimal design.

If we take a look at the candidate optimal points we're getting out of the space filling versus optimal... I'm sorry, for the space filling, what we see is we're on target; this is ionizable and helper. We're on target for all of our approaches except for these Lassos with no intercept.

They're  never  on  target,

they're  always  pushing  you  off   somewhere  else.

You can see graphically what that lack of intercept does.

Now,  if  we  allow  the  intercept,   then we're  on  target.

It really is important to uncheck that no-intercept option for the Lasso.

For  all  the  people   that  are  not  using  JMP  Pro

and  don't  have  SVEM,

you might say, well, okay, what does your simulation say is better here: AICc versus BIC versus p-value selection?

Unfortunately, just using the number of simulations we've run, there's not as consistent an approach as there is with SVEM.

If you've got a large number of runs, whether the model is correctly specified or misspecified, the forward or backward AICc do well and the full model does worse, whereas in the smaller setting the full model does better because all those terms are relevant.

It's the same story with the p-values here, too. You see, 0.01 does the worst in one setting and the best in the large setting. There's no consistency in which p-value to use: 0.01, 0.05, 0.1.

The  P-value  from  this  view   is   an  untuned  optimization  parameter,

so  maybe  best  to  avoid  that   and  stick  with  the  AICc

if  you're  in  base  JMP.

However,  we  have  seen  now  that

the SVEM  approaches   for  these  optimization  problems

do  give  you  almost   universally  better  solutions

than  the  single  shot  methods.

You  can  get  better  solutions with  JMP P ro,  with  SVEM.

Great.

I  guess  we  want  to  just  wrap  up.

Some  of  the  key  findings  here,  Andrew.

Yeah,  and  also,  Jim,  any  other  comments?

Do  you  have  any  other  comments  too  about these  optimization  problems  or  anything?

Interesting  things  we've  seen  recently?

We have. We're up against time for sure, but we've done some pretty amazing things: we've come up with new engineered lumber that's better than it's ever been, and propellants that have physical properties and performance that we haven't seen before.

We have taken a step, a leap, in terms of some of the capabilities that we've seen in our mixture modeling.

Can we summarize with the highlighted bullet down there, that SVEM seems to be the way to go, and if you only had one method, maybe SVEM forward selection, you'll be covered pretty well?

Yes,  that's  right, because  I'm  always  scared.

Even though the Lasso with the intercept sometimes looks slightly better, I don't know, maybe in one or two cases it was significantly better, but it's always neck and neck with forward selection. I'm always scared that I'm going to forget to turn off no intercept and then give myself something that's worse than, or as bad as, doing the full model.

I'm  always  scared  of  doing  that.

SVEM forward selection with the default settings seems like a good, safe way to go.

Perfect.

Well,  with  that,  we  stand  ready   to  take  your  questions.

Comments
MxAdn

This is an excellent presentation.  I am usually looking for ideas on how to improve my own productivity, and what new methods are worth learning.  Now I am really looking forward to JMP 17 with the automated SVEM!  Also, the discussion on D-optimal vs. space-filling was extremely helpful.