
Converting Binary Responses to Continuous in Product Development Using DOE - (2023-US-30MP-1464)

At HP Hood, the use of design of experiments (DOE) has helped to successfully identify formulations in new product development, delighting consumers of our food products.

 

Throughout our R&D department's wide implementation of DOE, there have been a number of successful DOE models, as well as some unsuccessful ones. One major stumbling block to model building with DOE has been dealing with subjective binary responses such as acceptable/unacceptable. These binary responses provide less information than continuous responses, thus inhibiting the ability to extract meaningful results from designed experiments. 

 

This presentation shares simple and practical strategies for using the JMP DOE platform to convert these binary responses to continuous ones, resulting in improved models and powerful insights. Real-world examples from consumer food products are given to demonstrate how DOE can be used for more than just building models. It can be used to overcome the problem of responses that are difficult to measure.

 

 

This talk is titled Expanded Uses of Converting Binary Responses to Continuous Responses in Consumer Product Development.

It's  a  bit  of  a  mouthful,

but  I  promise  it  won't  be  that  complicated.

My  name  is  Curtis  Park. I'm  a  principal  scientist  at  HP  Hood.

HP Hood is a food and beverage company.

We  make  a  lot  of  different  milks,   nondairy  milks.

We  also  make  yogurt, cottage  cheese,  ice  cream.

So a lot of fun things to taste at work.

I'm  a  food  scientist  by  education.

A few  years  ago I  was  asked  to  take  a  look  at  a  problem

that  we  had  for  one  of  the  beverages that  we  were  producing.

I'm  going  to  show  you a  video  just  so  you  can  see.

But  we  were  getting   a  lot  of  consumer  complaints

and  these  complaints  were  happening

when  the  product  was close  to  the  end  of  shelf  life.

As you see in this video,

it's  pretty  obvious   why  people  were  complaining.

I  think  I  would  complain  if  I  saw  something  like  that  too.

It's  supposed  to  be a  nice  portable  beverage.

It's  thick  and  chunky   when  it's  being  poured  out.

Not  what  I  would  expect.

Believe  it  or  not, this  product  was  not  spoiled.

I  promise  you, it  was  not  spoiled.

So  I  was  asked  to  take  a  look  at  this and  figure  out  how  can  we  fix  it?

What's  the  problem? How  do  we  fix  it?

HP  Hood  at  the  time, this  was  a  few  years  ago.

We  were  early  on  in  our  journey with  using  JMP,

and  so  I  was  really  excited  to  have an  application  to  use  in  real  life

rather  than  just  reading about  it  or  learning  about  it.

Naturally  I  felt  like  this,  like  Yahoo! Let's  run  a DoE,  let's  do  it.

I  was  really  excited

and  for  those  of  you  who  might  not  have as  much  experience  doing DoE,

the  first  step  is  usually  taking  a  look at  what  factors  should  I  be  looking  at.

So  we  did  a  few  experiments.

If  you  can  forgive  me,  they  were  probably one  factor  at  a  time  experiments.

But  we  narrowed  in  on  what  we  believed   were  the  key  ingredients

that  could  have  been  causing  the  problem.

We  ended  up  making  a  design.

This  is  probably  the  fourth  or  fifth  iteration  of  the  design

that  we  came  up  with,

and  this  was  in  custom  design.

So  if  you  go  to  custom  design,

that's the platform that we used to generate this DoE.

As you can see, this is what we had.

So  we  had  ingredients  A,  B  and  C, and  it  was  actually  a  response  surface.

So  we  had  all  of  the  two  way  interactions

and  the  quadratic  terms built  into  the  model.

It  ended  up  being  17  runs, as  you  can  see  here.

It's  17  different treatment  combinations.

This much A, this much B, this much C for each run.
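To make the structure of that response surface model concrete, here is a minimal Python sketch that lists its terms. The factor names A, B and C and the 17 runs come from the talk; everything else is only an illustration of the model structure, not how JMP's Custom Design platform builds the design.

```python
from itertools import combinations

factors = ["A", "B", "C"]

# Terms of a full quadratic (response surface) model in three factors.
main_effects = list(factors)
two_way = [f"{a}*{b}" for a, b in combinations(factors, 2)]
quadratic = [f"{f}*{f}" for f in factors]

terms = ["Intercept"] + main_effects + two_way + quadratic
n_runs = 17  # number of runs in the design described in the talk

print(terms)
print(len(terms), "model terms;", n_runs - len(terms), "degrees of freedom left for error")
```

With ten terms to estimate, 17 runs leave some room to judge how well the model fits.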

Once  we've  settled  on  this  design,

we  were  really  excited   so  let's  go  solve  this  problem.

Piece  of  cake,  right?

You  go  into  the  lab,  into  our  pilot  plant,

you  throw  some  things  together, the  beverage  comes  out.

I'm  making  it  a  lot  more simple  than  it  actually  is.

We  made  17  different  beverages

and  then  we  put  them   on  the  shelf  for  a  little  while

because  as  I  mentioned  earlier,

it  takes  a  little  bit of  time  for  this  problem  to  appear.

We put them on the shelf for a while, and they sat until they were ready to be analyzed.

This  is  just  a  screenshot of  a  data  table.

This has the actual design that we used.

As you can see, there's a column here to the right that I highlighted.

It's our friend, the Y, our response column.

So once we got to the point where we were ready to measure that chunky pour,

now  we  started  thinking,  Oh,  how  are  we  going  to  measure  that?

Because at the time, we did not have a chunky-meter.

I've  never  heard  of  one. I've  never  found  one.

If anyone has ever found one, we'd love to see it and maybe buy one.

But to our knowledge, it doesn't exist.

So  what  options   did  we  have  to  measure  this?

Because if you can't measure it, DoE is really not that useful.

So we had a few options.

First  thing  is  we  can  measure everything  as  a  binary  response.

So it's either pass or fail, good or bad, etc.

There's  some  pros  with  this  and  some  cons.

The  pros  would  be  it's  pretty  simple  to  do,  right?

Anybody can say pass or fail, and it takes you, like, no time to measure it.

However,  it  has  some  serious  cons  to  it.

Such  as, it's  really  subjective  to  the  observer.

What  I  think  is  good,

a  colleague  of  mine  might  think  is  bad.

Or  even  worse,  what  I  think  is  good, my  boss  might  think  is  bad.

So  it's  really  subjective.

While binary responses can give you some information,

they don't give us as much information as we want.

Because  when  you  do  logistic  regression, what  you  get  out  of  it  really  are  just

probabilities  of  something  passing or  probabilities  of  failing.

In  my  experience, that's  been  difficult  to  communicate

and  to  really  understand  what  to  do with  that  data,

especially  when  we're  trying to  communicate  with  non-technical  people.

So if there's any way to get a continuous response,

that's  what  we  strive  for  because they  give  us  a  lot  more  information.

We  can  know  how  good  is  it   or  how  bad  is  it,

because not all "good" samples are created equal.

There's  another  option we  could  have  done

and  I  would  say  this  is  probably  the  best  option

if  you  can  do  it,  is  we  could  run consumer  testing  and  get  consumer  input.

What  this  would  look  like  is  I  have  all our  beverages,  17  beverages,

and  we  recruit  maybe  100, 120  consumers  of  our  product

and  we  have  them  sit  down   and  rate  every  single  one

for  different  attributes,

one  of  them  probably  being  how  well  do  you  like  how  this  pours?

The  reason  why  this  is  a  gold  standard

is  because  those  are  the  people's opinions  who  matter  to  us.

What  we  would  do  is  after  we  get 100  or  120  responses,

you  take  a  look  at  the  data  you  get,

we  can  take  averages  and  put those  averages  into  our  model.

However,

it  can  cost  a  lot  of  money and  it  can  take  a  lot  of  time.

So  if  your  budget  doesn't  allow  it

or  your  timeline  for  whatever  reason   doesn't  allow  it,

you  can't  do  this  for  everything.

Sometimes  the  thing  you're  trying  to  measure

isn't  such  a  huge  problem   that  you're  trying  to  solve

that  it's  worth  spending  all  that  money.

But  it  would  still  be  important to  be  able  to  measure  it.

Do  you  have  any  other  options?

I  mentioned  this  earlier.

You  can  find  an  instrument  that   can  measure  what  you're  looking  for.

Sometimes  they  exist.

Like I said, I don't know of a chunky-meter.

I  looked  in  our  warehouse  in  our R&D  center,  couldn't  find  one.

Even  if  you  can  find  one,

if  this  is  something   that's  really  specialized,

you're  not  going  to  use  it  very  often.

It  doesn't  make  sense  to  buy  the  piece of  equipment  or  it  could  be  something

that  would  be  really  great, but  it  requires  a  lot  of  expertise

that  maybe  your  R&D, your  technical  department  doesn't  have  or

just doesn't have the time or resources to deal with.

I'm  going  to  show  you  the  last option  we  have  here.

What  I'm  going  to  say  is  training  a  group of  people  how  to  rate  that  attribute

of  interest  and  then  let  them give  you  all  the  ratings.

This is not quite as good as having actual consumers.

But  here  we're  trying   to  take  subjectivity  out  of  it

and  make  it  objective.

When well trained, humans can be great measuring instruments.

I'm  going  to  walk  you  through   what we've  done  at  Hood

when we have a hard-to-measure attribute.

We're  going  to  use  the  case study  of  this  chunky  pour.

This  is  our  roadmap.

I'll  walk  you  through  this   and  then  we'll  actually  do  it  live.

The  first  thing  I  wanted  to  get  across

is  that  the  samples  that  you  produce from  DoE  can  be  used  for  many  purposes.

I  like  to  tell  people   that  your  samples  are  like  gold

and  you  should  treat  them  like  gold.

They're  very  valuable.

You  may  do  a DoE  thinking  that   you're  trying  to  answer  one  question,

but  something  else  might  pop  up  later

that  you  would  be  able  to  use  those samples  to  answer  that  question  as  well.

I've  had  that  happen  to  me  many  times, so  sometimes  it's  good  to  think  about

just  ask  yourself  the  question.

I've  done  all  this  work  to  make 17  different  beverages.

What  else  can  I  do  with  them?

What  else  can  I  learn?

In  our  case,  we  use  these  samples  as  a  "calibration  set"

so  that  we  can  teach  our  humans,   my  colleagues,

how  to  measure  this  chunky  pour.

So  here's  our  method.

The  first  thing  we  do  is  we  review

all the samples with a small group, maybe 1 or 2 or 3 people that are

really  knowledgeable  on  the  subject or  are  responsible  for  the  project.

What  you  do  is  you  look  at  all  the  samples

and decide which samples should be used to train the raters.

We're  trying  to  build  a  scale  essentially, and  then  we'll  take  that  scale

and we'll get our friends, let's say 10, 15, 20 friends, to actually rate

these samples for us after we've trained them.

Training is step two.

Having them rate each video is step three.

It could be a video, or it could be something else, a picture,

or  it  could  be  actually  them pouring  out  the  product

if  you  have  enough,  etc.

You  can  get  the  idea.

Next,  we'll  take  the  average of  all  those  ratings.

We'll  look  at  the  data,  make  sure there's  nothing  funky  in  there

and  then  we  will  use  those average  values  to  build  a  model.

Let's  start  with,  oops.

Let's  start  with  steps  one  and  two.

So we're going to assume that we've looked at all the videos,

and  the  way  we  typically  do  it because  it's  a  little  easier

is you start off answering the question, which one is the lowest in chunky pour?

That would be this one right here, number one.

I'm going to play each one of these.

Just to make it clear, this is our scale.

It's  a  continuous  scale  from  1  to  10 and  the  1  to  10  is  kind  of  arbitrary.

If you have something that works better for you, then great.

The  video  right  above it  corresponds  to  that.

So  this  first  video  corresponds  to  a  one.

So as you can see while we're watching this video,

it pours nicely, with no rippling and no chunkiness.

Pours as expected. Beautiful.

That's the easy sample to identify,

and then in the sample set, we ask ourselves, okay, which one is the worst?

In  this  case,  it  was  pretty  obvious.

I  will  tell  you  again, this  product  is  not  spoiled.

This is just from changing a few ingredients.

You can see it's so thick, we can't even get it out of the bottle.

So  that's  obviously  a 10.

Then  we  did  a  little  bit  of  work

to  try  to  figure  out,  okay,  which  one should  we  consider  to  be  a  five?

So  halfway  in  between.

This  one,  you  can  see  it  still  flows, but  there  is  chunkiness  to  it.

Then  maybe a  two  and  a  half  would  be  this  one.

See  it  has  a  little  less  chunkiness  to  it.

Flows  well,  probably  with  normal  shaking.

It'd  probably  be  fine.

So  there's  a  little  bit  of  subjectivity,

but  you  add  more  people to  make  it  more  objective.

Then  the  last  one.

This  is  seven  and  a  half.

So  you  can  see  it's  very,  very  chunky.

The  only  thing  that  really  is differentiating  it  from  number  ten  is

that we can get it out of the bottle; it still flows.

But  as  you  can  see,  it's  pretty  thick.

Basically, in this amount of time,

I could train the people that are going to help us

to measure this chunky pour.

Then  we'll  have  them  rate once  we've  trained  them.

I'll  basically  do  what  I  just  did.

Maybe  we'd  take  a  little  bit  more  time

to  be  more  specific  with  certain things  we  want  them  to  be  looking  for.

If  what  you're  having  someone  rate   is  a  lot  more  complicated,

then you'll probably need to take more time training people.

This one wasn't very complicated

and  we're  really  just  looking for  people's  first  impression.

After that, you have them rate all the videos.

I like to use Microsoft Forms just because it's easy and I can get the data

really  quickly  and  easily, but  you  can  use  whatever  you  want,

including  paper,  although  that  takes more  time  and  I  try  to  avoid  that.

Just to show you what our Microsoft Forms look like,

Here's  a  preview  of  it.

This  is  as  if  you're doing  it  on  your  phone.

I  like  to  make  everything   as  simple  as  possible,

and everybody always has their phone, so they can do it on a phone.

That's  my  goal.

I'm just calling it Chunky Pour DoE,

and then they just go through and rate each one.

So chunky pour for treatment one:

I'll say, I don't know, that one was a six,

and  we're  just  asking  people for  the  first  impression.

There's  no  right  or  wrong  answers.

Usually  people's  first impression  is  right.

So  that's  why  I'm  asking  people not  to  think  too  hard  on  it.

Maybe  number  two  is  a  ten, and  number  three  was  a  three.

I  don't  know.

They  would  go  through  all  of  those.

Then  we  would  get  our  data   and  then  using  JMP

we  would  average  all  those  ratings

and  then  we  put  the  data  into   the  data  table  to  build  the  model.

So  we're  going  to  get  out  of PowerPoint  for  a  second  and  we'll  go

to Excel for a second.

This is what I get when I export the data from Microsoft Forms.

Like  I  said,  you  don't  have  to  use  this, use  whatever  works  for  you.

As you can see, ID is the rater number,

just  a  random  number,

not  random,  but  just an  identifier  for  each  person.

I left it anonymous so we don't criticize people who maybe didn't do as well as everybody else.

In this case, this data is actually real.

I took this to a college food science class and had them do this.

So these are actual college students rating the videos.

And  as  you  can  see,  we  have  all  these columns,  a  column  for  each  one.

So person one rated treatment one an eight,

they rated treatment two a four, treatment three a nine, etcetera, etcetera, etcetera.

So we want to put this into JMP.

I like to use the JMP add-in, in Excel right here.

As long as you're only highlighting one cell

and you click Data Table, it'll import everything.

I've  noticed  that  sometimes

I'll  accidentally  have  like  just  a  portion  of  the  data

highlighted, and if you click Data Table now,

it's  only  going  to  import what  you  highlight.

So  either  highlight  everything or  only  highlight  one.

Once  you  hit  that  data  table  button,

you  will  get  something  like  this.

So  this  is   our  data.

Just to show you where we're trying to get

to in the end with this data table, because we have to manipulate it a little bit:

This is our data table for the DoE.

For each run, it has

how much of ingredient A, B and C was in there.

We'll talk about this in a minute,

but I also put in my own judgment of whether I thought something passed

or whether I thought something failed.

In  the  end,  we  need  one  more column  that  says  Chunky  pour.

We'll  call  it  continuous.

And we'll have an average rating for run one.

Average rating for one, two, three, four, five, etc.

If we look at this data table, as it is today

it is not in that format, because we need all these

columns to be rows, and we need the ratings to be in one column.

There's  probably  a  thousand  different ways  we  could  do  this  in  JMP

and  they're  all  good and  they're  all  correct.

I'm  going  to  show  you  one  way  to  do  it.

It's  just  the  one  that  works  for  me.

First,  what  we're  going  to  do  is  we're  going  to  stack

all  of  the  columns on  top  of  each  other.

Then  we're  going  to  do  a  summary  table

that  has  the  average  and  maybe  we'll  also add  in  the  standard  deviation  for  fun.

But  the  very  first  thing  that  I've  always been  taught  to  do  is  when  you  get  data,

you want to graph the data and look at the plot.

So  we're  going  to  actually  look at  the  distribution  really  quickly.

So  if  we  go  to  analyze.

There  we  go. Analyze  distribution.

We  want  to  look  at  the  distribution for  all  of  the  treatments.

I'm  just  going  to  highlight  them.

Go  to  the  columns  and  say,  okay.

I'm  just  looking  to  see  is  there  anything

weird  about  this  data  that  we should  be  concerned  about?

So we can see for 1, 2, 3, etcetera,

I'm  looking  for  outliers,

like  for  example,  three,  everybody rated  this  sample  between  1  and  6.

There  was  someone  up  here   who  rated  it  really  high,

and  there's  also  someone up  here  that  rated  this  one  high.

So  what  I  like  to  do is  if  you  click  on  this,

it'll  highlight  where...

So this row represents one rater, one person.

So  I'm  going  to  see  how they  rated  everything

and  you  can  see  they tend  to  be  an  outlier.

The nice thing in JMP is that once you highlight one row,

it will highlight it for all the other responses.

So I can see that, yeah, they rated 3 as higher, 4 as higher.

We  go  down,  look.

Terminate. They're  opposite  of  everybody.

It  seems  like  for  some  reason

during the training, they got a little confused

and  they  thought  higher  number  meant lower  chunkiness  and  vice  versa.

So   what  I'm  going  to  do  is since  I  have  this  row  highlighted,

I'm  going  to  close  this, it'll  stay  highlighted.

So  this  is  row  one.

I'm  just  going  to  delete  this  data

and  then  we'll  move  on.
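The talk finds the reversed rater by eye in the Distribution platform. As a complementary check, here is a minimal Python sketch that flags any rater whose ratings correlate negatively with the average of everyone else. The rater names, the ratings, and the zero-correlation threshold are all hypothetical assumptions for illustration; this is not part of the JMP workflow shown.

```python
from statistics import mean

# Hypothetical ratings: one list per rater, one value per treatment.
ratings = {
    "rater_1": [9, 8, 2, 1, 6],   # appears to have used the scale backwards
    "rater_2": [1, 3, 8, 10, 5],
    "rater_3": [2, 2, 9, 9, 4],
    "rater_4": [1, 4, 7, 10, 6],
}

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def flag_reversed(ratings):
    """Return raters whose ratings correlate negatively with the mean of the others."""
    flagged = []
    for rater, own in ratings.items():
        others = [r for name, r in ratings.items() if name != rater]
        consensus = [mean(col) for col in zip(*others)]
        if pearson(own, consensus) < 0:
            flagged.append(rater)
    return flagged

print(flag_reversed(ratings))
```

A flagged rater is only a prompt to look at the raw ratings, as was done in the talk, before deciding whether to exclude that row.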

Now we feel pretty comfortable that the data is pretty much solid.

Like  I  said,  we're  going to  stack  the  columns.

So we go to Tables, Stack.

It's  going  to  pop  up and  we  just  want  to  stack

all  17  of  the  treatments.

The  nice  thing  is  in  JMP  17, now  you  get  this  preview.

I  love  the  preview so  then  I  know  if  I'm  doing  things  right.

What we see here is that

it'll have the ID, so the rater, and then the chunky pour rating for treatment 1.

They gave it a five, and they gave number two a seven.

This  is  how  we  want  the  data  structured and  we  can  change  the  column  names.

So instead of Data, we're just going to say Chunky Pour Continuous.

Then  for  label,  I'm  just  going  to   call  it  run  because  that's  really

what  we're  going  to  use this  for  in  a  minute.

I  just  stack  it.

So  I  say,  okay,  that's  how  I  want  it.

Now we have the data table in this format, so it lets us use a summary table.

Summary tables are a nice way to

make a table of different statistics.

So  what  we're  going  to  do  is  we're  going  to  highlight

the Chunky Pour Continuous column and, under Statistics,

choose Mean.

For  fun  in  case  we  want  to  use  it, we'll  also  say  standard  deviation.

This  just  gives  us  the  overall mean  and  standard  deviation.

But  if  we  want  to  do  it  per  run,

I'll highlight Run and put it here in Group.

Now  when  we  look  at  this  preview, we  have  one  through  17

and  conveniently,  they're  in  order.

One,  two,  three,  four, five,  six,  seven,  eight.

All  the  way  to  17.

We  have  the  mean and  the  standard  deviation.

So  we're  going  to  say,  okay.

Okay,  so  we  have  one  more  table.

Now we're at the point where we need to be,

because  I  have  each  run  as  a  row

and have a column for the average and a column for the standard deviation.
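For readers working outside JMP, the same stack-then-summarize reshaping can be sketched in plain Python. The three raters, three treatments, and their ratings below are hypothetical; the column names "Run" and "Chunky Pour Continuous" mirror the names chosen in the talk.

```python
from statistics import mean, stdev

# Hypothetical wide-format export: one row per rater, one column per treatment.
wide = [
    {"ID": 2, "Treatment 1": 5, "Treatment 2": 7, "Treatment 3": 3},
    {"ID": 3, "Treatment 1": 6, "Treatment 2": 9, "Treatment 3": 2},
    {"ID": 4, "Treatment 1": 4, "Treatment 2": 8, "Treatment 3": 4},
]

# Step 1: stack the treatment columns into long format (one row per rating).
stacked = [
    {"ID": row["ID"], "Run": int(col.split()[-1]), "Chunky Pour Continuous": value}
    for row in wide
    for col, value in row.items()
    if col != "ID"
]

# Step 2: summarize by run with the mean and standard deviation.
runs = sorted({r["Run"] for r in stacked})
summary = []
for run in runs:
    values = [r["Chunky Pour Continuous"] for r in stacked if r["Run"] == run]
    summary.append({"Run": run, "Mean": mean(values), "Std Dev": stdev(values)})

for line in summary:
    print(line)
```

The result has one row per run, which is the shape needed to paste the averages next to the design factors.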

So  what  I'm  going  to  do  is  I  will  highlight  this  column.

I go to Edit, Copy With Column Names,

and  then  I'm  going  to  go to  our  original  data  table.

We're  gonna  make  a  new  column  here

and say Edit, Paste With Column Names.

There  it  is.

I  should  have  done  both  of  those at  the  same  time,  but  I  didn't.

So we're going to do this one as well.

Okay,  so  now  we  are  ready to  do  our  modeling.

So the first thing I want to show you

is what we would get if we just used pass/fail, our binary response.

What we'll do is go to Analyze, Fit Model.

Because I made this design in JMP in the Custom Design platform,

it  automatically  knows what  kind  of  design  this  is

so  that's  why  my  model  is  already  built.

There is also a really convenient way,

if you knew this was a response surface design.

Let's say this wasn't here.

The macros are convenient.

If I highlight ingredients A, B and C

and say Macros, Response Surface,

it pulls it all up. It already knows what I'm looking for.

So that's helpful.

I put the response variable, Chunky Pour Pass/Fail, in the Y role.

What  it  gives  us  is  nominal  logistic.

I'm not a statistician,

so  I'm  not  going  to  go  into  any of  the  statistics  behind  what  it's  doing.

I'm just going to show you what you get out of it

and what a scientist might be looking at.

Before I say Run, note that our target level is Pass.

So when it gives probabilities, it's the probability of passing.

So  we  say  run.

This  is  what  we  get.

So, I mean, the first thing that a scientist like

myself would probably look at is this effects summary.

I'm probably looking at P values and I say,

well,  nothing  significant  except  ingredient  A.

There are other things that we would look at,

but I'm going to skip over that.

We're not going to cover that today.

Instead, I want to just look at the profiler,

because, at least in our experience,

we find the profiler to be the most useful and easiest to interpret

for  the  scientists  and  when they're  communicating  with  others.

So here (I'm going to make it a little bigger),

on the left,

we get a probability of failing and a probability of passing.

So if we have 0.13 of ingredient A, 0.12 of ingredient B,

and 0.45 of ingredient C

(and it's actually 0.13%, 0.12%, 0.45%;

I just didn't change it.

It's a very, very small proportion of the formula that we're changing anyway),

at those levels,

this says 100% of the time we're going to pass.

If I move it up, let's say to have, like,

0.2 of this ingredient,

now it looks like we're going to pass only 64% of the time.

You can see these curves:

if I change ingredient B a little bit, and ingredient C,

maybe we can get back up to a point where we pass 98% of the time.

You  can  play  around  with  this.

But the problem with this is, like I said earlier,

that this pass right here is maybe not the same as passing over here.

However,  we  don't  really  know  that  with  this  information,

and it's kind of a hard thing for some people to wrap their head around,

like  it  was  just  probability  of  passing.

What do I do if all I can get is an 85% pass rate?

Like,  let's  say  hypothetically, this  was  the  best  we  could  do.

What  do  I  do  with  that?

So  that's  why  we're  looking at  continuous  responses.
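A small Python sketch can show why a pass/fail label carries less information than a rating. The mean ratings and the cutoff of 5 below are hypothetical assumptions; in the talk the pass/fail column was the speaker's own judgment, not a cutoff applied to the ratings.

```python
# Hypothetical mean ratings on the 1-to-10 chunky pour scale, and a hypothetical
# cutoff: anything at or below 5 is called a pass.
mean_ratings = {"run_a": 1.2, "run_b": 4.8, "run_c": 5.3, "run_d": 9.6}
CUTOFF = 5.0

binary = {run: ("Pass" if y <= CUTOFF else "Fail") for run, y in mean_ratings.items()}
print(binary)

# run_a and run_b get the same label even though they differ by 3.6 scale points,
# while run_b and run_c get different labels although they differ by only 0.5.
gap_within_pass = mean_ratings["run_b"] - mean_ratings["run_a"]
gap_across_cutoff = mean_ratings["run_c"] - mean_ratings["run_b"]
print(round(gap_within_pass, 1), round(gap_across_cutoff, 1))
```

Two samples that both "pass" can be far apart on the scale, which is the distinction the continuous response keeps and the binary one discards.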

I'm  just  going  to  close  this   and  we're  going  to  do  that,

build  that  model  again, except  let's  do  it  for  the  mean

of  our  continuous  scale.

So  we're  going  to  have  to  remove  chunky  pour

and  we're  going to  add  the  average  here.

We're  just  going  to  say  run. Keep  it  simple.

Do  the  effects  screening  report.

Now  you  can  see   there's  a  lot  more  information  going  on

that  we didn't  get  before.

So  where  before,  if  you  remember,

all  we  saw  was  that  ingredient  A  had  a  really  low  P  value.

Everything  else  was  like  0.99.

The  conclusion  was ingredient A  does  everything.

Well,  it's  not  actually  the  whole  truth, as  we  can  see  here.

Yes, ingredient A is the most important:

the main effect of ingredient A, right here.

But B and C also have a role to play.

While not as big, it's still an important role.

So  we  look  at  our  actual  predicted  plot.

It  looks  pretty  healthy.

Our lack of fit looks good.

I'm not going to go into all the details of everything that we look at,

mainly because I'm not a statistician.

That's  just  what  I  look  at. I'll  look  at  the  lack  of  fit.

I'll  look  at  the  residuals  to  see  if

there's  anything  weird, the  studentized  residuals.

Then  really,  I  come  to  the  profiler

and  now  you  can  see   this  gives  us  a  much  different  picture,

much  more  complete  picture, where  as  I  increase  ingredient  A,

the chunky pour increases, but increasing these other ones does too.

So they also have a role to play.

If  we  were  to

say that we want to minimize it, I think it's pretty obvious what the

desirability is going to come out to be.

But just to show you,

you go to the red triangle by the Prediction Profiler,

Optimization and Desirability, and we're going to turn on Desirability Functions.

Then  here,  this  is  the  desirability.

I  find  it  useful.

You  can  change  it  in  the  red  triangle,

but  I  find  it  easier  if  you  just hit  control  and  then  click  on  it.

Now  we  can  change  what  our  goal  is.

So  in  this  case,  we  want  to  minimize  this because  we  don't  want  it  right?

We  don't  like  chunky  pour .

Consumers  don't  like  it  either.

So  we're  just  going  to  say  minimize  and  okay.

Now we can go back to that Optimization and Desirability menu

and say Maximize Desirability.

It does what I thought it was going to do.

It says, take these two ingredients out.

Put  this  one  as  low  as  you  can.

You'll get the lowest chunky pour that you can.
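The idea of a "minimize" desirability goal can be sketched as a function that maps a predicted response to a score between 0 and 1. The linear form below is a common textbook simplification and an assumption here; JMP's own desirability functions are defined differently, and the `low` and `high` limits are hypothetical values matching the 1-to-10 scale.

```python
def desirability_minimize(y, low, high):
    """Linear 'smaller is better' desirability: 1 at or below low, 0 at or above high."""
    if y <= low:
        return 1.0
    if y >= high:
        return 0.0
    return (high - y) / (high - low)

# Hypothetical predicted chunky pour ratings on the 1-to-10 scale.
for y in (1.0, 3.25, 5.5, 10.0):
    print(y, desirability_minimize(y, low=1.0, high=10.0))
```

Maximizing this score over the ingredient levels is equivalent to looking for the formulation with the lowest predicted chunky pour.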

In  reality,  we  had  some  other  constraints, so  we  couldn't  do  that.

There  were  other  factors  at  play, but  this  definitely  gave  us

a  really  good  idea   of  where  we  needed  to  go,

what  was  important  and  how  do  we control  this  chunky  pour

to  the  point  where  when  we  implemented the  changes,  the  complaints  went  away.

It's  been  good  ever  since.

That, in a nutshell, is the story

of  how  you  could  take  something that  is  hard  to  measure.

It's  really  subjective.

It's binary, so pass/fail or good/bad,

and  you  can  convert  it into  something  that's  continuous.

It's  a  relatively  simple  method.

You  can  use  it  for  a  number  of  things.

As  long  as  you  have  people  available  to  help  you  out,

you can measure a lot of things that could be considered hard to measure.

Where   do  we  go  from  here?

At  Hood.

Just  to  give  you  an  example  of  some other  things  that  we  encountered.

This  one,  the  Chunky  Pour,  is  actually one  that's  a  little  easier  to  do.

But this is another product we were working on a long time ago, where

let's  say  you  have  coffee   and  you're  going  to  add  some  foam  to  it

and  you  want  to  understand  how  well  does that  foam  dissipate  into  the  coffee?

That's a tough thing to measure.

We  definitely  don't  have  any  instrumentation

that  can  really  measure  it.

Videos  really  helped  us to  understand  how  we  could  measure  it

and get some useful information out of it.

As  you  can  see,  we're  trying  to  measure how  does  that  look?

How  well  does  it  move  that  one  versus, let's  say,  this  treatment  over  here?

You  can  see  they're  quite  different.

Where  one  moves  really  fast, the  other  moves  really  slow.

This  one  looks  kind  of  chunky the  other  one  didn't  so  much.

That's how we use it. We use it quite often.

I  appreciate  you  taking the  time  to  listen  to  my  talk.

I hope that this has been useful

and that you'll be able to find a way to implement it

in your day-to-day work.

Thank  you.
