Quality is a top concern for most manufacturers. Within an established sampling mechanism, it is vital to be able to tell how likely it is that a set of good samples ("heroes") indicates that the entire batch or crate is good. In this presentation, we provide a distribution analysis strategy to help answer this question through methods such as modeling, simulation, probability analysis, and data visualization. We also demonstrate how to perform this analysis and develop an end-to-end application using JMP scripting and the user interface. The strategy is evaluated on an induced data set of real-world product samples. It provides a valuable strategy and tool for evaluating current product quality and for decision making so that the process can be improved.

Hello everyone. Today my topic is Heroes or Zeros: A Product Distribution Analysis Using JMP.

First, a little background. An organization with an established process may decide to implement process control and process discipline. For example, when a product moves from the development stage to the mass production stage, one problem that can arise at this juncture is a process variation issue: the variation may be too large, or it may not meet expectations. The variation here can be variation of the mean, variation of the standard deviation, and so on. We want to find the root cause of such a variation problem and fix it. But before that, we need to figure out what type of variation we are facing, because the type of variation dictates what kind of action and investigation strategy we should take.

Today's demonstration investigates this through an exploratory analysis, using the standard deviation as the statistic of interest. The issue is that we have a process with a high overall standard deviation, but we can also observe some batches with a lower standard deviation. We call these the hero batches. We want to find out what caused the high overall standard deviation, but first we need to figure out what kind of process variation we are facing, that is, what kind of process variation could produce what we observed.

In general, there are two types of situation. One is that we have a completely random process and the variation is systemic. Although the process is random, depending on how we batch it and how we sample it, some batches may have a lower sample standard deviation than others. The other situation is that our process is not random: it goes up and down and has mean shifts. Depending on how we batch it, some batches that fall in a stable period will have a relatively low sample standard deviation, while batches that fall in an unstable period may have a larger standard deviation. We can also define a threshold on the standard deviation, such as point A here. Comparing this threshold to the batch standard deviations tells us how many batches satisfy the criterion.
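As an illustration of these two scenarios (not the presenter's JSL application), here is a minimal Python sketch with made-up numbers: a purely random process and a process with mean shifts can both show a high overall standard deviation, yet differ sharply in how many batches pass a standard-deviation threshold.

```python
# Illustrative sketch: "hero" batches under a random process vs. a mean-shifting process.
# All values (threshold, means, sigmas) are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n_batches, batch_size = 40, 25
threshold = 0.8  # "point A": maximum acceptable batch standard deviation (assumed)

# Scenario 1: completely random process with large systemic variation
random_process = rng.normal(loc=10.0, scale=1.0, size=(n_batches, batch_size))

# Scenario 2: non-random process whose mean drifts up and down over time.
# Batches inside a stable period have low SD; batches straddling a shift do not.
t = np.arange(n_batches * batch_size)
mean_profile = 10.0 + 4.0 * np.sin(2 * np.pi * t / 200)
shifted_process = rng.normal(loc=mean_profile, scale=0.5).reshape(n_batches, batch_size)

for name, data in [("random", random_process), ("mean-shift", shifted_process)]:
    batch_sd = data.std(axis=1, ddof=1)           # sample SD of each batch
    passing = np.mean(batch_sd <= threshold)      # fraction of "hero" batches
    print(f"{name:10s} overall SD = {data.std(ddof=1):.2f}, "
          f"batches passing threshold = {passing:.0%}")
```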
With these two scenarios in mind, we can formulate a statistical hypothesis test to determine which kind of process variation we are dealing with. We assume our process is random and ask: how likely is it that we would observe what we actually observed? A more detailed statement is this: assuming that batches with low standard deviation are just due to sampling luck, and that the historical data is representative of the population, then simulated batches generated from the same distribution should have a passing rate that is statistically indistinguishable from the actual passing rate of the historical data.

On the right-hand side you can see a wheel showing the procedure we went through to make this test happen. First, we define a threshold. With this threshold we can calculate the passing rate: we compare the batches in the historical data to the threshold to get the percentage of historical batches that are good. Because we also assume the process is random, we fit the historical data to several distributions and pick the best-fitting one. Using this fitted distribution, we generate a set of K samples, where K is the number of samples in each batch of the historical data. We repeat this procedure N times, where N is the number of batches in the historical data. For each simulated batch we calculate its sample standard deviation. Comparing these sample standard deviations to the threshold gives us a set of binomial (pass/fail) data. With this binomial data and the passing rate we already have, we can perform a one-sample proportion test to test our hypothesis.

Using JMP, we were able to integrate this entire procedure into an application. Here is a quick demonstration of how it works. The application can import any data file with a value column and an index column that indicates the batch index. With the click of a button, it automatically fits the data to several distributions and picks the best one; right now, the best-fitting one is a normal distribution. We can then set the number of simulated data sets we want, the size of each set, and the threshold. When we click, it performs the hypothesis test I mentioned before. It also shows the percentage of historical batches that are good and the percentage of simulated batches that are good. Finally, it shows a histogram visualizing the proportion of simulated batches that are good.

Now we go back to the hypothesis test. The data here shows that we reject the null hypothesis: checking the p-value, we reject the null hypothesis at 95% confidence, which is the default setting. This conclusion suggests the process is not random and that the good batches do exist in the stable period of the process. It can lead to several action items; for example, we can investigate the process variables and process parameters between the stable period and the unstable period and see what changed.
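For readers who want to see the procedure end to end, here is a hedged Python analogue of the workflow described above (the presenters implemented it as a JSL application with a user interface). The data, the threshold, and the choice of a normal distribution as the "best fitted" one are all assumptions for illustration.

```python
# Sketch of the procedure: threshold -> historical pass rate -> fit distribution ->
# simulate N batches of K samples -> one-sample proportion test. Data are made up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
threshold = 0.8  # hypothetical SD threshold

# Hypothetical "historical" data: 40 batches of 25 samples; half of the batches
# come from a period with a shifted mean, i.e., the process is NOT random.
stable = rng.normal(10.0, 0.5, size=(20, 25))
unstable = rng.normal(12.0, 1.5, size=(20, 25))
historical = np.vstack([stable, unstable])
N, K = historical.shape

hist_pass_rate = np.mean(historical.std(axis=1, ddof=1) <= threshold)

# Assume the process is random: fit one distribution (normal, for this sketch) to all data
mu, sigma = stats.norm.fit(historical.ravel())

# Simulate N batches of K samples from the fitted distribution
simulated = rng.normal(mu, sigma, size=(N, K))
sim_pass = simulated.std(axis=1, ddof=1) <= threshold   # binomial pass/fail data

# One-sample proportion test of the simulated pass count against the historical rate
result = stats.binomtest(int(sim_pass.sum()), n=N, p=hist_pass_rate)
print(f"historical pass rate = {hist_pass_rate:.2f}, "
      f"simulated pass rate = {sim_pass.mean():.2f}, p-value = {result.pvalue:.3g}")
# A small p-value (< 0.05) rejects the "purely random process" hypothesis.
```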
Of course, we can also get a different test result where we cannot reject the null hypothesis. This suggests our process might be random and that we might have systemic variation, which leads to a completely different investigation and action plan. In the worst-case scenario, in order to reduce the systemic variation, we might need to completely change the manufacturing environment.

With this, I conclude today's presentation. I also want to thank John Daffin, a colleague of mine, who brought this interesting question to my attention during a project meeting. Thank you for listening to my presentation; I very much appreciate it.
Disease often shows a predilection for specific site locations. To demonstrate the body regions involved, researchers may use infographics or tables. While tables may be more descriptive and universal to any statistical software, heat maps rapidly communicate disease involvement across a cohort. At our institution, software to portray data within the confines of a human body is not readily available. Additionally, we needed the human body to be segmented into specific regions. While a graphic editor can be used, it is labor intensive, limits generalizability, and cannot be repeated with different data sets.

To create a scalable solution, we used the Custom Map Creator in JMP, an add-in to Graph Builder that allows users to create custom shapes. With this program, we outlined a human body, created anatomic regions, and colored each region based on the percentage of occurrence. The selected color range is scaled to correlate directly with the data. Each anatomical subunit is then filled and shaded by the data in the respective column. Heat maps provide easy interpretation and clearly portray the most involved areas for rapid communication to readers.

Today I want to give a short presentation about using the JMP Custom Map Creator and how we used it to develop a human heat map. I'd like to give a shout-out to my mentor, Dr. Benjamin Caffender, who helped push this project along.

A little bit about the Custom Map Creator: it allows you to create custom shapes that can be displayed in Graph Builder. This lets you take statistical data, percentages in particular, and present them in a color-coded, color-referenced way. JMP comes with built-in maps of the United States and other places in the world, so you can show, for example, what percentage of some outcome is occurring in each state. That gives you the heat-map approach, but the add-in also allows you to create other objects as well.

For us, this was a dermatology project on cutaneous disease, so we wanted to show lesion involvement, that is, which parts of the body were affected by the disease. We had the face at 90%, the scalp at just under 20%, and so on, and we wanted to show this dynamically. We ran into the question of the best way to go about it, did some digging, and found the Custom Map Creator. What we did was take an outline of an image; you can see all these dots, where we traced the image and created our different body areas, and JMP sets those into specific shapes where you can input data that will appear in those areas. You can use whatever picture you want; this is an example of one that we used. Then the data gets incorporated into those areas. You can see on the left of our image that the face has significant involvement, shown with the darker red, which corresponds to the 90% involvement, versus lighter areas such as the trunk or upper extremities. The Custom Map Creator lets you create those areas and then enter those percentages very easily.
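JMP's Custom Map Creator does all of this inside Graph Builder; purely to illustrate the underlying idea (a vertex table defining each named region plus a value table, with regions filled on a shared color scale), here is a rough Python/matplotlib sketch with invented coordinates and percentages.

```python
# Conceptual analogue only: region outlines + per-region values -> filled heat map.
# Coordinates and percentages are made up for illustration.
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
from matplotlib import cm, colors

# "Shape" table: vertex coordinates for each named region (hypothetical outlines)
shapes = {
    "face":  [(4, 9), (6, 9), (6, 11), (4, 11)],
    "trunk": [(3, 4), (7, 4), (7, 9), (3, 9)],
    "legs":  [(3.5, 0), (6.5, 0), (6.5, 4), (3.5, 4)],
}
# "Name"/value table: percentage involvement per region (hypothetical data)
involvement = {"face": 90, "trunk": 35, "legs": 15}

norm = colors.Normalize(vmin=0, vmax=100)
cmap = cm.YlOrRd                     # light-to-dark scale, like the yellow-to-red gradient

fig, ax = plt.subplots()
for name, verts in shapes.items():
    ax.add_patch(Polygon(verts, facecolor=cmap(norm(involvement[name])),
                         edgecolor="black"))
ax.set_xlim(0, 10); ax.set_ylim(0, 12); ax.set_aspect("equal")
fig.colorbar(cm.ScalarMappable(norm=norm, cmap=cmap), ax=ax, label="% involvement")
plt.show()
```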
Rather than just talking about it, I'd like to show you a little demo of how it works. I already have JMP open. Once you install this add-in, it appears under Add-Ins, then Map Shapes, then Custom Map Creator. When we open the Custom Map Creator, we get these data tables that pop up as well as the map creator itself.

What you do is paste an image in. I have a file over here that I will just drag and drop in; you can also copy and paste, however you like. You can adjust and zoom to get it exactly how you want, and give it a name. Now you start drawing and tracing; I'll just quickly click through here. That's one shape. We hit Next Shape. You can change the names of your shapes to whatever you like; when we were using it, we had different body areas. If you look on the right at the data tables, in this example X-Y table we see the specific data points, the dots being graphed, and over here the shape IDs are being recorded. You can see up here I'm working on shape five. You hit Next Shape, and as you do, you get the next shapes. These two data tables are linked.

Let's say I've finished my image and have everything traced. You can hit Finish, which just saves it, or you can finish with random data, which I find is an easy way to show it and to confirm the data shows up in the right place. Hit Open. Now we have our graph. We can see the data it gave us over here: shape one, I'll drop that down to 1, and shape five I'll set to 100. You can see that shape five, at 100, is our red and has the most involvement, and down here shape one, at the lower end, is the darker blue. So you're able to get that stratification. You can either use your own data table or just type your own data into these shapes, and it will adjust in real time.

What's really nice about this add-in is that you can go right here and change the gradients. Right now we have red to blue; you can change it to yellow to red, so the higher values get the darker colors and the lower involvement gets the lighter colors, really easily.

There are two big things that are important to know as you do this. If you are using your own data table, the shape-name column needs the Map Role column property. We see here that this map-name data table is connected to our example -Name.jmp file, which is this file over here. It is essential that those are linked. If you finish with random data, this is done for you; if you have your own data table, you have to set it manually. Also notice that if you manually create your own data table, shapes 1 through 5 here must match shapes 1 through 5 there exactly: the names have to be exact and the roles have to be exact.
If you do that, the data will populate really easily. You can also get these data tables through Graph Builder as well. That's a quick overview of how to use this add-in. It's very versatile, and as I said, we were able to use it for our project. I hope this gave you a brief bit of guidance. I've also uploaded some instructions on how to use it, which you can find as well. I hope you have a great day and enjoy the rest of the presentations.
X-ray photoelectron spectroscopy (XPS) analyzes the surface chemistry of materials. It is also known as electron spectroscopy for chemical analysis (ESCA) and is commonly used to measure the elemental composition/stoichiometry of thin-film coatings in different industries. We have applied this technique to develop Atomic Layer Deposition (ALD) AlOx coating processes (< 50 nm thick) by testing the deposited film composition to identify the O/Al ratio.

Our pharmaceutical program customers questioned whether XPS is an appropriate metrology technique to detect process variation in the coating composition. This presentation demonstrates the adequacy of XPS by using Gauge R&R in JMP 17. We designed our testing experiments using the measurement systems analysis (MSA) design platform, and a fast-replicate crossed model was used with six sample coupons on two different kinds of substrates (silicon coupons and active pharmaceutical ingredient [API] pellets). Each sample was split into four parts, and all 24 (6 x 4) samples were measured independently and blindly by two different vendors.

The data distribution was reviewed using a variety of methods: X-bar and R control charts; repeatability, reproducibility, and part-to-part variation testing; and Gauge R&R ratios (P/TV, P/PV, P/T) calculated in the MSA Gauge R&R platform. The Evaluating the Measurement Process (EMP) platform was also used to determine the intraclass correlation (ICC) and to identify whether any interaction exists with either substrate type or vendor. Both MSA platforms confirmed that part variation is significantly higher than the precision level; hence, XPS is adequate to detect variation in the process.

Hello. Good morning, everyone; good evening, everyone. I'm Sukti Chatterjee. Before starting my presentation, I would like to introduce myself in a few words. I'm Sukti Chatterjee from the CTO team of Applied Materials. It is an advanced technology team, and our goal is to develop products adjacent to the semiconductor industry or outside it. For example, we are working for the aerospace industry, the pharmaceutical industry, and industrial coating. This example is taken from the pharmaceutical industry.

My topic today is Gauge R&R of X-ray photoelectron spectroscopy to monitor a coating process. The agenda of my talk: first, the background and problem statement; then the operational definition and the data collection plan; next, the MSA component analysis; and finally, the plan for MSA component improvement.

Let's start with the background. In the pharmaceutical industry, in some therapeutic areas, for example antibiotic drugs, alcohol addiction, or cancer treatment, patients need an injection every day because the drug level in the blood spikes and then, within a few hours, moves outside the therapeutic window. That is why they need a daily injection, which is painful and causes side effects; as a result, patients skip or stop their medications. To solve this problem, our customer needs an approach to tailor the release of the drug.
Our team developed a barrier layer, an aluminum oxide barrier layer, that forms a shell around the pharmaceutical particles. The properties of this barrier layer can control the release of the drug into the blood; it is even possible for the drug to release over a few weeks instead of a few hours. Here we will talk about the composition analysis and the noise analysis of this composition measurement.

Our problem is the measurement of the AlOx coating composition. The customer spec limit for the O/Al ratio in the aluminum oxide film is 1.2-2.3. Our objective is to determine whether the XPS method is adequate to differentiate AlOx process variation; we will determine the gauge R&R measurement error of XPS for AlOx composition analysis.

X-ray photoelectron spectroscopy can quantitatively measure the atomic percentages of the composition; it can measure the aluminum and oxygen percentages. XPS measures the kinetic energy of photoelectrons emitted from the elements and counts the electrons. By counting the electrons, it detects the presence of each element and also the elements it is bonded to. That is why we can get information about aluminum and oxygen in the aluminum oxide film. Most of the sources of error for XPS can contribute to the gauge R&R: reproducibility error comes from the calibration, electron counting can add both repeatability and reproducibility error, and the analysis can add reproducibility error. We will talk about this in more detail on the next slide.

In our operational definition, we describe the different steps of the XPS measurement and how each can introduce error into the gauge R&R. Our objective is to measure the aluminum oxide coating composition. To measure it with XPS, first we need to do a baseline correction, which is automatic, and then we need to do the calibration. For calibration, Applied Materials normally has calibration samples, especially for developed technologies like aluminum oxide. In our case, however, we are coating pharma particles, and our process window is totally different from the Applied Materials core-technology process window for aluminum oxide, because the coating needs to be compatible with the pharma particles. While coating the particles, we also coat a silicon wafer and an API pellet, because XPS cannot measure particles; it needs a planar substrate. Since we don't have a calibration sample, we use the second option for calibration, carbon peak calibration. In the picture on the left-hand side you can see the carbon peak calibration; it has to be done manually, which is why it impacts reproducibility. After calibration, we run an XPS survey or high-resolution scan to get the spectra, in which you can see the oxygen peak and the aluminum peak. Since we do a manual calibration and an automatic baseline correction, these steps can introduce error into both repeatability and reproducibility.
Next we need to do the analysis, which is peak fitting. We fit the peaks, and from the peak areas we calculate the oxygen and aluminum percentages. Since this step is semi-automatic, it can add error to reproducibility. From the XPS measurement we calculate the O/Al ratio, and our customer spec limit is 1.2-2.3.

Next, the MSA cause-and-effect diagram. In the cause-and-effect diagram we did a detailed analysis and found several factors that can affect the gauge R&R. We highlighted the major ones, electron counts, calibration, and analysis, which we discussed on the earlier slides. Now we add another one, sample loading, and how it adds error to the gauge R&R. Sample loading is manual and we are not using full wafers but coupons, so we need to place each coupon; if the location or the angle is slightly different, it can affect the measurement, and this impacts reproducibility. The other major contributors were already discussed on the previous slides. The remaining one is the sample itself, which depends on the process; for this presentation that is out of scope. We will cover these four factors here.

Next, our sample collection plan. We used six samples for the MSA analysis, each with four replicates, which you can see here. We measured those samples at two sites. Since we have four replicates, we measured them sequentially. Could sample degradation be a risk if the samples degrade between replicates? We will come back to this risk in more detail later. Our expected outcomes: we need to find out whether the XPS method is adequate to differentiate process variation, we would like to see that the two sites give similar results, and we would like to see that the sample (part) does not interact with the site.

Now we set up the MSA design. In the MSA design we use substrate and site as crossed factors. This is the site, this is the part, and we have two different substrates, mentioned before: API pellets and silicon wafers. The S numbers are silicon wafers and the A numbers are API pellets. We were not able to use the completely randomized option, so we used the fast-replicate option, in which the replicates of each sample are measured consecutively. That could be affected by a sample degradation problem, which is why later on we compare the first and fourth replicates to check this sampling risk.

For this MSA analysis, we sequentially used several JMP platforms. We examined the data distribution of the MSA samples using descriptive and inferential statistics (Distribution and Fit Y by X). Then we checked the data variability using control charts and one-way ANOVA. Then we analyzed the gauge R&R components using the Variability Chart platform. As an aside, the sketch below shows how the peak-fitting step turns into an O/Al number.
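This is a minimal sketch of the quantification step described above: converting fitted peak areas into atomic percentages and an O/Al ratio using relative sensitivity factors (RSFs). The peak areas and RSF values below are placeholders (RSFs are instrument- and library-specific), not values from this study.

```python
# XPS quantification sketch: atomic fraction of element i = (I_i / S_i) / sum_j (I_j / S_j)
peak_area = {"O 1s": 82000.0, "Al 2p": 10000.0, "C 1s": 9000.0}   # fitted peak areas (hypothetical)
rsf       = {"O 1s": 2.93,    "Al 2p": 0.537,   "C 1s": 1.0}      # relative sensitivity factors (assumed)

corrected = {el: area / rsf[el] for el, area in peak_area.items()}  # sensitivity-corrected intensities
total = sum(corrected.values())
atomic_pct = {el: 100 * v / total for el, v in corrected.items()}

o_al_ratio = corrected["O 1s"] / corrected["Al 2p"]
print({el: round(p, 1) for el, p in atomic_pct.items()})
print(f"O/Al ratio = {o_al_ratio:.2f}   (customer spec: 1.2-2.3)")
```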
Returning to the analysis workflow: we also want to see how process capability relates to the gauge R&R, which we can see in the intraclass correlation (ICC) versus P/T plot. Next we did a root-cause analysis to plan for improving the GRR: we want to identify the major GRR error, using box plots, density ellipses, matched pairs, and line fits from the various JMP platforms.

Let's start with the data distribution. We initially developed the process at the two spec limits, the upper spec limit and the lower spec limit. At the upper spec limit we have two parts with four replicates each, all measured at both sites using, as mentioned, fast replicates; similarly, at the lower spec limit we have four parts, four replicates, and two sites. Since we developed the process at the two ends of the spec range, our distribution is bimodal, completely bimodal. The problem with a bimodal distribution is that it can affect the GRR components: the P/TV ratio, the P/PV ratio, and the misclassification rates. The P/T ratio does not depend on the parts, which is why it is not affected by the sample distribution, and that is why we use the P/T ratio as our figure of merit in the following slides. For misclassification, there are five probabilities; the last three can be strongly affected by the sample distribution, and the first two less so. To minimize this risk, again, we focus on the P/T ratio as the figure of merit. Next time, our plan is to do the MSA analysis with a uniform sample distribution.

Now let's check the variability of the data. Here we used an I-MR (individual and moving range) chart, and we saw that many data points are outside the control limits in the upper chart, and in the lower moving range chart three data points are outside the control limits. These three points are a sudden shift: the process shifts and comes back rather than staying there, which indicates a type II shift and a mixture of common-cause and special-cause variation in the control chart. That is why the control limits here are meaningless; we need to subgroup by the special cause, and only then can we trust the control limits.

Now we want to find the special causes. First we check whether part variation could be a special cause. We did this using one-way ANOVA, and in the one-way ANOVA we can see the variation among the samples. We ran the process near the upper spec limit and near the lower spec limit, so the samples are different, and that is what the one-way ANOVA found: the within variation is very small compared to the part variation, and the analysis of variance table shows the same.
Here our hypothesis is that all parts are the same, but the hypothesis is rejected because the p-value is less than 0.05, telling us the parts are significantly different. That means part variation is a special cause, so we can use it as a candidate for subgrouping. Similarly, we checked whether site variation is a special cause, considering the two-site measurements near the upper spec limit as well as near the lower spec limit, with the hypothesis that the two sites' measurements are similar. For the upper spec limit the p-value is higher than 0.05, so there is no evidence to reject the hypothesis; for the lower spec limit, on the other hand, it is marginally rejected because the p-value is just below 0.05. So site variation is either marginally rejected or not rejected at all, which is why site variation is not a good subgrouping candidate and part variation is the better candidate.

Next, we rebuilt the control chart with the phase option, using the sampled part as the phase. When we do this, the moving range chart captures the change in measurement variation, and the control limits for the bottom and upper charts are calculated accordingly. Now all the measurement points are inside the control limits. The variation within each sample is the repeatability. When we compare site A and site B, site B also shows repeatability, but there is some difference in repeatability between site A and site B; that difference is the reproducibility.

Now we calculate all the gauge R&R components and find the dominant error. First we fit main effects only, without the part-by-site interaction. Here we see the repeatability and reproducibility: repeatability is 22% and reproducibility is 15%. As I mentioned, we use the P/T ratio as the gauge R&R figure of merit because our sample distribution is bimodal, and the P/T ratio is 26%. It passes, being less than 30%, though only marginally, and the major error is the 22% repeatability. One more thing I should mention: we use the P/T ratio, but the P/TV and P/PV ratios are very close to it in our case, because our sample distribution is bimodal and sits at the two ends of the spec range; that is why these figures of merit are so close.

I should also mention the type I error alpha and the type II error beta. For the type I error, all our data points are within the control limits, so the probability that a good part is falsely rejected is very small, less than 6%. The type II error, on the other hand, fails the criterion: it is more than 10%. Why is the type II error higher? Our repeatability is the major issue.
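To make the figures of merit discussed above concrete, here is a small sketch that computes P/T, P/TV, P/PV, ICC, and a Cp estimate from variance components. The variance components, spec limits in this form, and the Cp definition used here are illustrative assumptions, not the study's actual numbers or JMP's exact internals.

```python
# Gauge R&R figures of merit from variance components (all inputs hypothetical).
import numpy as np

usl, lsl = 2.3, 1.2                      # customer spec limits on O/Al
var_repeat = 0.0016                      # within (repeatability) variance
var_repro  = 0.0008                      # site-to-site (reproducibility) variance
var_part   = 0.0350                      # part-to-part variance

var_gauge = var_repeat + var_repro       # measurement-system variance
var_total = var_gauge + var_part

p_t  = 6 * np.sqrt(var_gauge) / (usl - lsl)    # precision-to-tolerance ratio
p_tv = np.sqrt(var_gauge / var_total)          # %GRR: precision / total variation
p_pv = np.sqrt(var_gauge / var_part)           # precision / part variation
icc  = var_part / var_total                    # intraclass correlation
cp   = (usl - lsl) / (6 * np.sqrt(var_part))   # one common Cp definition (assumed here)

print(f"P/T = {p_t:.0%}, P/TV = {p_tv:.0%}, P/PV = {p_pv:.0%}, "
      f"ICC = {icc:.2f}, Cp = {cp:.2f}")
# P/T below ~30% is the usual "marginally acceptable" cut-off referenced in the talk.
```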
Whenever we measure the samples, they are within the spec limits, but it is possible that when a customer measures them, the results could be beyond the spec limits, because the repeatability is high. At this point, since we are developing the product and are in the initial feasibility-check phase, the customer is comfortable with this beta (type II) error, but we have the option: if we can improve the repeatability, this will improve as well. On the other hand, when we include the part-by-site interaction, we see the interaction is 6%, not large, but present. When we did not include the interaction in the main-effects model, that interaction was absorbed into the repeatability; that is why, with the crossed model, the repeatability decreases slightly. Since the interaction is small, our figures of merit change only a little.

From here we know that our dominant error is repeatability. Before discussing repeatability further, I would like to mention process capability and the effect of gauge R&R on it. Here we plot process capability on an ICC versus P/T plot. ICC is the ratio of part variation to total variation, and P/T is six sigma of the gauge divided by USL minus LSL. From this we calculated Cp, and under our current process conditions Cp is 0.93. It is less than one and in the red zone; for good process capability we need Cp between 1.33 and 2, the yellow zone. To improve Cp, what do we need to do? In this direction is the process contribution and in this direction is the measurement contribution. The process (part) variability is very high. For our measurement, P/T is 24%; if we want to improve P/T from 24% to, say, 15% or 10%, we would have to improve by 30-50%, and within that, repeatability is the main issue. That is why we need to improve the repeatability.

Now the question: to improve repeatability, do we need to change our measurement tool? That depends on the ROI and is a question for our management; alternatively, we can address the repeatability in a different way. That is why we want to find the root cause of the high repeatability. Here we use a variability chart with analysis of variance, plotting the variability of all the samples together with the site A and site B measurements. You can see that, for sample A0 for example, this is the measurement repeatability, and it changes; this repeatability changes across all the parts. The repeatability also changes from site to site: here the repeatability is a 0.06 standard deviation, but in this case it is 0.03.
So the repeatability changes from part to part and also from site to site; when it changes from site to site, it is called reproducibility. If we look at the analysis of variance, the site-to-site variation is much smaller than the within variation: the within variation is the repeatability, and the site-to-site variation, the reproducibility, is much smaller. So from here as well we find that repeatability is the bigger problem.

Next, to find the root cause, we plotted all the repeatabilities side by side for both cases, the upper spec limit and the lower spec limit, and in all cases the repeatability is different. Then we looked for a relationship between the site A and site B measurements. Ideally, the site A measurement should equal the site B measurement. In our case we did a linear fit and found both an intercept and a slope: the slope is not one and the intercept is not zero. The slope is less than 0.4 and the intercept is higher than 0.9, and the fitted points are widely scattered, so the R-squared is also poor. We also did a density ellipse, which shows the correlation is less than 0.5. A very good correlation would be around 0.9, and 0.6 would be moderately correlated; in our case it is neither. So we know that the site A measurement does not match the site B measurement: the repeatability problem is affecting the reproducibility.

Now we check more closely how they differ by comparing the site A and site B measurements with a matched-pairs test. Our hypothesis is that site A equals site B, i.e., site A minus site B equals zero. The p-value for this hypothesis is less than 0.05 in both cases, at the upper spec limit and at the lower spec limit, which means the site A and site B measurements are different; and the mean difference with its confidence interval lies above the zero line, which means the site A measurement is always higher than the site B measurement.

From here a question arises: since our MSA design used fast replicates, it could be that the samples degraded, i.e., the O/Al composition changed. So we also ran a matched-pairs test of the first versus the fourth measurement, at both site A and site B, for all six samples, with the hypothesis that the first measurement minus the fourth measurement equals zero. The p-value is higher than 0.05 in both cases, which means sample degradation is not an issue: there is no evidence that the first and fourth measurements differ.
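Here is a minimal sketch of the two matched-pairs comparisons described above (site A vs. site B on the same parts, and first vs. fourth replicate to check degradation), using simulated placeholder data rather than the study's measurements.

```python
# Matched-pairs comparisons with paired t-tests; all data below are made up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
true_ratio = rng.normal(1.7, 0.2, size=24)             # hypothetical per-part O/Al values
site_a = true_ratio + rng.normal(0.05, 0.06, size=24)  # site A assumed to read slightly high
site_b = true_ratio + rng.normal(0.00, 0.03, size=24)

t_site = stats.ttest_rel(site_a, site_b)               # H0: mean(site A - site B) = 0
print(f"site A vs site B: mean diff = {np.mean(site_a - site_b):+.3f}, "
      f"p = {t_site.pvalue:.3f}")

first  = true_ratio + rng.normal(0, 0.04, size=24)     # first replicate (no degradation simulated)
fourth = true_ratio + rng.normal(0, 0.04, size=24)     # fourth replicate
t_rep = stats.ttest_rel(first, fourth)                 # H0: no change between replicates
print(f"first vs fourth replicate: p = {t_rep.pvalue:.3f}")
```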
So the difference between sites is a measurement issue rather than sample degradation. This is summarized in the dashboard table: our gauge R&R figure of merit is 24%, repeatability is 21%, the repeatability changes from part to part and site to site, and site A always has higher repeatability than site B.

For our next steps, we plan discussions with each site as well as with the process team. The sites have a repeatability problem as well as a part-site interaction. We know which errors could be introduced in the measurement, such as background/baseline correction, electron counts, and peak deconvolution; we will discuss these sources of error with the site A facility and find out how to streamline the process to improve our MSA. We also plan to set up a calibration sample, or a reference sample, that we can measure at regular intervals at both sites. With the process team, we will work to make the next MSA data collection uniform: instead of bimodal, we should collect the data uniformly. We also saw part-to-part variation in repeatability; one reason could be a measurement issue, another could be that the process is not uniform. We need to validate our thermal model to check process uniformity.

Finally, I would like to mention the impact of this MSA analysis on my learning. Several JMP platforms and applications helped me separate the signal variation from the noise variation and identify which figure of merit to use to justify our measurement method; in our case, P/T is the best figure of merit. We also learned how misclassification risk relates to the MSA components as well as to the sample distribution. For the root-cause analysis, several JMP applications helped us plan for improving the MSA. Since this is so helpful for this particular program, I would like to introduce data-driven decision making to all the programs I am involved in, to improve project quality, cost, and time, and to promote data-driven decisions using JMP in our advanced technology group, the CTO team, and other projects.

On my final slide, I would like to mention my journey. I started learning JMP at the beginning of the year, and at that time we did A0, A1, and A2; that is my foundation. After that I worked on MSA analysis and SPC. I received my JMP STIPS certificate in May 2023. I am now a JMP instructor at AMAT, I am planning an in-person presentation in October 2023, and I am working toward my Black Belt in 2024. Thank you for listening.
Functional or curved responses frequently occur in industry. Thanks to new features in JMP, we can now model and predict functional responses using key DOE or product design factors with JMP Functional DOE or Curve DOE modeling. A Functional DOE model is purely empirical; a Curve DOE model, however, can incorporate mechanistic or expert knowledge about the functional form of the curved responses. In this presentation, the methods and results of predicting functional responses using Functional DOE and Curve DOE modeling are compared using case studies from the consumer product industry.

Hello, my name is Fangyi Luo and I'm from Procter & Gamble. Today I'm presenting with Chris Gotwalth from JMP. We're going to talk about how to model data from designed experiments when the response is a functional curve. Functional or curve responses occur very often in industry. Thanks to new developments in JMP, we can now model and predict functional responses as a function of key DOE or product design factors using both Functional DOE and Curve DOE modeling. A Functional DOE model is purely empirical, whereas a Curve DOE model can take into account mechanistic or expert knowledge about the functional form of the curve responses. In this presentation, the methods and results of predicting functional responses using Functional DOE and Curve DOE modeling will be compared using case studies from the consumer product industry.

This is the outline of the talk, which has two parts. In the first part, Chris will explain what functional data is, show examples, and build a fundamental understanding of Functional DOE modeling, including functional principal component analysis, as well as Curve DOE modeling. In the second part, I will use two examples from Procter & Gamble to compare the results of Functional DOE and Curve DOE modeling. The first example is modeling viscosity-over-time data from a formulation experiment; the second is modeling absorption-volume-over-time data from a diaper design of experiments. I will finish the talk with a brief summary and conclusions.

Thanks, Fangyi. Now I'm going to give a quick introduction to functional and curve data analysis. First, I want to point out that there is a lot of this kind of data out there, and JMP really has made analyzing curve response data as fast, easy, and accurate as possible. If you haven't heard of functional data analysis before, you have certainly seen it; it's all over the place, and I'll show you some examples to make that clear. For example, here are annual home price indices from 1992-2021 for all 50 US states. Each function has a beginning measurement followed by a sequence of other measurements and then a final measurement: they all have a beginning, a middle, and an end. The functions don't have to share the same start and end points or have measurements at the same times. In a time series analysis, we are interested in using data observed in the past to predict forward into the future.
In a functional data analysis or a curve data analysis, we are generally more interested in explaining the variation internal to the functions than in predicting beyond the range of times we've observed. In product and process improvement in industry, we are often working with non-financial curves. I'm going to show you some examples that our customers have shared with us. Here we see a set of infrared spectra of gasoline samples used to develop an inexpensive tool to measure octane in gasoline. The green curves had high octane and the red ones were low in octane; the height of the left peak turned out to be critical for predicting octane level. Microbial growth curves are another common type of functional data in the biotech industry.

Today, Fangyi will demonstrate two methods in JMP that can be used for analyzing DOEs where the response is a set of measurements. The first method is called Functional DOE analysis and is best for complicated response functions like spectra, when you need the model to learn the curves in the data from scratch. The second is Curve DOE analysis, which is based on nonlinear regression models. When you can use Curve DOE analysis, I've found that you get more accurate results with it; but if you can't get it to work, you can always fall back on Functional DOE analysis, as it is more general.

The critical step in functional data analysis that will be new to most people is called functional principal components analysis, or FPCA for short. This is how we decompose the curves into shape components that describe the typical patterns we see in the curves, along with weights that describe how strongly each individual curve correlates with those shape components. It's a kind of dimension reduction and data compression technique that reduces all the information in the curves to the most compact representation possible.

To illustrate FPCA, take a look at the set of curves in the plot here. What do they have in common? How do they differ from one another? What I see in common is a set of peak shapes, one peak per curve, with the shapes going to zero away from the peak; they also appear to be symmetric around the center of the peak. In terms of differences, I see variation in peak heights, clear horizontal shifts from left to right, and some curves are narrower than others.

In a functional data analysis, the first thing we do is find a smoothing model that converts or approximates the discrete measurements into continuous functions. There is a variety of smoothing models in FDE. I don't have a firm rule as to which is best in general, but here are my observations about the most common ones. Wavelets and splines have different strengths. Wavelets are new in JMP Pro 17, are very fast, and are generally best for complicated functions such as spectra, as long as the X coordinates of the data are on a grid.
On the other hand, there are B-splines and P-splines, which are slower computationally but are better for data with irregularly spaced Xs, and are also often better when there are only a dozen or fewer measurements per function. If the data aren't large, I would try both splines and wavelets and see which gives the best fit by looking at the graphs. The main graphs I use to make decisions about smoothing models are actual-by-predicted plots, and you want the one that hugs the 45-degree line more closely. In this case, I would choose the wavelet model on the right over the spline model on the left, because those points are tighter around the 45-degree line.

Immediately after JMP Pro fits a smoothing model to the data, it decomposes the signals into the dominant characteristic shapes it found in the data. In mathematical language these shapes are called eigenfunctions, but a better and more approachable name is shape components. Here we see that JMP has found that the overall mean function is a peak shape and that there are three shape components that explain 97% of the variation in the data. The first shape component appears to correspond to peak height; I've learned to recognize that the second shape is a type of left-right peak-shift pattern; and the third shape component is something that would control the peak width. Remember that these are shapes learned from the data, not something I gave JMP from outside the data.

What has happened is that the observed spectra have been decomposed into an additive combination of the shape components, with unique weights for each individual curve. Functional PCA is like reverse engineering the recipe of the curves in terms of the shape components: the mean function is what they all have in common, the shape components are the main ingredients, and the weights are the amounts of the ingredients in each individual curve. A Functional DOE analysis is mathematically the same as extracting the scores (weights) and modeling them in Fit Model with the Generalized Regression platform. Fortunately, there is a red-triangle option in the Functional Data Explorer that automates the modeling, linking the DOE models to the shape functions for you and presenting a profiler that connects them, so you can directly see how changing the DOE factors changes the predicted curve or spectrum. There are many potential applications of Functional DOE analysis, some of which Fangyi will present later in this talk.

There is another approach in JMP called Curve DOE modeling. It answers the same kind of question as Functional DOE, but it is based on nonlinear regression rather than splines or wavelets.
What that means is that if you have a good idea of a nonlinear model, like a three-parameter logistic model, and that model fits your data well, you can get models and results that generalize better than a Functional DOE model, because the general shape of the curve doesn't have to be learned from scratch using splines or wavelets. The idea is that if you can make assumptions about your data that reduce the modeling effort needed, your predictions will be more accurate, especially from small data sets. Curve DOE analysis has a very similar workflow to Functional DOE analysis, except that you go through the Fit Curve platform instead of the Functional Data Explorer, and instead of choosing wavelets or splines, you choose a parametric model from the platform. Just as in a Functional DOE analysis, you want to review the actual-by-predicted plot to make sure that your nonlinear model is doing a good job of fitting the data. A Curve DOE analysis is the same as modeling the nonlinear regression parameters extracted from the curves with the Generalized Regression platform, which is exactly what happens in a Functional DOE analysis with the FPCA weights. Fit Curve automates the modeling and visualization just as FDE does. Once you know Functional DOE analysis, it's really not hard at all to learn Curve DOE analysis. Now I'm going to hand it over to Fangyi, who has some nice examples illustrating Functional DOE and Curve DOE.

Thanks, Chris. Next I'm going to talk about two examples from Procter & Gamble. The first example is viscosity-over-time curves collected from a number of historical formulation experiments for the same type of liquid formulation. There are six factors we would like to consider for the modeling; they are all formulation ingredients, and we call them factor one to factor six. The goal of our modeling is to use these formulation factors to predict or optimize the viscosity-over-time curve, so the response is viscosity over time. This slide shows some viscosity-over-time data. For the majority of our formulations, the viscosity increases with time at first and then decreases later on.

Next, we perform Functional DOE analysis on the viscosity-over-time data. Before the Functional DOE analysis, we need to perform functional principal component analysis on the curves, smoothed using different methods. Here we apply functional principal component analysis to the curves first using B-splines and find five functional principal components that cumulatively explain about 100% of the variation in the curves. Each curve is then expressed as the mean function plus a linear combination of the five functional principal components, or eigenfunctions, also called shape functions. We also apply Direct Functional PCA to the data, which finds four functional principal components that cumulatively explain about 100% of the variation across the viscosity-over-time curves. Each curve is then expressed as the mean function plus a linear combination of the four functional principal components.
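As an aside, here is a small numerical sketch of that decomposition: each curve is re-expressed as the mean function plus weighted shape components. The curves are simulated Gaussian-peak-like profiles standing in for the viscosity data, and the decomposition is done with a plain SVD; JMP's Functional Data Explorer adds smoothing and other refinements on top of this idea.

```python
# FPCA-style decomposition sketch: curve_i(t) ~ mean(t) + sum_j score_ij * eigenfunction_j(t)
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 60)
# Hypothetical curves: peak-shaped viscosity profiles with varying height, peak time, and width
curves = np.array([
    a * np.exp(-0.5 * ((t - b) / c) ** 2)
    for a, b, c in zip(rng.uniform(50, 120, 30),
                       rng.uniform(3, 6, 30),
                       rng.uniform(1.0, 2.5, 30))
])

mean_fn = curves.mean(axis=0)                     # the mean function
centered = curves - mean_fn
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

eigenfunctions = Vt                               # shape components (one per row)
scores = U * S                                    # per-curve weights (FPC scores)
explained = S**2 / np.sum(S**2)
print("variation explained by first 3 components:", explained[:3].round(3))

# Reconstruct curve 0 from the mean function plus its first 3 shape components
recon = mean_fn + scores[0, :3] @ eigenfunctions[:3]
print("max reconstruction error for curve 0:", np.abs(recon - curves[0]).max().round(2))
# In a Functional DOE analysis, these scores would then be modeled as functions
# of the DOE factors (in JMP, via Generalized Regression).
```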
This slide compares the functional principal component analysis model fits using the two options: the B-spline option and Direct Functional PCA. With the B-spline option, the model fit is smoother than the fit from Direct Functional PCA. This slide shows the diagnostic plots, observed versus predicted viscosity, from the functional principal component analysis using the two options. With Direct Functional PCA, the points are closer to the 45-degree line than with the B-spline option, indicating that Direct Functional PCA fits the viscosity-over-time data slightly better.

After performing the functional principal component analysis, there is an option in JMP to perform Functional DOE modeling and obtain a Functional DOE profiler. Functional DOE modeling essentially combines the functional principal component analysis with models for the functional principal component scores in terms of the formulation factors. With this profiler we can predict the functional response, in our case the viscosity-over-time curve, for different formulation factor settings: you select a combination of the formulation factors and it predicts the viscosity-over-time curve. This slide shows the diagnostic plots, observed versus predicted viscosity and the residual plots, from the Functional DOE modeling. As you can see, the residuals from the Functional DOE modeling are larger than those from the functional principal component analysis alone.

Our colleagues at Procter & Gamble found that a Gaussian Peak model fits the individual viscosity curves very well. This Gaussian Peak model has three parameters, A, B, and C: A is the peak value of the viscosity-over-time curve, B is the critical point, the time at which viscosity reaches its maximum, and C is the growth rate, the rate of viscosity increase during the initial phase. This is the fit of the viscosity-over-time curves using the Gaussian Peak model, obtained with the Fit Curve feature in JMP. These are the diagnostic plots of the viscosity curve fits using the Gaussian Peak model. The fits are not bad; however, the errors appear larger than the errors from the functional principal component analysis fits.

After fitting the curves with the Gaussian Peak model, there is an option in JMP to perform Curve DOE modeling. A Curve DOE model combines the parametric model for the curves, the Gaussian Peak model, with models for its parameters, expressing each parameter as a function of the formulation factors using generalized regression. Then you get the Curve DOE model, and this is its profiler. Using this profiler you can predict the shape of the curve by specifying a combination of the formulation factors.
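As a side note, here is a rough sketch of fitting that Gaussian Peak form to a single, made-up viscosity curve with scipy. In JMP this is done in the Fit Curve platform, and in a Curve DOE analysis the fitted A, B, and C would then be modeled as functions of the six formulation factors.

```python
# Gaussian Peak model fit to one hypothetical viscosity-over-time curve.
# A = peak viscosity, B = time at which the peak occurs, C = width/rate parameter.
import numpy as np
from scipy.optimize import curve_fit

def gaussian_peak(t, a, b, c):
    return a * np.exp(-0.5 * ((t - b) / c) ** 2)

t = np.array([0.5, 1, 2, 3, 4, 5, 6, 8, 10])
viscosity = np.array([22, 41, 78, 96, 99, 88, 70, 38, 16])   # made-up measurements

(a, b, c), _ = curve_fit(gaussian_peak, t, viscosity, p0=[100, 4, 2])
print(f"A (peak value) = {a:.1f}, B (time of peak) = {b:.2f}, C = {c:.2f}")
```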
Actually, this profiler is somewhat different from the functional DOE profiler we got previously. These are the diagnostic plots from the curve DOE model. As you can see, the curve DOE model does not fit the data well, and it's much worse than the functional DOE model. These are the curve DOE model fits on the original data. As you can see, for a number of formulations, the curve DOE model does not fit the data well. This is a comparison of the profilers from the functional DOE model and the curve DOE model. As you can see, the profilers look quite different. This compares the diagnostic plots from the functional DOE model and the curve DOE model. As you can see, the functional DOE model fits the data much better than the curve DOE model, with a smaller root mean square error. Now I'm going to show you the second example. This example is from a diaper design of experiments with four different products, A, B, C, and D, at three different stations labeled S1, S2, and S3, so it's a factorial design. Diaper absorption volume was measured over time for these four different products at the three different stations. The response is diaper absorption volume over time, and the goal is to understand the differences in diaper absorption curves across different products and stations. These are a few examples of diaper absorption volume over time curves, where the fitted lines are smoothing curves. We performed functional principal component analysis on the diaper absorption volume over time curves, and it was able to find five functional principal components that cumulatively explain almost 100% of the variation among the curves. These are the functional principal component analysis model fits. As you can see, for almost all the curves, the fitted curve plateaued after a certain time point. The functional principal component analysis model fitted the curves really well, as you can see from the diagnostic plots. We performed functional DOE modeling from the functional principal component analysis, and this is the profiler of the functional DOE model. This model allows us to evaluate the shape of the curve for different diaper products at different measuring stations. The product comparison at station two seems to be different from the product comparisons at station one and station three. These are the diagnostic plots of the functional DOE model. Next, we would like to perform curve DOE modeling. Before curve DOE modeling, we would like to find some parametric model that fits the diaper absorption volume over time data well. I found that there's a function in JMP called the Biexponential 4P model. This model is a mixture of two exponential models with four unknown parameters. This model fits all the diaper absorption volume over time curves really well. These are the diagnostic plots of the curve fitting, and you can see that the Biexponential 4P model fits all the curves really well. After fitting the diaper absorption volume over time curves using the Biexponential 4P model, we performed curve DOE modeling using JMP, and this is a profiler of the curve DOE model.
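Before looking at that profiler, it may help to see the Biexponential 4P form in code. The sketch below assumes the common sum-of-two-exponentials parameterization with two scales and two rates, fit to made-up absorption data; JMP's Fit Curve parameterization may use different labels.

```python
import numpy as np
from scipy.optimize import curve_fit

# Biexponential 4P as described in the talk: a mixture of two exponential terms
# with four unknown parameters (two scales, two rates).
def biexponential_4p(t, a1, b1, a2, b2):
    return a1 * np.exp(-b1 * t) + a2 * np.exp(-b2 * t)

# Made-up absorption-volume-over-time data that rises and then plateaus.
t = np.linspace(0, 60, 30)
y = 300 - 280 * np.exp(-0.15 * t) + np.random.default_rng(2).normal(0, 5, t.size)

# A rising, saturating curve corresponds to one near-zero rate and one negative scale.
p0 = [300.0, 0.001, -250.0, 0.1]
params, _ = curve_fit(biexponential_4p, t, y, p0=p0, maxfev=10000)
print("a1, b1, a2, b2 =", params)
```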
Using this profiler, you are able to see the shape of the curve as a function of diaper product as well as measuring station. This is the profiler for product A at station two and then station three. These are the diagnostic plots of the curve DOE model, and you can see that the curve DOE model fits the data well, except that at higher diaper absorption volumes, the residuals are getting larger. These are the curve DOE model fits on the original data. As you can see, for most of the curves, this model fits the data really well. This compares the profilers of the functional DOE model versus the curve DOE model. As you may notice, there's some difference between these two profilers at the later time points. The predicted diaper absorption volume at the later time points tends to plateau in the functional DOE model, but it continues to increase at later time points in the curve DOE model. This compares the diagnostic plots from the functional DOE model versus the curve DOE model using the Biexponential 4P model. As you can see, both of these models fit the data really well, with functional DOE being slightly better, with a slightly smaller root mean square error. Now you have seen the comparison of functional DOE modeling versus curve DOE modeling using two P&G examples, and this is our summary and conclusions. Functional DOE modeling is always a good choice. When the parametric model fits all the curve data well, curve DOE modeling may perform really well. However, if the parametric model does not fit the curve data well, then curve DOE modeling may perform poorly. The functional DOE model is purely empirical. However, the curve DOE model may take into account mechanistic understanding or expert knowledge in the modeling, so it can be a hybrid. It's good to try different methods, like different smoothing methods, before functional principal component analysis. In functional DOE modeling, try the functional DOE model versus the curve DOE model and see which one performs best. This is the end of our presentation. Thank you all for your attention.
Plasma enhanced chemical vapor deposition (PECVD) is extensively used to deposit thin films in semiconductor manufacturing. PECVD processes are gas phase processes typically conducted at low pressures in specially designed deposition chambers. Many process parameters influence the thin film properties in ways that are difficult to characterize.   This presentation shows: How the Generalized Regression platform was used to "rescue" a definitive screening design that had runs that couldn’t be completed. How the original design was augmented. How this complex design space was finally understood well enough to find a new precursor, effectively identifying a thin film that maximized the targeted film property (high mechanical strength) in the fewest number of experiments.   This presentation is predominantly hands-on, using JMP in real time.     Today  we 're  going  to  talk  about  how to  use  JMP  to  do  thin  film  development using  plasma -enhanced chemical  vapor  deposition . This  is  a  bit  different ,  I  think , than  some  of  the  typical  work that 's  done  in  industry , where  it 's  a  continuously  stirred  reactor and  you  can  always  mix  things . Regardless  of  what  happens , you  can  always  measure  outputs . But in  plasma-enhanced chemical  vapor  deposition , it 's  really  discrete  pockets  of  stability that  you  have  to  work  with . Even  though  we  can  set  up a large  parameter  space , there  can  be  spots within  that  parameter  space where  you  may  not  be  able to  strike  the  plasma or  it  could  arc because  the  power  density  is  too  high . Since  we  have  a  large  number of  deposition  parameters , we  need  to  use  a  design  of  experiments to  effectively  explore that  parameter  space . Even  if  we 're  able to  strike  the  plasma , there  are  still  issues with  thin  film  uniformity . We 're  depositing  nanometer  films with  nearly  perfect  uniformity across  a  12 -inch  wafer . Once  we  get  that ,  we  still  have to  hit  the  targeted  film  properties . We 're  going  to  talk  about   how  to  use  PECVD to  develop  new  thin  films from  new  precursors . The  first  thing  we 're  going  to  do is  talk  about  Precursor  1 . From  what  I  was  able  to  read from  the  JMP  tutorials , the  Definitive  Screening  Design is  a  very  effective  way to  screen  a  large  number  of  main  effects in  the  fewest  number  of  experiments . That 's  key  to  the  work  that  we  do. We  want  to  get  the  right  answer in  the  shortest  amount  of  time with  a  data -driven  approach . We  used  a  Definitive  Screening  DOE to  explore  seven  factors  in  26  runs . What  we 'll  do  is  just  open  up that  initial  DOE . This  is  the  setup  we  came for  the  Definitive  Screening  DOE . Here  are  the  seven  different  factors we're  varying  for  the  deposition, and  our  output  is  going to  be  this  parameter  Y, and  we 're  trying  to  maximize  it . If  we  look  at  the  range  of  parameters for  this  type  of  PECVD  processing , this  is  a  very  wide  range of  initial  parameters . Again ,  we 're  trying  to  screen for  main  effects , and  our  outputs  are  ranging  from,  say , 9-34 ,  and  our  baseline  was  21 . We  do  see  an  improvement  there . 
One  of  the  things  that  I  always  like to  do  when  we  do  a  DOE is  include  a  center  point  replicate   or  a  repeat  run to  see  how  reproducible   the  instrument  is , as  well  as  to  make  sure that  the  statistics  we  generate within  the  design  are  valid . These  are  the  two  center -point  runs , and  you  can  see  we  get excellent  reproducibility . The  other  thing  that 's  really  nice  for  us to  do  before  we  get  into  fitting  the  model is  just  to  look  at  the  output  variables and  try  to  identify any  trends  that  we  can  see . Is  there  anything  we  can  identify  quickly that  we  can  attribute  the  main  factors  to ? Here,  there 's  four  points  with  a  Y  value of  greater  than  30  within  the  DOE . If  we  select  those  points , it 's  nice  to  see  if  we  can  see any  trends  associated  with  these  data . One  of  the  fastest  ways  I  found  to  do  that is  to  quickly  do  a  multivariate  analysis , and  we  can  do  this  graphically . What  we 're  going  to  do is  take  all  our  factors and  then  our  output  variable , and  we 're  going  to  generate a  multivariate  analysis . Here  in  this  graph , this  is  our  Y  value . You  can  see  as  we  go  from  10 to  30 , the  four  values that  are  the  highest  are  highlighted , but  the  rows  are  the  various  factors . Here  we  can  see  for  helium , we  have  the  highest  values at  the  high  and  low  splits . For  precursor ,  we  have the  high  and  low  splits . But for  temperature  and  pressure , we  have  the  highest  values at  the  lowest  splits . It 's  really ,  I  think , a  good  indication  initially before  we 've  done  any  model  fitting , that  temperature  and  pressure  could  be important  variables  for  us  to  look  at . If  we  go  back  to  the  table, the  other  nice  thing  about  JMP , it 's  very  powerful  because... Again ,  before  we  do the  definitive  screening , we  can  use  a  predictor  screening to  identify  what  are the  most  important  factors . Again ,  we  use  the  standard  analysis , input  our  factors ,  our  response  is  Y , and  you  can  see what  the  predictor  screening  is  telling  us is ,  yes ,  pressure and  temperature  are  very  important . But one  thing  that  we  didn 't  catch in  that  multivariate  analysis is  the  precursor  flow . These   three  factors ,  pressure , precursor  flow ,  and  temperature appear  to  be  dominant  in  giving  us the  highest  values  of  Y . Now  I  wanted  to  fit  the  model   because  I  think the  real  power  of  the  DOE is  not  the  runs  in  the  table , but  it 's  the  response  surface  model that  you  can  use to  get  predictions  for  improvement as  well  as  directions  to  further  explore . But when  I  went  to  analyze  it , it  wouldn 't  work . It  turns  out  we  were  only  able to  complete  25  of  the  26  runs , and  I  was  not  aware that  the  Definitive  Screening  DOE , the  default  analysis  would  not  work if  you  did  not  complete  all  of  the  runs . At  this  point ,  I  contacted  Jed  at  JMP to  help  me  understand how  I  could  get  some  models   out  of  this  data that  we  carefully  collected over  a  period  of  time . I 'll  turn  it  over  to  Jed . When  Bill  called , like  he  said , when  he  hit  that  script that  saved  to  the  data  table of  Fit  Definitive  Screening , nothing  happens . 
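For readers following along without JMP, a rough analog of that quick predictor screen is a random-forest importance ranking; the sketch below uses hypothetical file and column names and is not the Predictor Screening platform's actual algorithm.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Rough analog of a quick predictor screen: rank the seven deposition factors
# by random-forest importance for the response Y. File and column names here
# are hypothetical placeholders, not the actual DSD table from the talk.
df = pd.read_csv("pecvd_dsd_runs.csv")
factors = ["Pressure", "Temperature", "PrecursorFlow", "HeliumFlow",
           "Spacing", "HFPower", "LFPower"]
X, y = df[factors], df["Y"]

rf = RandomForestRegressor(n_estimators=500, random_state=1).fit(X, y)
ranked = pd.Series(rf.feature_importances_, index=factors).sort_values(ascending=False)
print(ranked)
```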
If  you  look  over  here , the  log  is  saying ,  there  are  runs that  are  not  fold  over  center  point  run , and  it 's  run  17 ,  which  is  obviously   the  run  that  was  missing and  couldn 't  be  completed in  the  experiment . What  Bill  wanted  was  a  way  to  still fit  that  Definitive  Screening  model . We  came  up  with  two  different  approaches and  had  three  models  that  came  about from  those  two  different  approaches . The  first  one  is  related to  the  Definitive  Screening . The  designs  of  these types  of  models,  of  these  experiments, are  always  fold -over  pairs , where  there 's  a  pair  of  opposites . If  we  can  find  that  fold -over  pair , or  the  twin ,  I  guess ,  of  this  row  17 , we  should  be  able  to  exclude  both  rows and  then  fit the  definitive  screening  design . We  just  needed  a  simple  way  to  do  that . What  we  came  up  with  was  basically   to  use  a  couple  of  shortcuts . I 'm  going  to  first  standardize the  attributes  of  these  columns and  change  the  modeling  type  to  ordinal . As  I  do  this ,  you 'll  notice that  my  ability  to  select  has  changed . That  helps  when  I  look  at  a  data  filter . Now  I  have  boxes  rather  than  histograms , so  it  just  makes  it  faster  to  select . What  we  need  to  do  is  find the  opposite  row  of  this . I  have  this  row  17  selected , and  you  see  that  it 's  high ,  high ,  high , low ,  low ,  low ,  and  then I 'm  out  of  memory  space . I 'm  remembering  high , high ,  high ,  low ,  low ,  low . I  need  to  find  the  opposite  of  that , which  is  going  to  be  low , low ,  low ,  high ,  high ,  high . If  I  just  come  over  here  and  start  working my  way  down ,  low ,  low ,  low ,  high , by  the  time  I  get to  just  four  of  the  runs so  just  more  than  half of  the  factors  selected , now  I 'm  down  to  just  one  matching  row . It  just  so  happens  that  the  very  next  run was  the  fold-over  pair  in  this  experiment . We  can  select  both  of  those  runs , exclude  them , and  then  go  back  in into  the  column  properties and  change  that  modeling  type back  to  continuous . Now  when  we  hit that  Definitive  Screening  button , it  works . We  can  run  that  model and  see  that  it 's  predicting  fairly  well . We  can  see  the  profiler , but  we  also  were  really  aware that  one  of  the  runs out  of  26 ,  that 's  almost  4 %. We 're  throwing  away  roughly  4 % of  the  information  by  excluding  this . What  we 'd  really  like  to  do is  not  throw  that  information  away , find  a  way  to  use  that . We  used  the   Model Screening  platform in  JMP  Pro  to  run  a  bunch  of  models and  then  select  the  best . The  two  that  came  out  the  best were  a  Neural  and  Stepwise  model , and  I  can  walk  through  those really  quickly . The  Neural  model  was  with  our  response and  our  factors . Since  this  was  a  DOE , we 're  going  to  do  the  minimum  holdback , and  I 'm  just  going  to  choose a  random  seed  so  this  is  repeatable . The   Model Screening  platform generally  suggests  about  20  boosts . If  I  hit  Go  here , I  get  a  pretty  good   Rsquare  across  this. Maybe  I  might  try  to  tune  this  model by  adding  some  more  parameters , but  when  I  do ,  I  can  see  that  R square is  not  really  changing . I  don 't  think  I  want to  add  more  parameters and  risk  overfitting  the  model . 
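For reference, the fold-over twin that Jed locates with the data filter can also be found programmatically. The sketch below assumes the factors can be coded to -1/0/+1 from their ranges and uses the same hypothetical file and column names as above.

```python
import pandas as pd

# Find the fold-over twin of a given run in a definitive screening design.
# File and column names are hypothetical stand-ins for the 26-run table.
df = pd.read_csv("pecvd_dsd_runs.csv")
factors = ["Pressure", "Temperature", "PrecursorFlow", "HeliumFlow",
           "Spacing", "HFPower", "LFPower"]

# Code each factor to -1/0/+1 based on its range.
center = (df[factors].max() + df[factors].min()) / 2
half_range = (df[factors].max() - df[factors].min()) / 2
coded = (df[factors] - center) / half_range

target = coded.loc[16]                       # JMP's run 17 is label 16 with 0-based indexing
mask = (coded.round(6) == (-target).round(6)).all(axis=1)
mask.loc[16] = False                         # don't report the run itself
print("Fold-over twin(s) of run 17:", df.index[mask].tolist())
```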
That  was  one  extra  way  to  do  it . Then  the  second  model that  showed  up  easily  for  us using  the   Model Screening  platform was  to  do  Stepwise . The  way  we  did  that  was we  put  our  output  here and  then  use  the  shortcut to  do  a  response  surface . That  includes  all  main  factors , all  squared  terms , and  all  two -way  interactions . Then  if  we  change  this  to  Stepwise  here , we  can  hit  the  Run  button  and  Go . Now  JMP  is  going  to  enter and  exit  everything until  it  finds the  model  that  fits  the  best . We  can  go  ahead  and  run  that . Now  we  have  three  models that  we  want  to  compare . What  I 'm  going  to  do  is  I 'm  going to  take  this  first  model  and  save . I 'm  going  to  publish that  prediction  formula to  the  Formula  Depot . I 'm  going  to  give  it  a  really  quick  name , and  we  will  call  this  DSD . Whatever ,  I  can 't  type . We 'll  call  it  DSD ,  and  then  close  it . We 'll  do  this  with  the  Neural  as  well . We  will  publish  that  prediction  formula , give  it  a  name , and  do  the  same  with  this  final  model where  we  will  publish that  prediction  formula . This  last  one  was , we  called  it  Stepwise . Now  I  have  these  three  models and  I  can  compare  them . We  can  run  the  Model  Comparison  platform for  all  three  of  them from  within  the  Formula  Depot. We  can  get  a  rank of  the   Rsquares  of  those  models . We  can  look  at  the  actual   versus  predicted , and  we  can  see  that  they 're all  predicting  about  the  same . We  can  also  look  at  the  predicted  by  row , and  we  can  see  this  one  point from  the  Definitive  Screening  Design is  the  one  that  was  left  out  when  we  fit the  original  Definitive  Screening  Design . It  seems  to  be  important , and  probably  most  importantly , we  can  look  at  a  profiler for  all  of  these  against  each  other . If  I  turn  off  the  desirability ,  we  can  see how  these  models  compare  to  each  other . For  example ,  we  can  look at  that  Definitive  Screening  Design and  see  that  it 's  showing  some  curvature where  the  other  two  models  are  not . Maybe  we  can  look  and  see  over  here that  the  curvature  is  different for  each  of  those . Then  the  question  becomes , which  model  is  best  and  how  do  I  know ? Then  that  brings  us  back  to  Bill . Thanks ,  Jed . Let  me  share  my  screen . Can  you  see  my  screen ? Yes . I 'm  going  to  just ... Jed  saved  all  that  to  the  Formula  Depot, so I 'm  just  going  to  execute  the  script that  will  take  us  to  the  Formula  Depot that  he 's  already  saved . Again ,  we 'll  then  just go  right  to  the  profiler . We 're  going  to  fit  all  three  of  these . Then  we  have  the  profiler . This  is  what  the  real  power ,  I  think , of  the  DOE  is, because  the   Prediction Profiler ,  we  can optimize  and  maximize  the  desirability . The  response  surface  models  will  tell  us what  combination  of  factors  we  need to  get  the  highest  elastic   or  highest  value  of  the  output  parameter . What  was  really  eye -opening  for  me is  if  you  look  at  the  values  that  we  get when  we  do  this  optimization , two  of  the  predictions , the  Neural  net and  the  Definitive  Screening  DOE, are  giving  us  values  of  the  Y  parameter that  are  greater  than  anything we  saw  in  our  initial  data  table . 
We  had  a  maximum  value  of  34 . Let 's  see. I 'm  sorry , I  just  got  to  get  that  screen  back . Typically ,  it 's  very  unusual for  me  to  see  this  with  the  DOE . Typically ,  the  model ,  if  you  maximize  it , is  generally  close  to  what you  see  as  in  the  table . But in  this  case ,  it  looks  like we  really  have  some  low -hanging  fruit . We  needed  to  test this  combination  of  parameters and  really  see  if  that  prediction was  valid  or  not . If  we  go  back  to  our  JMP  journal... I  just  want  to  show  you  what  happened . We  took ,  I  think  this  is  the  prediction from  one  of  the  Neural  network  fits . Again ,  the  highest  value in  the  Definitive  Screening  DOE  was  34 . The  model  prediction  was  42 , but  when  we  actually  ran  it , we  saw  some  artifacts  in  the  film that  were  not  acceptable . The  plasma  itself  was  stable . There  was  no  way  to  see  this until  the  wafer  came  out  of  the  reactor . But you  can  see there 's  a   bullseye  pattern , which  is  due  to  film  non -uniformity . In  this  case ,  it 's  very  thick in  the  middle  and  thin  at  the  edge , which  gives  us  this  bullseye  pattern . Then  if  you  look  carefully , you  can  see  all  these  small  dots over  the  wafer , which  are  actually  the  holes under  the  shower head . The  shower head  has  thousands of  small  holes  where  the  gas  comes  out . In  this  case ,  we  have  a  shower head  pattern  and  a  bullseye . I  think  the  model  is  telling  us what  direction  to  go . But again ,  plasmas  are  challenging  to  use . Even  though  the  model  was  telling  us this  film  should  have the  highest  value  of  Y , the  film  itself  was  unacceptable . Then  we  have  to  rely  on  our  process and  theoretical  knowledge  of  the  process . We  know  that  argon has  a  lower  ionization energy, and  if  we  substitute  argon for  helium  and  the  plasma , we  can  get  a  higher  plasma  density , which  may  help  us overcome  these  challenges . What  we  did  is  switch  to  argon ,   and  you  can  see , although  the  film  is  not  perfect , it 's  much  more  uniform and  certainly  good  enough  for  us to  get  the  physical  properties of  the  film  that  we  can  use . In  this  case ,  we  were  able to  hit   a  Y  value  of  46 , which  again , is  much  greater  than  34 . We 're  certainly  trending in  the  right  direction . What  we  really  wanted  to  do  is , are  there  any  opportunities  for  us to  further  improve  the  film ? Again ,  that 's  where the   Prediction Profiler or  the  response  surface  models are  very  powerful . If  we  just  look  at  the  trends that  we  see  here... I 'm  just  going  to  blow  these  up so  we  can  see  them  a  little  bit  better for  each  of  these  cases . The  data  is  really  telling  us, in  certain  cases ,  there 's  things that  we  really  want  to  investigate . Lower  temperatures look  like  it 's  definitely favoring  the  highest  value  of  Y , pressure  appears  to  be  a  key  parameter , and  low -frequency  power for  this  initial  DOE looks  like  in  two  cases , you  want  to  go to  higher  low -frequency  power . The  Stepwise  is  giving  us  the  opposite . But you  can  see that  this  is  really   a  blueprint for  us  to  do  an  additional  design to  see  how  far  we  can  push  it . 
Can  we  go  to  lower  flow  rates, can  we  go  to  lower  temperatures, can  we  go  to  lower  pressures and  further  improve  the  film  properties ? It 's  really  sequential  learning , and  that  for  me , is  the  real  power  of  the  DOE . We  don 't  really  have  time to  go  through  all  of  that , but  what  I  did  is  put  together a  new  JMP  table  with  the  results from  our  sequential  learning for  this  set  of  experiments . Here  is  the  same  data  we  saw in  the  Definitive  Screening  DOE . Here  are  our  Y  values , and  we 're  ranging  from  9 to  34 . The  different  colors are  the  different  DOE . Here 's  the  next  DOE  that  we  did . What  you  can  see  is  based on   the  trends  in  the  Prediction  Profiler and  the  response  surface  models , we  fixed  the  low -frequency  power at  the  highest  setting  we  could . It  turns  out  physical  limitations for  this  plasma  chemistry  prevented  us from  adding  any  more than  20 %  low -frequency  power . We  also  fixed  the  temperature . We  can 't  operate for  this  chemistry  below  200 . We  know  that  lower  temperature gave  us  the  highest  value , so  we  fixed  these  two and  then  did  a  five -factor  DOE focusing  on  lower  precursor  flows , various  spacings ,  the  higher  powers , and  certainly  the  lower  pressures , which  was  indicated  as  one of  the  most  important  parameters . If  you  look  at  the  Y  values  here , you  can  see  we 're  definitely trending  in  the  right  direction . Now  we 're  going  from  mid  20s  up  to  56 . We 're  certainly  above the  46  we  saw  there . Then  we  did  the  same  learning . Again ,  the   Prediction Profiler  indicated what  parameters  we  should  explore . We  did  another  DOE . In  this  case , we  fixed  different  parameters , but  you  can  see that   the  trend  is  the  same . Now  we 're  hitting  up  to  66 in  terms  of  our  Y  value , and  we  did  one  final  experiment , and  in  this  case  you  can  see basically  the  sum  of  all  the  knowledge that  we  gained . It  turned  out  when  we  switched  to  argon , you  could  add  more  low -frequency  power . You  could  go  from  zero  to  40 %. In  our  final  analysis , this  DOE  showed  that  unlike  the  first  DOE , after  we  finetuned  everything and  switched  gasses , the  low -frequency  power had no  statistical  impact  on  the  Y  value . We  set  that  to  zero . We  found  out  the  lowest  spacing was  the  most  important . Our  sweet  spot  for  pressure  was  2 .3  torr , and  we  did  want  to  operate at  the  lowest  temperature . Really  we  had  a  three -factor  DOE   between  total  power ,  precursor  flow , and  argon  dilution to  really  dial  in  the  films , and  we  could  hit a  maximum  value  of  84 . I  summarized  all  that  in  a  box  plot  here , which  I  think  really  shows the  ability  of  the  DOE and  sequential  learning , where  we  started  out  with  a  seven -factor Definitive  Screening  DOE  with  26  runs and  ended  up  with  a  three -factor I -optimal  design  with  16  runs , but  you  can  see our  continued  improvement . This  was  our  reference  target , so  we  still  have  more  work  to  do . But this  is  a  very  powerful  way  for  us to  screen  seven  factors with  three -level  designs in  a  very  short  period  of  time . I  do  think  it 's  worthwhile just  to  point  out how  efficient  these  new  modern  DOE s  are . 
If  we  looked  at  what  we 're  really  doing , we  have  seven  factors that  we  started  with . All  of  these  are  three -level  designs . For  a  three -level ,  seven -factor  design , that  would  be  over  2 ,100  runs . We  could  run  90 out  of  the  experimental  designs and  achieve  this  increase  in  the  Y  value . I  think  these  modern  designs , the  optimal  designs  combined with  the  Definitive  Screening  DOE s are  a very  powerful  tool  for  us to  get  the  most  value with  the  fewest  number  of  experiments . The  final  thing  I  want  to  touch  on is  when  we  switched to  a  different  precursor . This  is  really   a  different  challenge  we  faced . The  goal  here  was  to  evaluate   different  precursors to  compare  how  it  stacked  up against  the  initial  baseline  film . What  we  tried  to  do  is  use all  of  our  learning  from  those  four  DOEs and  become  even  more  efficient. Instead  of  90  runs, can  we  do  this  in  52  runs ? With  Jed 's  help ,  we  put  together an  eight -factor  A -optimal  design , but  what  we  found  is  that the  chemistry  was  shockingly  different . All  of  the  parameter  space  that we  could  operate  easily  with  Precursor  1 was  not  the  case  here . In  fact ,  I  put  together  a  slide to  show  you how  bad  some  of  these  films  could  look , so  we  could  get  perfect  films . You 'd  be  hard -pressed to  tell  there 's  a  film . This  is  a  nanometer -thick  film , edge  to  edge  on  a  12 -inch  silicon  wafer , perfectly  uniform . Then  we  would  have  films that  look  like  this . Obviously ,  that 's  not a  design  that  we  wanted . But the  challenge  that  we  faced  was , we 're  doing  an  eight -factor  DOE , and  we 're  trying  to  do  this quickly  and  efficiently , and  30 %  of  the  runs  failed . I 'm  looking  at  a  table with  eight  different  factors . How  do  I  pick  out  the  factors that  are  contributing  to  this ? What  we  did  is  we  created  a  table for  our  eight  input  factors and  then  identified  all  of  the  films that  had  delamination ,  arcing , or  other  issues , and  then  created  a  metric , just  a  film  metric,  pass  or  fail . It  turns  out   we  can  fit  this  categorical  variable and  see  if  we  can  get  a  model that  will  help  us  understand what  is  really  causing these  issues  through  all  these  films . The  first  thing  we  can  do  quickly is  again  go to  our  Model Screening platform, Predictor  Screening , and  get  a  handle  on  what  factor ,  if  any , is  really  controlling  film  quality . If  we  look  at  this , it 's  pretty  clear  and  it 's  shocking, because  this  was  not  the  case with  Precursor  1 . The  flow  rate  of  the  precursor was  by  far  and  away the  most  dominant  factor   impacting  film  quality . But what  we  needed to  run  these  experiments is  not  just  knowing  this  factor , but  what  value  can  we  safely  run to  generate  quality  films. That 's  where  we  did  the  Neural  net . Again ,  we 'll  go  into  Predictive  Modeling, Neural  net, we 'll  take  our  factors , and  we 're  going  to  fit  film  quality   as  a  categorical  variable . I 'm  going  to  go  to  boosting  of  20 , as  Jed  mentioned . That 's  typically  what  the   Model Screening comes  up  with , and  we 'll  generate  our  model , and  you  can  see  we  get excellent   Rsquare  values . This  is  a  categorical  model . 
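As a rough outside-of-JMP analog of that categorical screen, a random-forest classifier can rank the factors against the pass/fail film metric; the file and column names below are hypothetical, and this is not the algorithm behind JMP's Predictor Screening or Neural platforms.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# (As an aside, a full three-level factorial in seven factors would be 3**7 = 2187 runs.)
# Hypothetical eight-factor table for Precursor 2 with a Pass/Fail film-quality column.
df = pd.read_csv("precursor2_runs.csv")
factors = ["Pressure", "Temperature", "PrecursorFlow", "ArgonFlow",
           "Spacing", "HFPower", "LFPower", "TotalFlow"]
X, y = df[factors], df["FilmQuality"]        # y holds "Pass" / "Fail"

clf = RandomForestClassifier(n_estimators=500, random_state=1).fit(X, y)
print(pd.Series(clf.feature_importances_, index=factors).sort_values(ascending=False))
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```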
I  believe  that  looking  at  the  ROC  curve provides  insight into  how  well  the  model  fits . If  the  curve  is  along  this  diagonal , it 's  just  basically  a  guess . This  looks  like  a  half  square  wave . These  are  basically  perfect  fits in  the  ROC  curve . Then  the  question  is , how  can  we  utilize  that  data ? The  nice  thing  about  the  Neural  net is  that  you  have  a   Categorical Profiler. I  can  execute  the   Categorical Profiler , and  now  I  set  this  up  where  we know we  want  to  operate  at  lower  pressures . We  know  from  the  previous  work we  want  to  operate  at  lower  spacings . We  want  to  go to  the  lowest  possible  temperature. We 'll  just  set  this  in  the  middle . Then  basically  we  have this  profiler  that  tells  us the  primary  factor  affecting  this is  really  the  argon  flow  rate . If  we  can  keep  our  flow  rate below  400  sccms , we  can  have  a  100 %  success  rate for  the  films that  we 're  trying  to  optimize . With  this ,  we  set  up  a  new  DOE , limiting  the  total  flow  rate  to  400  sccms . We 'll  go  back  to  our  JMP  journal . We  were  able  to  come  up with  a  new  design and  complete  41  of  the  42  runs , and  we 're  still  executing  that  study . But it  just  shows  how  powerful the  Neural  network  is for  a  categorical  variable where  we  can  do  this  in  an  afternoon, where  at  one  o 'clock  we  found  these  films  weren 't  working . Three  hours  later ,  we  had  a  model that  told  us  how  to  set  up  a  new  design and  we  were  executing  that  later  that  day . I  think  that 's  the  material we  wanted  to  cover .
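As a footnote on reading ROC curves, the same kind of plot can be drawn for any classifier that outputs probabilities. The sketch below reuses the hypothetical pass/fail table from the earlier block; in practice you would plot holdout or cross-validated predictions rather than training fits.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve, roc_auc_score

# Same hypothetical pass/fail table as in the earlier sketch.
df = pd.read_csv("precursor2_runs.csv")
factors = ["Pressure", "Temperature", "PrecursorFlow", "ArgonFlow",
           "Spacing", "HFPower", "LFPower", "TotalFlow"]
X, y = df[factors], (df["FilmQuality"] == "Fail").astype(int)

clf = RandomForestClassifier(n_estimators=500, random_state=1).fit(X, y)
p_fail = clf.predict_proba(X)[:, 1]          # training-set probabilities, for illustration only

# A curve hugging the top-left corner (AUC near 1) is the "half square wave" shape;
# the diagonal reference line is no better than guessing.
fpr, tpr, _ = roc_curve(y, p_fail)
plt.plot(fpr, tpr, label=f"model, AUC = {roc_auc_score(y, p_fail):.2f}")
plt.plot([0, 1], [0, 1], "--", label="chance")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```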
JMP has long been well-suited for producing production quality tables and graphs. Users have many options, such as taking screen shots or saving JMP reports in specific formats or even using JSL to place pieces of output reports in specific MS Word or Excel files. We have had a strong demand in recent customer engagements to go beyond providing solid analytical solutions accompanied by the excellent JMP tabular and graphical output as the teams look to operationalize a solution.   Operationalization often means deploying results to nonstatistical audiences using templates or special formats tailored more toward general consumption. We demonstrate a method integrating streamlined Python code generated with AI apps to effectively create output documents with advanced MS Word, which is not possible with JSL alone and doesn't require difficult Visual Basic code. We also show several other deployment tips and tricks that don't require Python.       Hello,  Team  Discovery.   I'm  Jim  Wisnowski, along  with  Andrew  Karl  from  Adsurgo, and  we're  here  to  talk a  little  bit  about  operationalization and  how  you  can  effectively  employ some  JMP  capabilities  to  do  so. The  genesis  of  this  presentation  came from  engagement  with  a  customer  last  year around  this  time  who  said, "Our  company  just  loves  JMP as  a  sandbox  tool and  maybe  in  the  lab, but  it  really  doesn't  do  anything for  operationalization  for  us." At  which  point,  it  struck  me  as, "I'm  not  really  sure  I  understand what  the  word  operationalization nor  really  know how  to  say  it  necessarily." S omehow  there's  a  disconnect because  all  of  our  projects, we  seem  like  we  deliver  a  product, and  it  is  finished. What  we  did  is  we  figured  out, "What  exactly  is  operationalization and  then  how  can  we  do some  neat  things  in  JMP  with  it?" What  we're  going  to  do is  I'm  going  to  talk  about  what  it  is and  then  give  you  some  quick  wins in  the  form  of  the  depot. Then  Andrew,  as  always,  will  do  the  work where  it  takes  the  high  intellect, and  he'll  also  show  you the  beautiful  part  of  operationalization. Key  here  is  we  all  know  all  models are  wrong  and  some  are  useful, but  the  other  side  of  that is  your  boss  or  your  boss's  boss, she  is  thinking  all  models  are  useless, whether  she  says  it  explicit  or  not. They're  useless  unless  somehow   they're  supporting  a  decision  of  mine or  we  get  them into  the  production  environment. We  need  to  make  sure  that even  though  we  have  a  good  model, that  it  can  go  to  the  next  step and  be  implemented. By  the  way,  I  do  want  to  show  George  Box over  here  on  the  right, and  then  Stu  Hunter  on  the  left, who  just  celebrated  his  100th  birthday on  the  3rd  of  June. By  definition,  what  is  operationalization? When  we  think  of  things, we  can  think  of  ideas such  as  the  leadership  saying  the  usual, What  is  the  problem? What  do  I  do  about  it? How  does  it  impact  me? And  then  the  now  what? What  is  the  solution? The  solution  isn't  necessarily all  the  things  that  we  do in  the  data  analytics,  data  science, world of  the  discovery, data  prep,  and  all  that. It  really  lies in  the  operationalization  piece for  that  senior leader  to  understand how  you're  going  to  solve  the  problem. 
In  other  words,  it's  really  how  do  we  get from  that  lab  environment to  the  production  line where  we  have  a  solution that's  going  to  be  useful  to  us. As  we  do  that,  let's  not  forget SAS's  recommendation  here that  we  better  make  sure  that we  have  a  good  disciplined  approach as  well  as  automated  in  that  world. Next  up,  we  can  dig  a  little  bit  deeper into  what  operationalization  is on  a  micro  level  kind of thing. I  asked  Chat GPT-4,  I  said, "Can  you  give  me  an  image of  what  operationalization looks  like  in  data  science?" This  is  essentially  what  they  did. I  had  to  do  the  graphics, but  they  gave  me  all  the  words. If  we  think  about  our  usual  data, do  our  data  development, and  then  we  figure  out  what  model  we  like, and  then  we  have  to  figure  out how  we're  going  to  employ  or  deploy  that, what  language? Is  it  going  to  be JavaScript,  C,  or  Python? Then  we  do  the  deployment, and  then  we  do  maybe perhaps  an  API  integration. Good  news  is  JMP  has  a  lot of  tools  for  us  to  do  that. We're  not  left  in  just  that  lab environment  as  suggested. Then  on  the  bottom  here, we  got  the  idea  that once  we  have  that  model  out  there, it's  not  a  launch  and  lead  kind of  thing. We  have  to  babysit  it and  perhaps  update  hyperparameters or  add  new  data and  see  if  it's  still  valid. Then  we  have  this  idea  here that  you  know  what? Not only are  our  users  liking  it, they  want  to  add  more  capabilities, so  we  start  scaling  up. We  have  to  make  sure  that  we  continue our  good  configuration  management and  data  compliance  and  documentation, ultimately  resulting  in  business  value. The  bottom  line  is  how  do  I  go from  this  lab  and  sandbox  environment to  having  business  value? That's  what  we're  looking  for in  our  operationalization. Forbes  gives  the  five  steps  here. Important  for  us  is  to  think  about, first,  you  have  to  have  a  good  story so  that  management  believes that  you  have  a  credible  way to  approach  this  and  solve  the  problem. Then  the  last  part  here  is, once  you  do  have  a  model, and  deployed, and  you  have it  operational ized make  sure  that  you  have  some  metrics  to  make  sure that  it  is  in  fact  performing. But  this  is  like  that  last  mile  idea is  we  take  all  of  this  work that  we  do  to  create  the  models, but  getting  it to  that  operationalization  piece is  the  tough  part. In  fact, we  can  see  that  the  data  out  there doesn't  suggest that  we're  doing  all  that  great. Overall,  we're  maybe  even  fewer than  half  of  these  models  make  it. Then  if  they  do,  it  takes  upwards of  three  months  or  so  to  do  so. DevOps,  we're  all  familiar  with in  terms  of  a  good  disciplined  approach for  software  development. When  we  take  that  step into  our  model  deployment  world, we'll  call  it  ModelOps, where  we  want  to  have  some  a  culture or  process  and  technologies to  effectively  employ  them. If  we  look  at  these  three  circles  here, it's  really  this  intersection between  two  and  three that  we're  focused  on to  make  sure  that  the  deployment   is influencing  those  business  decisions. I'd  like  to  go  and  do a  demonstration  here  in  JMP. 
Before  I  do  so,  I  do  want  to  point  out two  fantastic  presentations from  previous  discoveries that  do  talk  about  this  idea of  deployment  and  operationalization by  Nascif  and  Dan  Valente in  2016  as  well  as  2017. You  can  see  over  here, they  have  this  whole  idea that  if  you  give  a  man  a  fish, he  eats  for  a  day, and  if  you  teach   him to fish, he'll  eat  for  a  life, that's  this  operationalization  piece, which  they  also  call the  dark  side  of  analytics. That's  what  we're  going  to  get  into. M eijian  I  also  wrote  a  decent  paper as  well  on  how  you  can  do  that. But  for  us,  what  I  want  to  show  you is  using  the  Formula  Depot, and  I  got  a  little  plus  there  because it's  going  to  go  a  little  bit  beyond just  using  the  Formula  Depot because  that  is  not  enough. We'll  use  our  well-traveled Boston  housing  data. We'll  look  at  what's  the  price  of  it as  a  function  of  rooms,  distance, and  Lstat, which  is  an  economic  indicator. We'll  create  a  main  effects  model as  well  as  a  Bootstrap. Then  we'll  look  at  the  profiler, and  I'll  show  you  a  quick  trick that  could  be  helpful  for  you. Then  we'll  look  at  how  do  I  convert  this to  an  operational  solution and  being  able  to  deploy  it in  a  Python  environment. Certainly,  this  is  a  very  small  data  set, but  we  could  easily  have  done  this plugging  into  some  data  source and  using  query  builder and  things  like  that. But  just  want  to  show  you  some  quick  wins that  you  can  go  to  the  next  step. Because  often  we  hear  that  it's  great that  you  do  all  this  work, but  the  actual  implementation has  to  be  on  this  server, and  this  server can  only  use  JavaScript  or  C++, whatever  it  happens  to  be, Python,  maybe. How  can  we  take  our  good  work and  transport  it  into  use  in  operation? I'm  going  to  transition  over  to  JMP. Here's  a  journal  that  we  have  for  you, and  it  goes  through a  few  of  our  different  options and  what  we're  doing  here. But  here's  Boston  Housing that  we  know  and  love. Here  is  my  Least  Squares  model. What  you  do  in  the  Least  Squares  model, or  any  model  for  that  matter, is  under  Save  Columns, you're  going  to  publish that  prediction  formula. Then  the  Formula  Depot  comes  up  for  you. Let's  go  ahead  and  do the   Bootstrap Forest  as  well. Now we  have  the   Bootstrap Forest, which  we  can  also,  under  Save  Columns, publish  that  prediction  formula. If  we  come  back  up  here and  we  check  out that  Formula  Depot  report, we  can  see  that  it  is  something that  we  are  familiar  with  in  this  fashion. That  if  I  come  down  here off  of  the  red  triangle next  to  Formula  Depot, I  can  look  at  the  profiler . I  want  to  look  at  the  profiler of  both  these  models. I  do  OK, and  there  it  is. The  main  effects  model,  of  course, doesn't  have  any  interaction, but  maybe  you  want to  make  the  point  somehow that  when  you  have  very  short  distance, you have a real steep economic  impact kind of  a thing. What  we  could  do  is  we  could  show  this dynamically  in  PowerPoint, because  often, that  is  the  endpoint  of  some  of  our  work, and  an  analysis is  a  PowerPoint  presentation that  we'd  like  to  have some  sort of a  dynamic  display. 
There  are  certainly many  screen  capture  capabilities that  you're  probably  all  familiar  with. But  if  we  just  go  back  to  PowerPoint, we're  able  to  do  this  inherently in  the  software,  and  it's  pretty  quick. Here  is  a  blank  slide. What  I  want  to  do is  I  want  to  show  that  dynamic  display. I  just  come  under  Insert, and  I'll  do this  screen  recording  right  here. What  I'll  do  is  I  will  get  out  of  that and  I'll  come  back  to  here, and  then  I'll  come  back  to  JMP, go  to  PowerPoint. Now I do  the  screen  recording, and  I  want  to  be  at  the  right  spot if  it's  a  problem. I'll  just  go  ahead and  take  a  snapshot  of  what  I  want. I  will  go  ahead  and  hit  the  Record  button, and  it'll  give  me the  3-second  countdown. Then  now  I  can  narrate  and  say that  we  don't  see  any  interaction in  our  linear  your  model, but  you  can  see  a  Boosted Tree   does have  quite  a  bit  of  interaction. You're  happy  with  that. You  go  ahead  and  stop  the  recording. Then  now  you  have  this  embedded in  your  PowerPoint  that  easily. You  can  go  ahead  and  run  it  here, and  you  can  hear  me  in  the  background that  I  was  explaining  it, so  the  audio  is  in  there. Then  clearly, as  you're  in  presentation  mode, it  will  run  automatically  as  well. Now back  to  the  task  at  hand is  what  we  want  to  do is  deploy  this  solution in  a  model  that  is  Python. What  we  can  do under  the  Formula  Depot  options  here is  we  can  go  ahead and  generate  the  Python  code, and  we'll  do  it  for  the  sake of  argument  for  both  of  these. We  can  see  that  the  Boosted  Forest  here has  Python  code that  consists  of  55,000  lines  of  code. Good  to  know. But  we'll  put  that  aside  for  now. What  we're  going to  ultimately  decide  on,  we'll  say, is  t his  main  effects  model only  is  what  we're  going  to  use that  we  want  to  deploy  to  our  system. I'll  go  ahead  and  hit... By  the  way,  before  I  do  that, there  is  this  one  particular line  of  code  here that  says  import  JMP  score. That  is  a  Python  file  that  is  inherent, and  all  of  you  have  it on  your  install  directory, and  you're  going  to  need  that  to  use  it. What  it  is, it's  really  just  a  helper  menu. It  tells  you here's  how  you  do  vectorization, here's  how  you  do  power and  things  like  that, but  important  that  it's  there. I'll  go  ahead  and  do  File,  Save. What  that  creates  is  this  Least  Squares, and  then  we'll  call  it BH  for  Boston  House. Now I've  got  that  Python  code. What  would  be  nice  is  if  I  could  just  go to  Spyder  off  of  Anaconda and  just  run  it and  then  score  all  my  new  data. The  problem  is  it's  not  that  easy. It  doesn't  run  you  yet. It  takes  you  a  good  way  there, but  you  still  have  to  put some  wrappers  and  things  around  it. You  have  to  import your  Excel  file  or  whatever  it  is. I'm  not  a  Python  coder. In  fact,  people  say,  "What  do  you  do?" "I'm  a  data  scientist." "Oh,  you  must  be  good  at  Python." "No,  I've  never  actually  used  it." I'm  an  impostor  in  that  fashion. But  maybe  there's  a  way  that  we  could  use this  latest  technology,  ChatGPT-4, and  have  it  create a  Jupyter  Notebook  for  me. If  I  come  under  my  ChatGPT- 4  here, I  have  this  question  I  ask  it. 
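What such a wrapper has to do is roughly the following; this is only a sketch with placeholder module and function names, since the real names come from the .py file the Formula Depot writes out (and the jmp_score.py helper from the JMP install directory must be importable).

```python
# Minimal scoring wrapper around a Formula Depot export. The module name and
# the scoring call below are placeholders; use whatever the generated .py file
# actually defines, and keep jmp_score.py from the JMP install on the path.
import pandas as pd
import least_squares_bh as model      # hypothetical name for the exported scoring module

newdata = pd.read_excel("BostonHousing_new.xlsx")   # rooms, distance, lstat columns
outdata = pd.DataFrame(index=newdata.index)

model.score(newdata, outdata)                       # placeholder call; see the generated file
outdata.to_excel("BostonHousing_scored.xlsx", index=False)
print(outdata.head())
```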
"I  have  Python  code  generated from  a  model  in  the  JMP  Formula  Depot. Go  ahead  and  create a  Jupyter  Notebook  for  me that's  going  to  import  data  from  Excel." I  say,  "Can  it  do  it?" And  it  says,  "Absolutely." Then  what  it  does is  it  gives  me  the  code  right  here. I  copy  the  code, and  I  put  it  in  my  instantiation   of Jupyter from  Anaconda,  all  three. Then  I  run  it,  and  lo  and  behold, it  doesn't  work. Nothing  works  for  me  the  first  time. But  I  say,  "It  didn't  work, and  here  was  the  error  message  I  got." It  says,  "Oh,  well,  try  this." Then  I  tried  that,  and  then  it  worked. What  did  it  work? If  I  come  back  over  here, this  is  my  Jupyter  Notebook that  was  given  to  me  by   ChatGPT-4. Again,  I  know  nothing  about  Python, but I do  know  that  it  gave  me  these  lines. I  just  go  ahead  and  say, I'm  going  to  import  Pandas because  I  need  that to  get  my  Excel  file  in. Then  here  is  that  Fit  Least  Squares. That's  what  I  got  from  the  Formula  Depot. It  does  that. I'm  running  each  one  of  these,  by  the  way. Now it  says  go  ahead  and  import. I'm  going  to  import  an  Excel  file that  has  what  I  want  to  score, and  that's  going  to  be under  Boston  Housing  data. It's  new  data  that  I  want  to  score. Then  here's  this  outdata  thing that  it  told  me  was  my  error. I  said,  " I'll  do  that." Then  this  says,  "Hey,  just  let  me  know that  I  brought  in  some  data,"  and  it  does. Then  now  I'm  going to  go  ahead  and  score  it. I  go  ahead  and  score  it. Then  did  it  score? Sure  enough,  it  did. There's  the  first  five  values  of  it. Then  I  can  go  ahead  and  save  that, and  we'll  just  call  this  For  Demo. Maybe  I'll  even  add  100  in  there. Since  200  is  100, I'll  go  ahead  and  put  that  out. Then  I'll  say,  "W here  does  that  live?" Maybe  I'll  see  it  right  here. Here  it  is. Here's  the  scored  data  Excel  file. There  are  all  800  or  so predicted  values  from  that  linear  model. This  is  easy  as  that. Next  up, what  we  want  to  do  is  set  Andrew  up to  why  everything is  beautiful  in  this  world. Coming  back  to  PowerPoint  here, the  scenario  is  this. It's  that  we  were  working with  a  group  of  doctors  across  the  US, and  they  wanted  reports for  every  one  of  them how  productive  they  were  in  2022. They  thought  that  perhaps they  could  be  more  productive by  changing  some  of  the  codes   that they use for  certain  procedures. They  gave  us  millions  of  rows  of  data, and  we  came  up with  exactly  what  they  asked  for. We  created  a  template of  10  pages  or  so  of  the  report with  nice  JMP  graphics  in  there, and  it  was  virtually  perfect, except  for  one  thing. The  one  thing  is  that  this  data  table at  the  very  end  that  gave  the  results, we  couldn't  get  it  sized  properly or  put  the  borders  on  it. It's  as  simple  as  just  selecting  it and  then  right-clicking  and  do  AutoFit and  then  hitting the  border  symbol  next  to  it. That's  what  I  told  Dr.  Jay  right  here. This  is  for  Dr.  No. You  can  see  his  annual  report. Essentially, Andrew  swam  the  English  channel and  gave  them the  99.8%  solution  to  their  problem, but  they  weren't  quite  happy. It  wasn't  until  we  went  and  took the  step  to  make  this  automated. 
Again,  this  is  a  two-second  process, but  because  it  took  that by  100s  of  things, they  weren't  happy. But  then  we  ended  up  fixing  that, and  that's  when  the  customer  said, "This  is  absolutely  beautiful," hence  we  have beautiful  operationalization. With  that, I'm  going  to  turn  it  over  to  Andrew to  let  you  in  on  a  few  secrets of  how  you  can  get  some massively  increased  productivity. Thanks,  Jim. I'm  Andrew, and  I'm  going  to  show  you how  we  put  together  this  report. The  assumption  is  we  have  some  suppliers, and  here  are  our  suppliers. We've  got  four  different  suppliers, and  we've  got  some  metrics  that  we  track in  this  data  set that  we  update  frequently. We  want  to  be  able to  report  this  to  the  suppliers so  we  can  track  how  they're  doing and  have  a  record of  long-term  performance. What  we'd  like  to  do is  to  get  something  like  this is we  have  this  template  we  like where  we  have  our  header  up  top,  our  logo. We  have  in  bold, we've  got  the  company  name. Then  each  of  these,  we  substitute  in these  calculated  values  from  the  data  set. The  standard  JMP  functions  of, we  export  a  report  to  Word, we  get  everything, all  the  outline  boxes  are  open, but  it  doesn't  have  our  template  function, and we have to  fill  in  the  blanks. We  can  manually  copy  out  the  pictures, right-click  those  little  gray  triangles, and  say,  edit  copy  picture and  get  a  nice  version of  the  pictures  out. But  it's  still  a  manual  process that  has  to  be  done. This  little  application  here, I've  got  four  suppliers. What  if  you  have  400, and  they're  doing  that  once  a  month? That  becomes  unwieldy. How  can  we  do  this? Not  natively  within  JMP  or  JSL. You  can  get  close  sometimes to  being  on  your  application, but  a  more  flexible  example is  to  take  something  like  this where  we  have  a  template  file, we  write  the  report  once the  way  we  want  it. Every  time  we  have  something we  want  substituted  in  by  JMP, we  have  these  double  brackets, and  we're  going to  put  a  little  keyword  in  there. In  this  case,  team_p   is going  to  be  the  team  name. Then  down  here,  I've  got  mill_time_ p with  an  underscore, and  now, I've  got  that  bold  and  highlighted. We  put  whatever  format  we  want, and  anything  we  want  substituted  in, we  just  put  in  brackets. It's  a  natural  solution, so  it'd  be  great  if  we  get  this  to  work. Then  also,  in  addition  to  text, we  can  also  do  that  with  pictures. We've  got  the   q_pic11. If  we  want  a  JSL  line  of  box  equivalent, then  we  can  get this  table  structure  within  Word, and  we  just  put  in  our  keywords where  we  want  to  substitute  things  in. Also,  we're  going  to  have  a  table at  the  end  that  doesn't  have  a  key, that  doesn't  have  a  holder  here. I'll  get  to  that  in  a  second. Come  back  to  my  journal. At  the  end,  what  we  want  it  to  look  like, the  intended  document  result, is  we  also  have  this  table  at  the  end that  goes  through  Tabulate. We  have  this  nice  color  gradient we  can  put  it  into  JMP, and  we  want  to  get  this  into  Word. But  also,  as  Jim  mentioned, we  want  to  AutoF it  this  table and  we  want  it  to  look  nice, because  a  lot  of  times, the  tables  don't  fit  automatically. 
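One way to do the double-bracket substitution described here is the docxtpl package, which happens to use the same {{ keyword }} convention; the sketch below, with hypothetical file names, is only an assumed approach, and the script shared with this talk remains the authoritative version.

```python
from docxtpl import DocxTemplate, InlineImage
from docx.shared import Inches

# Hypothetical example: fill a Word template whose placeholders look like
# {{ team_p }}, {{ mill_time_p }}, and {{ q_pic11 }}.
doc = DocxTemplate("supplier_report_template.docx")
context = {
    "team_p": "Supplier A",
    "mill_time_p": "12.4",                              # value computed in JMP, passed to Python
    "q_pic11": InlineImage(doc, "output/q12_graph.png", width=Inches(3)),
}
doc.render(context)
doc.save("output/supplier_A_report.docx")
```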
We  can  go  through  all  the  work and  create  the  tables, but  if  we  don't  make  them  fit, then  we're  going  to  have  a  lot  of  work ahead  of  us  to  go  through and  do  that  manually. It's  not  something we  can  program  natively  within  JMP. What  can  we  do  is  we  will  have  a  script that,  and  this  is  shared on  the  Discovery  website, where  we  can  open  up  the... We  have  a  reference  for  the  data  table, the  container  data,  and  this  gets  updated. Every  time  we  run  this to  generate  the  reports, it  pulls  in  the  new  data. We  have  the  output  path, which  is  a  folder  where all  the  temporary  files  get  written  to that  we  normally  clean  up plus  the  output  reports. Then  also  template  file,  that  Word  file that  contains  those  keywords. A ll  the  rest  of  the  script  is  going to  be  going  down  and  calculating the  individual  values that  gets  substituted  in. A t  the  end,  we  have  this  Python  script that  does  the  actual  substitution and  then  also, auto  fits  and  pulls  in  this  table. If  you're  saying, "I  don't  know  Python, I  don't  know  how  applicable this  is  going  to  be  to  me," we  don't  know  Python  either, but  we  got  this  to  work  pretty  quick because  GPT-4 wrote  the  Python  code  for  us. I'm  going  to  show  you an  example  of  how  we  did  that. What  this  script  will  do is  we'll  write  the  temporary  files to  the  output  folder. For  example,  here's  our  q12  graph, and  the  temporary  files  get  written  here. The  images  get  written to  the  output  folder. Then  with  the  static  code  in  Python, it  knows  to  pull  these  files  in. The  individual  calculations will  be  run  within  JMP, and  then  saved  within  the  script, and  then  passed  along to  Python  at  the  end. Here  we  run  all  of  our  JMP  code. We  load  any  of  our  values  into  variables. An  example  of  that  is, here  we  have  a  tabulate  function. We're  going  to  make  into  data  table. From  that  data  table,  we're  going  to  take the  mean  mill  time  column and  take  the  first  entry  in  that  column, and  then  that's  going to  become  our  mill_ time_ p  variable. That  is  what  will  get  sent  down in  the  Python  code. When  we  initiate  Python, we  tell  it  whatever  variables  we  wanted to  send  over  from  JMP  to  Python. Here's  that  mill_ time_ p. That  will  hold  any of  the  mill time  calculations, and  that  is  what  gets  substituted  in to  the  mill_time_ p  underscore  area within  the  template. I'm  going  to  go  back  to  my  template  file. Here's  my  mill_time_ p  area. That's  what  gets  substituted  in. The  intermediate  output from  the  Python  code  for  doing  this is  the  temp  report. We  can  see  these  values get  substituted  in. The  graphs  get  placed  in. We  get  our  nice  grid  of  graphs. At  the  end,  we  don't  have  our  table  yet. The  reason  we  don't  have  our  table  yet is  because  we  like  the  way  that if  we  move  a  table  from  JMP using  the  get  as  a  report  to  a  journal and  then  export  that  journal  to  Word with  Save  MS  Word, we  like  the  way it  keeps  the  table  structure, but  we  still  need  the  AutoF it. What  we  do  is,  in  addition  to  the  report that  gets  written  out  from  the  template, is  we  also  write  out this  other  temporary  table  file. We  get  the  table  import. 
Here  it  is,  and  what  we  need  to  do when  we  want  to  automate is  this  AutoF it t o  Window, and  then  also  reducing the  font  size  where  it  actually  fits. What  we  need  to  do  is, after  the  Python  code that  substitutes  out  into  the  template, what  we  found  is  we  have  to  convert that  DOC  file  created  by  JMP, convert  it  to  a  DOC  X  file, and  then  we  have  Python  code that  will  open  up  that  DOC  X  file, take  the  first  table, it  will  apply  AutoF it, change  the  font  size  to  7, and  then  it  will  append  it to  the  original  report  file. How  did  we  know  to  create  this? Again,  you  can  copy  what  we've  done. We've  got  this  file  available. But  how  can  you reproduce  this  for  your  own and  create  your  own  Python  to  do  this? I recorded  a  video  of  going  through on  GPT-4,  how  I  did  this, and  I'm  going  to  show that  now  and  narrate  that. The  prompt  I  give  is  that  I've  got a  Word  document  that  I  want  to  use. I'm  so  bad  at  Python. I'm  going  to  go  ahead and  give  GPT-4  my  actual  path  here because  I  don't  know how  to  substitute  out  the  backslashes, and  the  spaces,  and  everything. I  say,  "G o  to  this  path,  open  it  up, take  the  first  table  you  see  in  here, and  then  give  me  back  that  table  AutoFit." It's  going  to  tell  us,  "You  need to  install  this  other  package." I've  just  got  a  base Ana conda  installation. You  can  ask  it  for  instructions  and  help. You  make  sure  you  install  that  package and  then  you'll  be  able  to  run  this  code. Whenever  it  gives  me  this  thing, the  first  thing  I  noticed  is  it  tells  me, "Hey,  I'm  going  to  overwrite your  report  file,  be  careful." I  say,  "No,  I  don't  want  that." It's  interactive  the  way it  gives  you  the  code. I  say,  "F ix  this  code. I  want  to  write  to  a  different  file in  the  same  directory." It's  going  to  modify  the  code to  give  me  a  different  output  file in  the  same  directory. What  I'm  going  to  do is  I  just  copy-paste  this  over  to  Spyder, which  is  basically the  JSL  editor  for  Python. It's  the  Python  editor. I  hit  Run,  I  hit  the  green  arrow, and  I  get  this  note about,  "There's  an  error." I  don't  know  what  the  error  means. I  don't  want  to  spend  time on  Stack  Exchange  or  Stack  Overflow or  anything  looking  that  up. I  paste  the  error  back  to  Python, and  it's  nice  enough  that  it  apologizes and  explains  what  it  did  wrong. I'm  not  that  worried about  what  it  did  wrong. I  just  want  something  that  works and  gives  me  my  report. I'm  going  to  copy- paste the  code  it  gives  me and  then  go  back  to  the  Spyder, run  that. I  get  one  more  error. It  says,  "Table  is  not  subscriptable." Not  really  sure  what  that  means. I  tell  GPT-4  about  that. It  apologizes  again. It  thought  that  the  tables were  zero- based  index. It  turns  out  it's  one- based. It  fixes  that. I'm  going  to  copy  this  code  over. This  time,  it  runs  without  error. I  go  to  the  folder  I  specified  to  it, and  here's  the  modified  folder. Now you  can  see AutoF it's  been  applied  to  this  table. We  just  made  a  Python  code without  really  knowing  Python. You  don't  have  to  stop  there. If  you  have  any  version  of  modification that  you  want  to  give  to  it, you  can  change  the  font  size. 
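For readers who would rather see the finished table step than the chat transcript, it looks roughly like this in python-docx; the file names are hypothetical, the append uses a low-level element copy, and the script posted with the talk is the authoritative version.

```python
import copy
from docx import Document
from docx.shared import Pt

# Open the temporary file that holds the exported table, take its first table,
# turn on autofit, shrink the font, and append the table to the main report.
table_doc = Document("output/temp_table.docx")
report = Document("output/temp_report.docx")

table = table_doc.tables[0]
table.autofit = True
for row in table.rows:
    for cell in row.cells:
        for paragraph in cell.paragraphs:
            for run in paragraph.runs:
                run.font.size = Pt(7)

# Add an anchor paragraph at the end of the report, then insert a copy of the
# table's underlying XML element right after it.
anchor = report.add_paragraph()
anchor._p.addnext(copy.deepcopy(table._tbl))
report.save("output/final_report.docx")
```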
Here I'm going to ask it to make it Wingdings. Let's see if it knows how to do that. It gives me some new code, and I run it. Yes, I can get Wingdings output. Just to make sure it didn't give me gibberish, I copy it all and make sure that it translates back into more easily readable text, and it does. That's what we're looking for. Not only Python, but also Visual Basic. Another thing we run into is with the output: you might have a 500-page report, and you want the headers mapped to Heading 1 and Heading 2 in Word so that you can get a table of contents. What we told GPT-4 is: I'm going to use these delimiters and script them into my outline box titles. I get h1 and close h1, h2 and close h2, for Heading 1 and Heading 2, and I want to substitute those in. When we gave that to GPT-4, it gave us this macro file, which I'm going to copy. I'm not sure if it was retained in my Microsoft Word; we'll find out, and if not, I'll paste it in. You have to enable the Developer tab, which is a Word option. When you do that, we get this, and we just paste in our macro here, which was generated by GPT-4. I don't like programming in VBA, but GPT-4 is good at it. When I hit Run on this, it takes all these headers and fills them in. That way I can go up here, choose References, Table of Contents, and insert my table of contents. If I generated a 500-page report, now I've got the right structure for all of it. It's easy to go in here, too: if you want to add numbers to these headers, you just right-click, go to Modify, and turn on the numbering structure if you want. The last thing I'm going to show for GPT-4, specifically for JSL, is that you might have noticed in my script back here, I'm really bad about commenting my script. I've got a couple of comments in here, but for the most part I don't have a lot of comments describing it. If you have code, and you don't want to give it anything proprietary, but if you give code to GPT-4, even JSL, it will add a bunch of nice comments for you and explain what each code block is doing, so that other readers can follow it. Also for Python: if you don't know Python and you're taking a look at the script we've provided and wondering what it does, you can give the code to GPT-4 and say, "Explain this Python code to me." It will give you a long narrative and say, "Here is exactly the workflow of what's happening, and here are the key functions doing it." My favorite part is that you can say, "Do you have any suggestions for this code?" It will say, "You don't have any error handling. You've got some memory leak issues." It goes through and makes a bunch of suggestions, and then it's nice enough to give you new code with all the suggestions implemented. With all this in place, you can go from doing all of your calculations in JMP to actually getting a nice output file in the format you want. Everything looks right, and you're not going through making manual changes.
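The macro from the talk is VBA, but the same delimiter idea can be sketched in Python with python-docx. This is a hypothetical illustration, not the generated macro: it assumes the outline box titles were scripted out with literal <h1>...</h1> and <h2>...</h2> markers around them, and it applies the built-in Word heading styles.

```python
# Hypothetical Python analogue of the VBA heading macro described above.
# Assumes headings were exported wrapped in <h1>...</h1> and <h2>...</h2> markers.
from docx import Document

doc = Document("final_report.docx")

markers = {"<h1>": ("</h1>", "Heading 1"),
           "<h2>": ("</h2>", "Heading 2")}

for paragraph in doc.paragraphs:
    text = paragraph.text
    for open_tag, (close_tag, style_name) in markers.items():
        if text.startswith(open_tag) and text.endswith(close_tag):
            # Strip the delimiters and apply the corresponding Word heading style
            paragraph.text = text[len(open_tag):-len(close_tag)]
            paragraph.style = doc.styles[style_name]

doc.save("final_report_with_headings.docx")
```

Once the heading styles are in place, Word's References > Table of Contents picks them up the same way it does after the macro runs.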
With this in place, the customer took a look at it and said, "This is beautiful." With that, we hope you can take this same idea and go make some beautiful reports yourself.
Tennis is one of my favorite sports. The 'big three' of Federer, Nadal, and Djokovic are my favorites; they elevated modern tennis to new heights through their rivalry. But it looks like Alcaraz is in line to take the throne. It is fun to compare their records and present the results at the JMP Discovery Summit in Indian Wells, which is home to the Indian Wells Open.   I started with Association of Tennis Professionals (ATP) data from 2022 for my project. Next, I combined over 25 years of data that include all match records of Federer, Nadal, Djokovic, and Alcaraz. Using JMP's powerful analytical and visualization tools, this report provides insight into questions such as: Who won the most matches in a year? What factors influence players winning matches? And most importantly, who is the GOAT?     Hi, everyone. My name is Jianfeng Ding. I'm a research statistician developer at JMP. Today, I'm going to show you how I used JMP to explore tennis data and find out who is the GOAT, the greatest of all time in men's tennis. First, I would like to give you some background on why I chose this topic. When I heard that the JMP Discovery Summit 2023 would be held in Indian Wells, I got excited, as tennis is one of my favorite sports and my youngest son plays varsity tennis at his high school. I have watched a lot of tennis over the years. Indian Wells is home to the Indian Wells Masters, which is often called the fifth Grand Slam. I thought it would be fun to use JMP to explore and analyze the tennis data and present the results to our users at the Indian Wells Discovery Summit. The second motivation comes from JMP itself. JMP has grown bigger and richer in many ways, and there are so many wonderful features created by my colleagues. I would like to keep myself up to date with these cool new features by applying them to a project. Right now, you are seeing one of them, Application Builder: instead of using PowerPoint, I'm using JMP Application Builder for today's presentation. My presentation mainly includes two parts. I will take you on a tour to explore the ATP data from the year 2022. ATP stands for the Association of Tennis Professionals, which is the world governing body for men's tennis. Then we will look at 25 years of combined ATP data to find out who is the GOAT. First, where do I get the data? I got the data from the web, from a GitHub repository created and maintained by Jeff Sackmann. He is a software developer working in the field of sports statistics. The site contains ATP match data from 1968 to the current year, and women's tennis data is available there as well. What does the data look like? Here is the data from the year 2022. It consists of about 49 variables with about 3,000 observations. Each observation represents a match played on the ATP tour. The yellow section contains the variables about the tournaments, and the blue section contains the variables about the players. Each observation is a match, so usually the variables come in pairs, one for the winner and one for the loser. Let's look at the variables about the tournaments first.
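As a quick aside before the tournament variables: if you want to pull the same files outside of JMP, a short pandas sketch like the one below works. The repository layout, the atp_matches_<year>.csv naming, and the column names are my assumptions based on how Jeff Sackmann's files are usually organized, so check the repo before relying on them.

```python
# Sketch: load one season of ATP match data from Jeff Sackmann's GitHub repository.
# File naming and column names are assumptions; verify against
# https://github.com/JeffSackmann/tennis_atp before use.
import pandas as pd

url = "https://raw.githubusercontent.com/JeffSackmann/tennis_atp/master/atp_matches_2022.csv"
matches = pd.read_csv(url)

print(matches.shape)                            # roughly 3,000 matches x ~49 columns
print(matches["surface"].value_counts())        # hard vs. clay vs. grass tournaments
print(matches["tourney_level"].value_counts())  # tournament level codes
```

Now, the tournament variables.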
I built a Graph Builder view of tournament name, tournament surface, and tournament level. From the tournament names, the tournaments with more players sit at the top. The Grand Slams, the Australian Open, Roland Garros, the US Open, and Wimbledon, are the largest and most prestigious tournaments. Last year there were about 145 tournaments. We can also see the typical surfaces for the tournaments: clay, grass, and hard, and usually there are more hard-court tournaments than grass or clay. There are also five levels of tournaments, with the definitions shown here: A, D, F, G, and M. G stands for Grand Slam and M stands for Masters; the Indian Wells Masters is a Masters-level tournament. D stands for Davis Cup, and A is the ATP Tour. Next, let's look at the variables about the players. I ran Graph Builder again. The plot on the left shows me which country has the most players. On the right, it shows the players' hands: do they play right-handed or left-handed? You will see that most players are right-handed. I also wanted to find out which country has more top-ranked players, so I created this winner-rank filter that I can slide, and the countries with more top-ranked players pop up. I was interested in the top 100, and the US sits at the top; that means the US has more top-ranked players than any other country. Then what about the top 10? You can either slide or type in the number. From this, Spain popped up at the top, and as I hover over, I see Carlos Alcaraz and I also see Nadal. When I click the US, I see the player Taylor Fritz, who is ranked number nine. You can also see from the handedness side that Nadal is the one left-hander among these top 10 players. Now let's move on to check the players' age, height, and ranking. For last year, the players' ages actually range from 17 to 42. In this graph, I only listed the top 10 with their average ranking. From this I find their average height is around 6'2", which is very common for male tennis players. I also find that Rafael Nadal and Novak Djokovic are the oldest on this list. Now, let's look at the winning statistics, because I would like to see who won the most matches in 2022. I find that Tsitsipas is listed as number one. But something is missing: where are Rafael Nadal and Djokovic? I couldn't find them in this top 10 list of who won the most matches. This reminded me that maybe I should look at their winning ratio instead of just the number of matches they won. I did some summary statistics and found their winning ratios. Yes, you immediately see that Novak Djokovic, Rafael Nadal, and Carlos Alcaraz are the top three players with the highest winning ratios, although the number of matches they won is not as high as Tsitsipas's. I also noticed there are two players who have a pretty decent winning ratio, but they don't play many matches.
They  only  won  three  matches. Who  are  they and  what  type  of  tournament do  they  play? I  drilled  down  into  the  data and  I  find  out  one  player's  name  is  Kovacevic and  all  his  three  matches  coming  from  tournament  A  level and  the  player, Safwa his  all  three  matches coming  from  Davis  Cup. From  this  graph,  you  definitely  know the  tournament  level   will  affect  the  winning. Ultimately,  you  care  about  who  won the  most  championship  or  tournament  wins. This  graph  put  all  three   relative  statistics  in  one  plot. The  down  you  will  see   how  many  matches  they  win and  the  second, the  green  bar  means   what  are  their  winning  match-win  ratio? The  top  will  show  you how  many  total  championship  they  won  in  2022. I  see,  Djokovic,  Carlos Alcaraz   and  Rafael  Nadal. I  also  see  one  guy who  I'm  not  familiar  with,  and  his  name, hard  to  say,  but  let  me  call  him  FAA. FAA  doesn't  have  amazing  winning  ratio,   but  he  did  won  five  titles. Again,   I  drill  down  to  the  data  and  find  out all  FAA's  winning  title  coming  from  A-level  tournaments. You  look  at  Djokovic  or  Alcaraz  and  Nadal, they  are  championship  not  only  from  A-level  tournaments and  also  from   grand slam  and  a  Master  level. Again,   we  show  tournament  level  effect  winning. Let's  look  at  the  seed. What  does  seed  play in  the  players'  winning? I  have  to  point  out  the  players'  seeds actually  will  vary  over  the  years. But  in  general,  the  higher  seeded  players tend  to  win  more  matches and  more  tournaments. Grand slam  winner  usually are  highest  seeded  players. But  in  2022,  only  two  people  are  exception. One  is  Carlos  Alcaraz and  the  other  is  Taylor  Swift. Sorry,  it's  about  Taylor  Fritz. You  can  see  here,   Carlos, he  succeed,  start  low, but  he  won  the  Miami  Masters. This  helped  him  move  to  the  top. In  the  end,  year  of  2022, he  was  ranked  as  the  number  three  seed, and  he  was  able  to  win  the  US  Championship. Taylor  Fritz,  he  actually  won  the Championship  of  Indian  Well,  Master  2022. We  can  see  the  seeds  definitely  affect  the  winning. Now,  let's  look  at  the  comparison between  the  winner  and  the  loser. In  this  ATP  data,  there  is  a  section  list about  to  serve  statistics and  come  with  a  winner  and  a  loser. There  are  seven  variables related  to  the  serve  statistics. I'm  interested  in  this  first  one. What  it  is? The  first  one  means number  of  points  won  on  first  serve. I  click  and  build  a  plot. Instead  of  I  plot  all  those   absolutely  the  number  of  the  point I  use  the  ratio because  the  point  will  depend  on   how  long  you  played  your  matches. With  the  ratio  would  make  more  sense. The  blue  colored  represent  the  first  serve  percentage  won coming  from  the  winner and  the  pink  is  coming  from  the  losers. Actually,  majority  of  the  first  serve percentage  won  between  60%  and  90%. But  the  blue  color   shaded  more  to  the  right, indicating  winner  have  higher   first  serve  percentage  won . Next  I  would  like  to  be  interested  to  see the  variable  is  BPs  saved  and  BP  faced. BP faced  means  a  breaker  point  faced. For  if  you  serve  and  you  face  the  breaker  point, that  means  you  give  your  opponent  opportunity  to  break  you. 
You'd better not face break points. Instead of plotting them separately, my son suggested I convert them into break points converted, a variable defined as the difference between break points faced and break points saved. Then again, we can see the blue color shaded more toward the left, indicating that winners face fewer break points and save more break points, while the pink indicates that losers tend to face more break points and save fewer. With all these statistics and variables I have shown you, ultimately I would like to know: can I build a model? Can I predict who is going to win and how much they will win? I built a summary table. As I showed you, all this ATP data comes as matches, and a player can have many matches, so I just used Tabulate to do the summary statistics. I got the tournament wins for each player, their average match-winning ratio, their height, and their average seed. I wanted to find the correlation between the variables and tournament wins. Clearly you can see the match-winning ratio is highly correlated with tournament wins, and so is the winner's seed. I also defined a variable I call div rank: when you face a weak opponent versus a strong opponent, your winning odds can be different, so I take the difference in rank and introduce this variable into the model. You also notice correlations among the variables involving height. I just happened to notice that when you're taller, you tend to have a better ace rate and better first-serve points won; the taller player definitely has an advantage at serving. I brought all of this into the Fit Model platform. I first ran a least squares model and got the conclusion that the winning ratio and the winner's seed definitely affect how many tournaments you can win. I also thought, this is count data, how many tournaments you will win, so maybe I should use a Poisson distribution. I ran that and got a similar conclusion: the winning ratio and the winner's seed are very important variables. But I have to point out that although I showed you earlier that tournament level plays a very important role in winning, the format of the data made it hard to put into the model; it would need a lot of data manipulation. Plus, instead of looking at just one year of ATP data, maybe I should look at more in order to build a complete, good predictive model. I will keep this in mind for future research. With all the statistics and variables I have shown you so far, let's get back to the topic: who is the GOAT? I created a script to get the data for the past 25 years, since Federer started early, and I wanted to include all the matches these players have played. I would like to find out who won the Grand Slam titles and who won Indian Wells. This script goes to Jeff Sackmann's site, fetches the data, does the analysis, and generates the report.
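The script in the talk is JSL, but the fetch-and-summarize step can be sketched in pandas as below. The year range, the player name strings, and the column names are assumptions for illustration, based on the usual layout of the same GitHub repository.

```python
# Sketch: combine ~25 years of ATP matches and summarize match wins and win ratio
# for the four players discussed. Column names are assumed from the repo layout.
import pandas as pd

base = "https://raw.githubusercontent.com/JeffSackmann/tennis_atp/master/atp_matches_{}.csv"
matches = pd.concat((pd.read_csv(base.format(y)) for y in range(1999, 2024)),
                    ignore_index=True)

players = ["Roger Federer", "Rafael Nadal", "Novak Djokovic", "Carlos Alcaraz"]
wins = matches["winner_name"].value_counts()
losses = matches["loser_name"].value_counts()

summary = pd.DataFrame({"wins": wins, "losses": losses}).loc[players].fillna(0)
summary["win_ratio"] = summary["wins"] / (summary["wins"] + summary["losses"])
print(summary.sort_values("win_ratio", ascending=False))
```

Back to the report the script produces.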
You  can  see  2023, Alcaraz  won  both   Indian  Wells  and  Wimbledon and  Novak  won  Australia  Open and  Roland  Gallos. As  the  list  moved  down, you  pretty  much  see  their  name,  Djokovic Nadal  and  Federe r,  so  on. It's  almost  for  the  last  20  years, these  three  are  dominant. As  I  keep  moving  to  the  bottom,  finally,   I see  Andre  Agassi  and  Pete  Sampras, who  are  my  favorite  player  in  '90s. Also  you  see  these  three  guys, Djokovic,  Federer,  and  Nadal,  they  sit  on  the  top. This  include  a   grand slam  title  and  Indian  Wells  title. I  truly  believe  these  three  guys,   they  move  the  modern  tennis  to  high  level. Now,  let's  look  at  again, look  at  the  match  wins,  winning  ratio, tournament,  and  Grand  Slang  title. I  would  like  to  see  the  more  detail. The  green  bar  here,  the  bar  itself represent  their  match  winning  ratio. But  I  like  Graph  Builder's  feature. It  allowed  me  to  put  their   number  of  winning  matches  on  the  top. Then  you  can  see, although  their  winning  ratio is  very  close, they  all  like  above  80. But  Roger  Federer  won  the  most  matches  over  1,263. You  move  to  the  top   and  you  will  see  those  green  bars means  how  many  tournament  championship each  of  them  have  won. Again,  Federer  won  the  most. Then  you  look  on  the  blue  top, you  will  see  that Djokovic won  the  most,  23   grand slam  titles. Next,  I  want  to  check  on  their  ranking. These  four  lines not  only  show  their  ranking  over  the  years but  also  show  their  incredible  professional  tennis  career. Federer  started  early  in  2001. It  took  him  about   three  years  to  move  to  the  top, but  he  stayed  at  the  top for  a  long  time,  18  years. You  look,  Nadal  and  Djokovic, they  move  very  quickly  to  the  top and  also  they  stay  at  the  top  for  a  long  time. The  dip  here  usually  either  means  they had  injury  or  had  a  surgery  to  recover. I  know  Nadal  is  right  now  in  the  recovery  period because  he  just  had  a  surgery and  Djokovic  continue  to  play. I  truly  believe  that  those  two  lines will  continue  to  grow  for  a  while. For  Alcaraz,  he  just  started. We  will  see  if  he  will  follow the  same  trajectory  as  the  big  three. I  would  like  to  show  you  more  detail about  the  individual   grand slam  matches. Look  at  this  plot  on  the  left. This  show  in  the  past  25  years, how  many  grand slam  matches  Federer  has  played. Total  434   grand slam  matches. He  won  373  matches  and  he  lost  61  matches. That  bring  him  to  the  winning  ratio  is  86 %. It's  amazing. The  right-hand  plot, actually  a  plot,  his  opponents  ranking. I  want  to  show  it's  difficult. Usually  when  your  opponent  has  a  high  ranking, that  means  tough  to  win  the  match. The  red  dot  here   all  represent  the  winning  matches and  the  blue  dot  here  represent  the  losing  matches, and  the  square  indicate  the  final  matches. These  are  all   grand slam  matches. You  look,  most  of  the  Federers'  opponent  is  all  high  rank  player and  only  the  few,  I  guess  he  was  lucky. He  was  able  to  play the  opponent  with  low  rank. We  also  can  look  like  how  his performance  in  each   grand slam as  I  click  Wimbledon,  you  will  see, Federer  won  a  lot  in  Wimbledon. 
Then  let  me  click  the  one  for  the  Roland-Gallos and  in  Federers'  entire  career and  he  only  won  once  in  the  Roland-Gallos. That  was  the  year  2009. The  other  day,   he  pretty  much  lost  to  Nadal. Let's  see  what  happened  in  2009. I  bring  Nadal's  record and  I  particularly  look  at  Roland  Gallos. You  pretty  much  see  all  the  red  square. That  means  he's  the  championship of  the  Roland  Gallos. He  only  lost  four  matches,  included  this  one  in  2009, in  the  semifinal,  he  lost. That  was  the  year,  actually, Federer  was  able  to  win  the  championship. I  will  skip,  Novak  and  Carlos, and  I  will  bring  you  the  overview of  all  these  four  guys'  performance in  all  the  four   grand slams. If  I  look  at  each  one  for  the  Australian, you  pretty  much  see  Novak  Djokovic  is  dominate. Then  if  you  look  at  the  Roland-Gallos, Nadal  is  dominate. For  the  US  Open,   they  all  have  won  the  US  Open. I  guess  US  Open   provide  opportunity  for  all  of  them. If  you  look  at  the  Wimbledon, I  think  Federer  and  both  Djokovic, they  both  did  pretty  well  in  Wimbledon, but  Federer  still  win more  than  Djokovic. I  wanted  to  finally  look  at their gra nd  slam  winning  ratio. From  this  plot,  it  shows  me,  yes, Djokovic won  the  most  grand  slam  title. Also  you  look  at  the  winning  ratio, overall,  Djokovic  has  highest or similar  like  the  Rafael  Nadal. Almost  in  every  category, you  can  see  Djokovic  has  higher  winning  ratio, except  for  the  Clay, the  Roland-Gallos,  Nadal,  is  the  best. I  would  say  just  based  on  winning  most  grand  slam  title and  highest  match  ratio, Djokovic is  the  goat. Next,  we  would  like  to  find  out who  is  the  youngest  among  four  of  them winning  the  grand  slam  title? That  was  Nadal. I  think  he  was  only  18.9, he  won  his  first g rand  slam  title. Alcaraz  at  age  19.3  won  his  US  Open. Although,  Djokovic  and  Federer  won  their  first  title  in  their  20s. But  you  look  at  their  long, amazing  career,  even  at  age  36, both  of  them  still  were  able to  win  the  grand  slam  title. I  think  that  Djokovic  will  continue  to  win. I  think  he  will  have  more  title under  his  belt. I  also  look  at,  they  definitely played  with  each  other. I  wanted  to  see  their  net  win  with  each  other. Rafael Nadal,  if  you  look  at Rafael  Nadal  against  Roger  Federer, so  Rafael  won  24  and  then  Roger won  against  Rafael  is  17. That  bring  their  net... Rafael  has  seven  net  wins  against  Roger. Novak Dj okovic  has  five  net  wins over  Federer  and  one  net  win  over  Nadal. Even  based  on  net  wins, I  think  Djokovic  is  a  goat. I still  would  like  to  see  their  serve statistics  because  from  that  ATP  data, this  is  the  data  more  related  to  their  techniques. I  put  all  these  variables  into  the  one  way  and  utilize  the  fit  group. With  such,  you  can  see  there's  a  lot of  the  data,  the  sample  size  is  bigger. With  all  the  data  together,  it  seems  that Djokovic  has  a  better  serve  statistic  than  the  rest  of  them. But  I  realize  this  is  big  sample  size. Sometimes  the   large  sample  size  can  transform  a  small  difference, become  a  statistically significant  difference. I  would  rather  to  see  the  subset. I  look  at  like  a  small  sample  size and  I  look  at  Wimbledon. 
Yeah, and in Wimbledon, I can still draw the conclusion that Federer is a little bit better than the rest of them. But once I look at the other Grand Slams, like the Australian Open, I cannot draw the same conclusion. Overall, I think their techniques are very, very similar; their success rates for serving are very close. With all the statistics and variables I have shown you, going by winning the most Grand Slam titles and having the highest match-winning ratio, Djokovic is the GOAT. However, statistics don't paint the entire picture, as a player can have a much larger impact than just statistics, such as the way they play the game, their love for the game, and especially who the player inspires, like the young kids who aspire to be just like their idols, including my son, whose dream is to play the Easter Bowl, a prestigious youth tournament at Indian Wells. In the end, it was an honor and a privilege to watch these three great players play tennis all at the same time, and the future looks bright for more great tennis to watch, as players such as Carlos Alcaraz and others look to follow in the Big Three's footsteps. I had so much fun doing this project using features such as Graph Builder, dashboards, and Application Builder in JMP. These features allowed me to easily explore a big data set and quickly identify atypical observations. Dashboards not only put different analyses in one report, but also allowed me to stay in the report and rerun an analysis after a modification. Application Builder allowed me to present the project without having to use PowerPoint. Although this project mainly analyzes ATP men's tennis data, the analytical tools and the workflow can easily be applied to women's tennis data, as well as to any data set with patterns in other fields. If you have any questions, please feel free to contact me. Thank you.
You have a question to answer, so you collect the relevant data and are ready to start creating a predictive model. But what which type of model do you choose and which modeling type? Is the goal to segment, predict, explain, or identify? Are the variables continuous, nominal, or ordinal?  Before we even get to choosing a type of model, we must define how the variables are used in analysis with the modeling type. What happens if we treat number of bedrooms as continuous versus a nominal or ordinal modeling type? We see when we pull up a distribution how modeling type impacts the type of summary statistics we get as means or frequencies. This poster demonstrates how the modeling type determines the results of your predictive model depending on which type of model is chosen. Get ready to play “Name that Analysis” as we go head-to-head on classifying different case study modeling examples with their respective modeling type.     Hello,  my  name  is  Andrea  Coombs,   and  I'm  joined  by  my  colleague, Olivia  Lipp incott. Olivia  and  I  have  given  a  presentation before  called  What  Model  When. If  you  want  to  take  a  closer  look at  that  presentation, you  can  take  a  look at  the  link  in  the  community  post. But  today,  we  want  to  talk about  something  a  little  bit  more. Yeah,  today  we  want to  think  about  modeling  type and  how  modeling  type  impacts  the analysis for  each  of  the  four  model  goals that  we  talked  about  previously. Right,  and  we're  actually  going to  use  the  same  data. This  is  data  that  we  pulled from  Redfin  that  represents the  housing  market  in  the  Cincinnati  area. Here  we're  trying  to  look at  the  price  of  homes  relative to  their  square  footage, the  number  of  beds, the  number  of  baths, and  so  on  and  so  forth. Previously,  we've  answered  the  question, what  model  when? It  really  depends  on  what  model you're  going  to  choose based  on  your  goal  for  the  analysis. For  segment,  we're  trying to  examine  relationships where  there's  no  intended  response; explain,  we're  trying to  explain  a  relationship and  look  at  the  underlying  factors and  how  those  affect  the  response; predict,  we're  trying to  predict  future  outcomes or  the  response  in  new  situations; and identify,  we're  trying to  find  important  variables. Right. Now  let's  bring the  modeling  type  into  the  picture. Both  your  responses  and  your  factors can  have  different  modeling  types. In  JMP,  there  are three  main  modeling  types: continuous,  nominal,  and  ordinal. Continuous  modeling  type  is  represented by  this  blue  triangle  icon  here, and  this  refers  to  numeric  data  only. The  nominal  modeling  type is  represented  by  this  red  icon, and  this  is  numeric  or  character  data where  values  belong  to  categories, but  the  order  is  not  important. For  the  ordinal  modeling  type, it  is  represented  by  this  green  icon, and  this  can  be  either  numeric or  character  data  as  well. But  in  this  case,  values  belong to  ordered  categories. When  you're  doing  an  analysis  in  JMP, you  want  to  make  sure  you  set  up the  correct  modeling  type, because  JMP  will  do the  correct  model  for  you, will  do  the  correct  analysis depending  on  modeling  type. Andrea,  I  have  a  game  for  us  to  play. It's  called  Name  That  Analysis. Do  you  want  to  play? Absolutely. I  love  games. Awesome. 
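Before the first question, a quick aside on the point the abstract makes about modeling type and summary statistics. The sketch below is a Python analogue of pulling up a distribution in JMP: the same "beds" column gives means and quantiles when treated as continuous, and frequencies when treated as a category. The data frame and values are hypothetical stand-ins for the Redfin table.

```python
# Sketch: the same "beds" column summarized as continuous vs. categorical.
# The data frame and column names are hypothetical stand-ins for the Redfin data.
import pandas as pd

homes = pd.DataFrame({"beds": [2, 3, 3, 4, 4, 4, 5],
                      "price": [250_000, 310_000, 295_000, 420_000,
                                450_000, 405_000, 610_000]})

# Continuous modeling type -> means, quantiles, etc.
print(homes["beds"].describe())

# Nominal/ordinal modeling type -> frequencies per category
print(homes["beds"].astype("category").value_counts().sort_index())
```

Now, on to the game.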
Here's  your  first  question. We  want  to  identify which  features  of  a  home are  most  important to  determining  the  price. For  example,  square  footage and  number  of  bathrooms can  explain  a  large  amount of  the  variation  in  price, but  other  features  are  less  important. All  right,  Olivia. I  think  you're  making this  first  question  easy  for  me. Is  the  answer  identify? Let's  see. Yeah,  you're  right. I  did  make  that  one a  little  bit  easy  to  get  us  going, but  that  is  identify to  find  important  variables  within  there. There's  a  couple of  different  places  in  JMP where  we  can  use  tools  to  identify if  that's  our  modeling  goal. Under  the  Analyze  menu  under  Screening, Predictive  Modeling  and  Fit  Model, using  tools  like  Predictor  Screening, Bootstrap  Forest, Generalized  Regression and  Stepwise  Selection. For  modeling  type, when  we're  looking at  the  goal  of  identify, it's  not  going  to  affect  things  much. JMP  is  going  to  do  the  correct  analysis as  long  as  your  modeling  types are  set  appropriately. We  took  a  look  at  this  and  we  took both  the  response  and  the  factors and  changed  them from  continuous  to  nominal and  looked  at  how which  factors  came  up  as  most  important. While  the  order  of  the  factors  varied, the  dominant  factors  stayed  the  same. All  right. It  looks  like  if  our  goal is  to  identify  important  factors, really,  the  exact  modeling  type we're  using isn't  impacting  things  that  much, it  looks  like,  Olivia. Right. Our  conclusions on  which  variables  are  important aren't  going  to  change  much based  on  the  modeling  type. All right. Well,  that  is  good  to  know. I  have  a  question  for  you. Are  you  ready? I'm  ready. All  right,  here  is  your  question. Let's  say  we  want  to  build  a  model to  predict  house  prices. This  model  will  be  based  on  many  important predictor  variables  we  have  in  our  data. For  example,  we  want  to  predict the  price  of  a  house that  we  want  to  put  on  the  market. Which  goal  do  you  think we're  working  with  here? Okay,  so  it's  not  like  question  one where  we're  trying  to  see which  factors  are  most  important to  predict  housing  prices. We're  just  really  trying  to  get that  final  housing  price  prediction. I'm  going  to  go  with  predict. All  right,  let's  see  if  you're  right. Yes,  you  are  right. The  goal  of  this  analysis  is  predict. There's  lots  of  different  platforms  in  JMP where  you  can  build  models  for  prediction. Within  each  of  those  platforms  in  JMP where  you  can  build  the  prediction  models, JMP  will  do  the  correct  analysis  for  you, depending  on  the  modeling  type of  your  response. Here  we  have  a  table of  different  modeling  types for  our  responses: continuous,  nominal,  and  ordinal. For  a  continuous  response, this  is  the  typical  one that  we  were  talking  about,  right? We  want  to  predict  the  price  of  a  home that  we're  going  to  put  on  the  market. Now,  when  we're  building  this  type of  model  with  a  continuous  response, well,  we  want  to  know how  powerful  that  model  is. What's  the  predictive  power  of  that  model? We  can  use   RSquared and  the  Root  Average  Squared  Error to  diagnose  that  model. Now,  for  a  nominal  and  ordinal  model, it's  a  little  bit  different. 
For  a  model  with  a  nominal  response, we  have  categories  as  the  response. In  this  example,  we're  looking at  whether  or  not  the  price will  be  over  or  below  $1  million. That's  what  we  want  to  predict. For  the  ordinal  response, here  we  have  an  ordered  category. We  want  to  predict whether  the  price  of  the  house is  going  to  be  low, medium,  and  high. For  the  nominal  and  ordinal  examples, again,  we  can  look  at   RSquared and  Root  Average  Squared  Error to  evaluate  those  models. But  there's  other  things  that  we  can  use to  evaluate  those  models, like  the  misclassification  rate and  the  area  under  the  ROC  curve. Of  course,  our  favorite  tool  in  JMP to  take  a  look  at  our  prediction  model is  the  Prediction  Profiler. Let's  take  a  look  at  the  difference between  the  Prediction  Profiler for  the  modeling  types  of  our  responses. For  the  continuous  response, we  can  see  that  on  the  Y-axis, we  have  the  mean  prediction plus  or  minus  the  confidence  interval given  the  value  of  the  model  factors here  on  each  of  the  X -axes. For  the  nominal and  ordinal  logistic  models, what  we  see  on  the  Y -axis is  the  probability  of  the  response  being in  a  certain  category. For  the  nominal  logistic  model, we  have  the  probability that  the  house  is  either  going  to  be above  or  below  a  million  dollars. For  this  ordinal  logistic  model, we  can  see  the  probability  of  having a  low,  medium,  or  high  price. Okay,  so  it  sounds  like  the  goal of  what  we  want  to  predict is  also  important  when  we're  talking about  that  prediction  goal, whether  we  want to  treat  price  as  continuous and  get  the  predictions of  the  exact  prices  out  of  there, or  if  we  want  to  treat  it  as  a  category. Right. You  just  need  to  get that  response  variable  set  up and  your  data  set  the  correct  way, and  then,  of  course, assign  the  correct  modeling  type, and  JMP  is  going  to  build the  correct  model  for  you. All  right,  Andrea. Are you  ready  for  your  next  question? I'm  ready. Let's  go. Okay. We  want  to  quantify the  effect  on  home  prices from  additional  bedrooms. For  example,  on  average, every  additional  bedroom  adds about $ 97,000  to  the  total  home  cost. Adding  a  bedroom  adds  $97,000? Man,  Cincinnati  is  a  tough  housing  market. That's  crazy. All  right,  well, so  let's  see. What  are  we  trying  to  do  here? We're  trying  to  quantify  the  effect  here. I  think  what  we're  trying  to  do  is  explain that  effect  that  bedrooms  has on  the  price  of  a  house. I'm  going  to  say  explain. You're  correct. Yeah,  we're  trying  to  describe the  relationships. In  explain,  we  use  the  parameter  estimates taken  from  the  model  equation to  quantify  those  relationships between  the  factors  and  the  responses. Typically,  we  use  in  JMP under  the  Fit  Model  menu  location tools  like  Standard Least  Squares, Logistic  and  Ordinal  Regression, and  Generalized  Regression. Modeling  type  can  really  impact how  our  factored  relationship with  the  response  variable  is  interpreted. We  took  a  look, and  we  were  looking at  how  does  the  number  of  beds  affect the  housing  price? We  changed  beds  from  continuous, to  nominal,  to  ordinal, and  see  what  that  relationship  was. 
We  can  see  under  the  continuous, that's  where  we've  got that  every  additional  bedroom  adds $97,000  about  to  the  total  home  price. That  prediction  profiler  shows a  linear  relationship when  we  treat  beds  as  continuous. But  when  we  treat  beds as  nominal  or  ordinal, there's  not  that  straight linear  relationship  going  on. We  see  a  spike  in  price  for  4-5  bedrooms compared  to  going  from  2-3  bedrooms. Right. I  see  with  nominal  and  ordinal, the  prediction  profiler  looks almost  exactly  the  same, so  it  must  be  the  same  model. However,  I'm  seeing with  the  parameter  estimates, they  look  a  little  bit  different between  nominal  and  ordinal. What's  going  on  there? Yeah,  so  the  nominal and  ordinal  modeling  type, and  when we  use  that  within a  regression, is  treating... They're  coded  differently within  the  regression, so  the  parameter  estimates  are  different. For  nominal,  that  intercept, we  think  of  that  as  the  mean  house  price across  all  the  different  bedrooms, and  each  of  those  parameter  estimates are  how  much  that  number  of  beds  increases or  decreases  that  mean  house  price. But  for  ord inal, because  we're  looking  at  order  matters, we  think  of  the  intercept as  if  there  are  zero  bedrooms and  each  of  those  parameter  estimates is  the  effect  of  adding an  additional  bedroom  onto  the  price. All  right. Modeling  type  is  really  going to  affect  my  parameter  estimates. I  really  need  to  think  about exactly  what  do  I  want  to  explain as  a  part  of  this  model when  I'm  doing  this  analysis. Yes. All  right. Are  you  ready for  the  final  question,  Olivia? Yeah,  bring  it  on. All  right,  here's  the  question. Let's  say  we  want to  identify  groups  of  homes that  are  similar  based on  a  list  of  possible  characteristics. In  other  words, we  want  to  identify  market  segments based  on  things  like  square  footage, location,  number  of  bedrooms,  et  cetera. Which  goal  do  you  think  this  is? I  think  you're  trying to  trick  me  with  that  identify, and  I'm  not  going  to  fall  for  it. Okay. But  there  are  no  responses within  this  question. I  think  we're  looking  at  clustering. I'm  going  to  say  segment. Okay. Well,  you're  right,  Olivia. I  did  try  and  trick  you  a  little  bit because  I  really  wanted  to  win. But  you're  right, that's  the  key  thing  here, is  that  there  are  no  responses  here in  this  analysis. We  are  definitely  looking  at  segment. When  our  goal  is  segment, we  can  use a  couple  of  different  clustering  tools. We  can  do  Hierarchical  Clustering, K-Means C lustering, or  Latent  Class  Analysis. It's  important  to  keep  in  mind that  with  Hierarchical  Clustering, you  can  only  include... Sorry,  you  can  include all  of  the  modeling  types: continuous,  nominal,  and  ordinal. But  for  K-Means  Clustering, you  can  only  include variables  that  are  continuous. For  Latent  Class  Analysis, you  can  only  include nominal  or  ordinal  variables. In  our  case  here, when  we're  looking at  the  number  of  bedrooms, lot  size,  year  built,  and  square  feet, we  have  a  combination of  continuous  and  nominal  variables. Hierarchical  Clustering  may  be  the  best clustering  tool  to  use  in  this  scenario. 
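As a rough Python analogue of that last answer, here is a hierarchical clustering sketch. It is deliberately simplified: it standardizes and clusters only continuous columns, because handling a mix of continuous and nominal variables the way JMP's Hierarchical Clustering does would need a mixed-type distance such as Gower, and the column values here are hypothetical.

```python
# Sketch: hierarchical clustering on continuous home features (hypothetical values).
# Mixing in nominal variables would require a mixed-type distance (e.g., Gower),
# which scipy's linkage does not provide directly.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import zscore

X = np.array([[1200, 0.15, 1965, 3],
              [2400, 0.30, 1998, 4],
              [3100, 0.50, 2010, 5],
              [1800, 0.20, 1975, 3],
              [2650, 0.40, 2005, 4]], dtype=float)  # sqft, lot acres, year built, beds

Z = linkage(zscore(X, axis=0), method="ward")       # Ward linkage on standardized data
clusters = fcluster(Z, t=2, criterion="maxclust")   # cut the tree into 2 segments
print(clusters)
```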
It  looks  like  with  that  parallel  plot with  Hierarchical  Clustering, maybe  we  could  call  Cluster  6 Amazing  Location. Yes. If  you  think  a  large  lot  size is an  amazing  location, yeah,  we  can  definitely  call that  segment  Amazing  Location  Homes. Well,  all  right,  Olivia, despite  me  giving  you  a  trick in  that  last  question, it  looks  like  we  ended  up with  a  tie  here  again. We'll  have  to  rematch  again  soon. Absolutely. We  talked  about  what  model  when, and that  really, what  model  you  choose  depends on  your  goal  for  the  analysis, whether  it's  segment,  explain, predict,  or  identify. Yeah,  in  terms  of  modeling  type,  again, JMP  is  going  to  do the  correct  analysis  for  you, especially  with  your  responses. If  you're  setting  them  up with  the  correct  modeling  type, JMP  is  going  to  do the  correct  analysis  for  you. If  your  goal  is  explain, you  might  need  to  think  a  little  bit about  which  modeling  type  to  use, depending  on  how  you  want  to  explain the  effect  of  something like  the  number  of  bedrooms. Thank  you,  Olivia. This  is  so  much  fun. Let's  do  it  again  next  year.
Cleaning large systems effectively is critical to many applications within industrial manufacturing. Establishment of a system's hold-up volume and the target concentration of residual contaminant after cleaning do not fully define a cleaning method, however. Assuming a cleaning method that repeats the same flush volume until the system has reached its target contaminant level, viable cleaning schemes will differ in how many flushes of the volume used to reach that target. By creating a graph showing "Flush volume" as a function of "Number of flushes needed" with editable table variables of "System hold-up volume" and "End contaminant level" in slider boxes, the user can visualize many different scenarios in many different systems before selecting the scheme that best fits their varied needs.      Hello,  my  name  is  Margaret  Wilson-Moses, and  I  work  at  Inpria  Corporation as  a  Senior  Product  Engineer. Inpria  Corporation is  up  in  Corvallis,  Oregon. Inpria  designs  and  manufactures metal  oxide  photoresists to  unlock  the  full  potential of  extreme  ultraviolet or  EUV  photolithography. At  its  most  basic  level, Inpria  is  made  up of  chemists  and  engineers. The  chemists  at  Inpria  are  the  ones developing  and  designing  the  chemistries, testing  them  and  refining  them to  behave  how  they  want, and  then  taking  those  small  formulations and  giving  the  best  designs to  the  engineers. The  engineers  scale  up  those  designs to  manufacture in  our  high- volume  manufacturing  facility where  we  have  tighter  control over  processing  and  higher  purity, and  that  finished  product goes  to  our  customers. Both  the  chemists  and  the  engineers at  Inpria  use  JMP  extensively. The  chemists  use  JMP to  design  their  experiments to  analyze  the  results of  those  experiments, often  using  hypothesis  testing. They  also  use  JMP  to  build  models, so  they're  better  able  to  predict how  certain  parameters  will  affect the  response  variables  they're  seeking. They  also  use  JMP to  present  their  findings within  their  teams  and  across  Inpria. Engineers  use  JMP to  work  up  our  process  data. We  have  sensors on  our  manufacturing  facility that  are  collecting  data every  two  seconds. We  have  a  large  amount  of  data and  we  use  automated  scripts  from  JMP to  pull  out  the  pieces  of  that  data that  we  need to  be  able  to  analyze  processing much  more  simply. We  use  control  charts to  look  at  process  parameters for  different  batches and  also  analyze  QC  metrics  of  our  product over  time  for  different  formulations. Those  QC  metrics  end  up on  our  certificates  of  analyzes. We  also  use  measurement  system  analyzes or  MSA  to  perform  gage  studies on  the  tools  that  we  use to  collect  the  quality  control  metrics. We  also  join  data  from  various  sources, as  do  the  chemists, because  different  groups  at  Inpria are  always  collecting  data, so  it's  good  to  be  able to  have  a  way  to  join  the  data and  look  for  correlations between  those  data  sets. Finally,  we  also  use  JMP to  present  our  data both  internally  and  to  our  customers. A  couple  of  years  ago, when  we  were  bringing  up our  high- volume  manufacturing  facility, I  was  tasked  with  the  goal of  developing  a  cleaning  scheme for  our  manufacturing  facility. 
Inpri a,  at  that  time,  and  in  a  way  today, cleans  our  manufacturing  systems by  dilution to  ensure  lot-to-lot  segregation. To  ensure  that  lot  A does  not  end  up  in  lot  B or  it's  limited  to  a  very  small  amount. I've  modeled  that  here. The  top  left, we  have  our  finished  product. When  we  transfer  that  to  bottles and  we're  done  wasting, we've  taken  all  we  can  take  out, we're  left  with  a  little  bit  of  heel in  our  system. That  heel  cannot  be  drained or  flushed  out. It's  just  unreachable  by  the  pumps. To  that  heel,  we  add  a  flush  that  doesn't contain  any  of  the  starting  material and  we  end  up  with  a  diluted  heel. We  waste  as  much  as  we  can and  end  up  with  the  same  volume  of  heel, but  at  a  lower  concentration  of  X. We  repeat  this  process, flushing  and  wasting,  flushing  and  wasting until  we  have  the  concentration that  we  need  at  the  end. Further  on, I'll  model  this  by  calling  X₀, the  concentration  of  X in  the  finished  product or  whatever  the  heel  starts  at, so  which  is  going  to  match the  final  product. Then  X₁  is  the  concentration  of  X after  the  first  flush. X₂  is  the  concentration after  the  second  flush, and  so  on  up  through  Xₙ. The  flush  volume  here,  called  F, is  the  same  volume  repeatedly throughout  the  process. My  objective  when  approaching  this  problem was  to  develop  a  dynamic and  user-friendly  visual  model to  show  to  my  team how  different  variables  in  this  problem a ffect  the  potential  cleaning  schemes we  could  use  with  the  overall  goal of  supporting  selection  of  a  scheme that  would  fulfill  all  our  criteria. Flushing. Cleaning  by  dilution  can  be  modeled with  this  equation  here, which  I  just  pulled  from  the  math that  comes  out  of  the  previous  slide taken  to  N  flushes. Xₙ,  the  amount  of  X  remaining after  N  flushes e quals  to  the  variables  on  the  right. So  where  H  is  the  volume  of  heel, F  is  the  volume  per  flush  in  the  cleaning, N  is  the  number  of  flushes, and  X₀  is  the  initial  amount  of  X. This  equation  does  importantly  assume that  every  flush  and  heel are  perfectly  mixed  per  flush, but  we  do  have  experiential  data that  supports that  that  is  indeed  the  case,  thankfully. We  are  also  assuming  that  the  heel  volume is  consistent  every  time, so  we're  wasting  to  the  same  exact  heel, and  that  the  flush  volume is  the  same  exact  volume every  single  flush. But  again,  we  have  experiential  data that  supports  that, so  we're  okay  with  these  assumptions. Functionally, we  are  actually  more  interested in  translating  this  equation to  be  a  function  of  N, the  number  of  flushes, and  F,  the  volume  per  flush, because  the  other  variables in  this  equation  are  pretty  much  set. Heel  or  H  is  constant per  manufacturing  system. It  may  vary  per  system,  but  it  is  known through  the  data  we've  collected in  the  past  to  calculate  what  it  is. X₀  and  Xₙ  vary  per  cleaning, but  again,  we  have  analytical  methods to  understand  both  X₀  and  Xₙ  per  cleaning. Those  are  our  known  variables. Flush  volume  and  number  of  flushes are  the  two  variables  that  we're  varying to  figure  out  a  cleaning  scheme. Those  are  the  ones  to  be  determined. We  have  a  few  boundary  conditions  also. 
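Written out from the definitions above (this is just the algebra implied by the flush-and-waste cycle, not a new result): each flush dilutes the heel of volume H into a total volume H + F, and draining back down to H multiplies the concentration by H/(H + F), so

```latex
X_{k+1} = X_k \,\frac{H}{H+F}
\qquad\Longrightarrow\qquad
X_n = X_0 \left(\frac{H}{H+F}\right)^{\!n},
\qquad
F = H\!\left[\left(\frac{X_0}{X_n}\right)^{1/n} - 1\right]
```

The rearranged form gives the flush volume needed for a chosen number of flushes, and it is the relationship the data table column uses later on. Now, the boundary conditions.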
The total amount of solvent used must be as low as possible. We didn't come into this with a hard limit, but more solvent used means higher cost, both in raw materials and in waste, so we want to limit it as much as possible. Another boundary condition is that we want to keep the number of flushes less than or equal to 20 in order to be operational; that's because we want to fit our cleaning within our production cadence. Fewer flushes is better operationally, but we have a hard limit at 20. Flush volume also has a boundary condition: it has to be less than or equal to the capacity of the system. In this case, I've set our capacity equal to 60 gallons, so the maximum flush volume you could have is clearly 60. If you take the equation I showed on the previous slide and these boundary conditions, along with the fact that we know H, X₀, and Xₙ, you can actually solve this equation. You can find F and N to satisfy the boundary conditions. But that misses the goal I had when I came into this process. To explain a little more about that, there are many reasons why going with a visual model in JMP was more advantageous for me than just working with the static equation. First off, the equation is not intuitive. I'm pretty good at math, and I can't see what this equation means in graph form just by looking at it, whereas with a visual model you can see how the different variables affect each other. The equation is also limited. You may have one H, one heel value per manufacturing system, but you have different manufacturing systems, so you may want to vary H and see what the different cleaning schemes are. The equation is fixed: you get one answer per set of input variables. With a visual model, you have a versatile relationship that you can see, and it covers infinitely many different cleaning schemes. The equation is also very individual. I crunch the math, I make the assumptions, I put the input values in, and then I tell my team how many flushes we should use and what the volume per flush should be. They can check my math, but they're not involved in that process at all; they just say, "Okay." The visual model I built, by contrast, is user-friendly and dynamic, so they can tinker around with it themselves, see how the variables affect each other, and come to their own conclusions, which also leads to more buy-in from the team when we end up choosing a cleaning scheme. Lastly, an equation is boring. I like math, and it's still more fun for me to play around with the graph than it is to use an equation. The visual is much more interactive. I'm going to move from here to sharing my model. I started with just a simple data table with three table variables. I have X₀ and Xₙ here, both in ppm. I've set X₀ to 1 million ppm, Xₙ is the target end level for the cleaning, and heel is the heel of the system. The number of flushes here is just an independent variable, an integer value between 1 and 35.
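Here is a minimal Python sketch of the two computed columns in that table over the same 1-to-35 range of flush counts, using the rearranged formula above. The default settings are the ones from the talk, and the numbers it prints (for example, about 3.9 gallons per flush and 78 total gallons at 20 flushes) match the values quoted later in the demo.

```python
# Sketch of the data table columns: flush volume and total solvent vs. number of
# flushes, from the dilution model X_n = X0 * (H / (H + F))**n rearranged for F.
H, X0, Xn = 5.0, 1_000_000.0, 10.0   # heel (gal), start and target concentrations (ppm)

for n in range(1, 36):                               # number of flushes, 1 through 35
    F = round(H * ((X0 / Xn) ** (1 / n) - 1), 1)     # flush volume, rounded to 0.1 gal
    total = round(n * F, 1)                          # total solvent used
    if n in (10, 18, 20):
        print(n, F, total)   # e.g. 20 flushes -> 3.9 gal per flush, 78.0 gal total
```

In the JMP table the same formula lives in a column formula driven by the three table variables, which is what makes the sliders work.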
I  did  go  beyond the  maximum  number  of  flushes just  to  get  a  better  view of  the  relationship. Then  the  flush  volume  column  here is  built  off  the  equation that  I  showed  previously. I  open  up  this  formula, you  can  see  that  this  is  the  same  equation I  shared  in  the  slide, and  it's  a  function of  the  table  variables. It's  a  function  of  these  table  variables here  that  are  brought  into  the  equation. I've  also  rounded  it  to  one  decimal  place because  our  control  system only  has  that  amount  of  sensitivity when  adding  flush  volume. Finally,  I  have  a  third  column  here which  is  the  total  solvent  used. This  is  simply  a  product of  the  first  two  columns. It  will  show  you how  much  solvent  you've  used per  cleaning  scheme. It  just  multiplies  the  number  of  flushes by  the  flush  volume to  give  the  total  amount  of  solvent  used. When  I  go  into  my  cleaning  model, it's  built  around  Graph  Builder. I'm  going  to  set  my  standard  here to  be  1  million  ppm  as  the  starting  value. Let's  say  our  target  is  10  ppm of  residual  X  at  the  end  of  cleaning and  our  heel  is  five  gallons. You  can  see  here  that  we  have  an  X on  the  graph  of  number  of  flushes and  then  there's  two  different  Y  values. There's  the  flush  volume  in  gallons and  the  total  solvent  used  in  gallons. I  also  have  a  reference  line  here  at  20 to  show  the  maximum  flushes that  are  operable. Each  of  these  points  reflected  here is  a  full  cleaning  scheme. If  I  select  this  point  right  here, this  is  saying  10  flushes. With  this  starting  value  of  X, this  goal  end  of  X  at  this  heel  value, and  you're  going  to  have the  flush  volume  of  10.8  gallons  per  flush in  that  cleaning  scheme. When  you  look  at  this  graph, all  of  the  red  points to  the  left  of  this  black  line are  valid  cleaning  schemes. Some  of  them  use  more  solvent, total  solvent,  and  some  of  them  use  less, but  they're  all  valid per  our  boundary  conditions. Then  I  have  at  the  right  here the  table  variables  in  slider  form, these  top  three  things, which  are  editable  either  by  a  slider or  by  the  number  edit  box  here that  are  connected  to  the  table  variable. Then  I   have  a  button  box  here which  will  take  all  this  information and  compute  it  and  give  you a  simple  answer  for  the  total  solvent  use. If  I  click  that  right  here, I  get  a  box  that  says, "Okay,  if  you  have  20  flushes, the  volume  per  flush  will  be  3.9 and  you're  going  to  use  a  total of  78  gallons  in  this  cleaning  flush." There's  a  few  things  we  can  get without  really  even  tinkering with  the  slider  variables  over  here with  this  visual  model that  we  couldn't  get  from  the  equation, like  I  said  before. First  off, you  do  see  this  negative  correlation between  number  of  flushes and  total  solvent  use. When  I  said that  you  could  solve  the  equation using  the  boundary  conditions and  our  known  table  variables, I  meant  that  if  you  are  trying  to  use the  least  amount  of  total  solvent and  you  have  a  maximum  of  20  flushes, then  your  best  scheme, assuming  you're  trying to  minimize  solvent  used, is  going  to  be  at  20  flushes. That's  due to  this  negative  correlation  here. 
The  more  flushes  you  used, the  less  total  volume  you're  using because  it's  negative  correlation. You  also  see  with  this  visual that  it's  an  exponential  decay. While  20  flushes  may  be  the  least  amount of  total  solvent  you're  using with  these  boundary  conditions, 18  flushes  only  uses  81  gallons, which  is  only  three  gallons  more than  you  use  at  20  flushes. This  is  where  it  becomes  important to  have  this  visual  here because  it  might  be  the  case that  the  time  we  save  in  operations by  reducing  flushes  from  20  to  18 is  actually  worth  the  three  gallons we  lose  in  total  solvent  used. All  right,  so  with  the  sliders, now  we  can  start  looking  at  the  sliders, with  the  sliders, you  can  see  how  these  variables affect  the  possible  cleaning  schemes. The  sliders  here  are  in  the  range  of, maybe  not  as  possible, but  what's  reasonable  for  this  model. So  you  can  see. You're  looking at  how  these  red  dots  change with  relationship  to  the  Y-axis. You're  looking at  both  the  steepness  of  this  curve because  that'll  show  you how  important  it  is  to  maximize your  flushes  within  these  bounds. You're  also  looking  at  the  overall  level. You  want  to  reduce  total  solvent. The  lower  down  your  red  dots  are, the  better  off  you're  going  to  be. By  manipulating  these, I  can  show  how  much  it  matters  to  start at  a  lower  or  higher  concentration. You  can  see  that,  yes, the  red  dots  are  moving  a  little  bit, but  it  takes  quite  a  while before  they  have  an  effect on  the  height  of  those  red  dots in  the  lower  graph. Xₙ ,  you  can  see  how  much  it  matters to  have  a  goal of  residual  solvent  in  there. If  your  goal  is a  higher  amount  of  solvent, if  you  have  more  buffer to  reach  your  goal, then  you're  going  to  need  less  solvent. Whereas  if  you  have  higher  criteria for  residual  solvent,  so  a  lower  goal, you're  going  to  use quite  a  bit  of  solvent. You  can  see  here  you  have  fewer and  fewer  cleaning  schemes  that  are  viable when  you  have that  lower  goal  of  residual  solvent. Let  me  just  back  to  10. Then  with  heel, you  can  tink  around  this and  see  how  much  the  heel  of  a  system affects  how  much  solvent and  what  cleaning  schemes  are  possible. A  greater  heel  means  you  have  to  use more  and  more  solvent to  achieve  your  goal  criteria. A  smaller  heel  mean  you  need  to  use  less  and  less  solvent. Beyond  just  selecting  a  cleaning  scheme, this  visual  model  allowed  our  team to  look  for  what  we  should  focus  on if  we're  trying  to  reduce the  overall  amount  of  solvent when  we're  cleaning. We  did  that  by  looking  at if  we  had  some  options to  change  our  variables of   X₀ ,  Xₙ,  and  heel, which  one  would  give  us the  most  bang  for  our  buck? Which  one  could  we  change with  reasonable   effort  on  our  part to  get  the  biggest  reduction in  total  solvent  used  per  cleaning? Here,  I'll  say  we  had  three  options. I'll  say  that  we  had  an  option to  reduce  the  starting  value  by  50%. Say  we  could  arrange  our  batch  schedule. Cleanings  would  only  start at  50%  of  what  they  originally  started  at. We  could  change  our  goal from  10  ppm  to  50  ppm, so  multiply  it  by  five. 
That's because we had different sensitivity techniques or different restrictions for material following a certain batch. Then we could also reduce our heel by one gallon through engineering improvements to the actual manufacturing systems. Really quickly, using this graph, we'll see what happens to the overall volume of cleaning solvent needed when you change from this standard set of input variables and keep the number of flushes at 20. Again, our standard level is 78 gallons. If you change from 100% of X in your system at the beginning to only 50%, so 500,000 ppm, and I set the variable values there, we're now only using 72 gallons to clean out. But that's only a reduction of six gallons for cleaning. That really didn't move the needle very much. Next, we'll look at changing our Xₙ to 50 and then setting our variable values again. That says that we will need 64 gallons of solvent to clean at the end. That's a better improvement for the total solvent needed. Finally, we'll look at reducing the heel in the system from five gallons to four gallons and see that the total solvent needed is 62 gallons here. That's the greatest reduction in solvent needed to clean. With this information, and it was similar information that we used to actually pursue this option, we did look at ways to reduce the heel in our system to reduce the overall solvent needed, because we saw that variable is going to affect the total solvent used the most, given the amount that we can change any of these variables. That is reflected here. These were the three options I just went over. You can see that with option A, the reduction in the starting level, you are only gaining yourself six gallons. Option B, the increase of the end level, gains you 14 gallons, and option C was 16 gallons. But really, looking at these values, again, it's important that a team is involved in this, because that's only two gallons different between option B and option C. If option B were very easy compared to option C, we probably would have gone with option B. But a heel reduction of one gallon would be most effective of these three options at reducing total volume for cleaning. Some concluding remarks I have. Basically, I just want to circle back to what I brought up originally. It could be that mathematically-minded people would look at the equation on the left and automatically know that changing the heel is going to have the greatest effect on the total solvent used. That's not instinctive to me, to know that just looking at this equation. I know that for the people I work with, that's not instinctive to them either. In the case where I'm in a team trying to come to a conclusion with consensus of all of us, it's very helpful to have a visual model in JMP where people can manipulate the variables themselves and see how a variable like total solvent used is changed by changing variables that we know the meaning of. I can see many applications of using a visual model with slider variables, even in my company.
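A minimal sketch of that kind of slider-plus-button control panel follows. The names (the window, the result text box, and the variable ranges) are hypothetical; the actual presentation ties the sliders to table variables that feed the Graph Builder, which is not reproduced here, and the same serial-dilution equation as above is assumed.

```jsl
// Sketch of a slider-driven control panel: sliders for X0, Xn, and heel,
// plus a button that recomputes total solvent at the maximum of 20 flushes.
x0 = 1000000; xn = 10; heel = 5;  // starting values (assumptions)

nw = New Window( "Cleaning Controls",
	Panel Box( "Inputs",
		Text Box( "Starting X0 (ppm)" ),
		Slider Box( 100000, 1000000, x0, x0 = Round( x0 ) ),
		Text Box( "Goal Xn (ppm)" ),
		Slider Box( 1, 100, xn, xn = Round( xn, 1 ) ),
		Text Box( "Heel (gal)" ),
		Slider Box( 1, 10, heel, heel = Round( heel, 1 ) )
	),
	Button Box( "Set Variable Values",
		// recompute gallons per flush and total solvent at 20 flushes
		v = Round( heel * ((x0 / xn) ^ (1 / 20) - 1), 1 );
		resultBox << Set Text(
			"20 flushes at " || Char( v ) || " gal/flush = " ||
			Char( 20 * v ) || " gal total"
		);
	),
	resultBox = Text Box( "" )
);
```

The design point is simply that the people in the room drag controls they understand (starting level, goal, heel) and watch a quantity they care about (total solvent) respond, rather than reading the behavior off the algebra.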
Any  multivariable  process where  the  relationship between  those  variables  is  known has  the  potential  to  make  use  of  a  model with  the  slider  variables to  give  a  visual  model  of  how  the  process reacts  to  changes  in  those  variables. I'm  thinking  of  there's  predictable  ways that  fluid  temperature  affects how  hard  a  pump  has  to  push  fluid to  reach  a  certain  flow  rate. That  could  be  helpful in  selecting  filters  for  a  given  process. I  believe  it  could  be  useful for  many  other  applications  as  well. Ease  of  generation. I  am  not  a  coder  and  I  was  able to  build  this  model  relatively  easily with  help  from  Jordan  Hiller, oh,  shout  out  Jordan  Hiller,  thank  you, also  from  the  scripting  index, which  had  many  great  examples, and  the  JMP  user  community. Because  I  was  able  to  look through  historical  questions and  piece  together  how  I  wanted to  go  about  a  certain  problem. It  was  relatively  simple to  make  this  model. Having  made  this  model, it  would  be  very  easy  to  manipulate in  the  future  for  other  purposes. Room  for  improvement. Again,  there  may  be  a  place  that  I  missed where  it  was  possible  to  do  this, but  in  my  exploration, I  couldn't  find  a  way to  make  a  reference  line  that  connected, basically  intersected  the  graph at  this  point and  moved  along with  the  graph  to  different  points. A  reference  line  that  is  a  function. I  wasn't  able  to  do  that. I  know  I  saw  different  examples  of  it, but  I  wasn't  able  to  do  it with  using  table  variables as  part  of  that  function. But  again,  that's  probably due  to  my  not  being  a  coder, but  it  would  be  helpful if  that  were  a  simpler, more  accessible  feature in  modeling  in  JMP. But  really, I  do  think  that  the  model  I  created, it  was  very  helpful  for  our  purposes, and  it  will  be  helpful for  other  applications  in  the  future. I  hope  it's  helpful  to  other  people. Thank  you.
There are many times when an exact measurement is not possible, but a range of values for the measurement is available. Censored data analysis methods enable you to incorporate the information from both types of measurements. This presentation provides an introduction to censored data situations: when they appear, how to handle them, and what happens when you do not handle them appropriately. This presentation includes examples of censoring in univariate and regression settings by using the Life Distribution and Generalized Regression platforms in JMP and JMP Pro 17, including the new Detection Limits column property. Hi, my name is Michael Crotty. I'm a statistical writer with the Stat Documentation Team at JMP, and today, I'm going to talk about an introduction to censored data analysis in JMP and JMP Pro. To start, we've got three common types of censoring. Just to back up a bit, censored data occur when you don't have an exact measurement for an observation, but you do know a range for the observation, so you know not the exact value, but you do know something about where the value might be. What we want to do by using censoring in our analyses is to use that information that we have, even if it's not exact. The three types of censoring that we'll talk about today are right censoring, left censoring, and interval censoring. Right censoring is probably the most common form of censoring. It occurs when the event of interest just doesn't have time to occur yet by the end of the study. In a reliability test, you might have a bunch of light bulbs under test, and at the end of the test period, some of them have failed. Those are exact observations, but then some haven't failed yet. You know they're going to fail, but your study has ended, so it's censored at that point. Same thing in survival models where a patient survives to the end of the study. One thing to note is that right censoring is the only type that, in JMP, supports a single response column alongside a binary censor column. The next type is left censoring. That's where the event of interest occurs before the observation starts. A common example of that would be where you put a bunch of units under test and at the time that you do the first inspection, some of them have already failed. You know that they started without a failure, but by the time you measured them, you checked on them, they had failed. So they failed sometime before that point. Another example of that is limited detection, where you have a measurement tool that can't measure below a certain threshold. The last type we'll talk about today is interval censoring. This is where your event of interest happens between observation times. If you have a periodic inspection schedule instead of continuous observation, you might see that something fails or something happens between time two and three. It didn't happen at time two and it didn't happen at time three, but it was somewhere in that interval. Take a quick look at what this looks like in JMP. Here's an example of the right censoring with a response column and a censor column.
In the platforms that support censoring, you always see this censor role; that's for that binary censoring column. This is the way that you can specify censoring more generally, which is with two response columns. Basically, it's like a start time and an end time. For left censoring, we don't know when it happened, so the start time is missing, but the end time, we know it happened before time 50, so somewhere before that. Reverse that for right censoring: we know that at time 25, it hadn't happened yet, but it happened sometime after that. Then with interval, both the start and endpoints are non-missing, but we don't know when the event happened, in this case between 80 and 150. It's not shown in the table up here, but down here, we've also got exact censoring. To specify that, you just use the same value in both columns. That means essentially it's like an interval with zero width. It happened at that exact time. Next, we're going to talk about two examples of censoring. The first is if you have censoring in your data, but maybe you don't know how to handle it, and so you just think, "I'll just ignore it." We're going to look at what can possibly happen when you do that. In this example, we've got simulated data from a lognormal distribution, and the observed data that we'll use for analysis in our different cases is where all the values from the true data that are over 1,900, we set them to 1,900, as that's the censoring time, since it's right censoring. There are a few possible things you could do if you're trying to estimate this mean failure time. You could do nothing. You could just use this observed data with a whole bunch of values set to 1,900, act like that's when it happened. You could treat those as missing values, just drop them from your data, or you could use the censoring information that you have in your analysis. For right censoring, these first two approaches are going to tend to underestimate the mean failure time because you're dropping information from the data at that far end. Looking more closely at this, because this is simulated data, we have the true distribution here in this first column. That's just for comparison, but in general, you wouldn't have that, because you'd just have all the values above 1,900. You don't know where these fall. In our observed Y, this is where we just use all the 1,900s as values of 1,900. We have no missing values, but a big point mass at the top of our distribution here. You can see that the mean is a lot smaller than the true mean. In this missing Y column, this is where instead of treating them as 1,900, we drop them. We set them to missing and analyze the distribution without them. Here you can see that now our maximum of the non-missing values is less than 1,900, which really doesn't make any sense because we know that a bunch of them, 21 observations, in fact, are some value greater than 1,900. So this underestimates the mean even more. Then on the right here, we've got an analysis in Life Distribution in JMP. This is where we're using the observed Y column.
It's got those 1,900s, but we're also using a censoring column alongside it. For the rows where observed Y is 1,900, our censor column is going to say that it's a censored observation. Here we can see that our mean actually ends up being a little higher than the true mean, but our lognormal parameter estimates are much closer to the true values and we're incorporating all the information that we have. For our next example, we're going to look at detection limits. This is a limit of detection problem where we have data on the yield of a pesticide called Metacrate that's based on levels of some other regression variables. In this situation, the measurement system that we have has a lower limit of detection where it can't measure any yields that are less than 1%. So in the data, they're just coded as zeros, but it really just means it's some yield below 1%. There are two ways you could analyze this incorporating that information in JMP. The first, you could treat it as left censoring, using two response columns, where the left column has a missing value and the right column would be a one, or you can use the Detection Limits column property that's new in JMP and JMP Pro 17. We'll take a look at this. Here's a subset of the data. This Metacrate Reading column is the same as the original reading column, but it's got a Detection Limits column property. Because this is a lower detection limit, where we can't measure any lower than that limit, we're going to set the lower detection limit to one. The other way you could do this is with the two columns. In this case, we know that it's left censoring, so the left side is missing and the upper side of that is one, which just means that the value is somewhere less than one. That's all we know. But as you can see from the column information window down here, the Detection Limits column property is recognized by the Distribution and Generalized Regression platforms. So this is a regression problem. We'll use Generalized Regression in JMP Pro. Here we fit a lognormal response distribution, and it's able to do that on this Metacrate Reading column, even with the zeros in there, because GenReg's not treating those observations as zeros, it's treating them as values censored at one. Now, we were able to use all the information and get a regression model. In conclusion, probably the most important thing is when you have censoring information, it's better to use it in your analysis than to ignore it. Censoring can occur a lot of times for time responses, but it can also occur for other responses. A good example of that is these limited detection problems. Finally, you can use the following approaches to specify censoring in JMP. There's the two-column approach that's probably the most flexible because that allows you to do right censoring, left censoring, interval censoring, as well as a mix of all three of those. For right censoring, you can use the one column response paired with a binary indicator column for censoring.
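Below is a small sketch, with made-up values, of the two specification styles just summarized: a single response plus a binary censor column for right censoring, and the more general two-column (start, end) coding for left, right, interval, and exact observations. Column names, data values, and the Censor Code setting are illustrative assumptions.

```jsl
// Right censoring: single response plus a 0/1 censor column (1 = censored).
dt1 = New Table( "Right Censored",
	New Column( "Time",   Numeric, Set Values( [120, 340, 1900, 1900, 870] ) ),
	New Column( "Censor", Numeric, Set Values( [0, 0, 1, 1, 0] ) )
);
dt1 << Life Distribution( Y( :Time ), Censor( :Censor ), Censor Code( 1 ) );

// General two-column coding: missing start = left censored, missing end =
// right censored, different values = interval censored, equal values = exact.
dt2 = New Table( "Two Column Censoring",
	New Column( "Start", Numeric, Set Values( [.,  25,  80, 100] ) ),
	New Column( "End",   Numeric, Set Values( [50,  ., 150, 100] ) )
);
dt2 << Life Distribution( Y( :Start, :End ) );
```

The two-column table mirrors the examples in the text: an event known only to be before time 50, one known only to be after time 25, one somewhere between 80 and 150, and an exact observation coded with the same value in both columns.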
There's also this new column property in JMP for detection limits where you can set a limit of detection either on the low side or the high side. We've got a few references here if you're interested in more information. One of those is a Discovery talk I did in 2017 that's got more of the background of how the censoring information is used in the calculations of these analyses. That's it. Thank you.
This presentation is an extension of the talk, "Measurement Systems Analysis for Curve Data Using Functional Random Effects Models," presented at JMP Discovery Europe 2023. Here, a functional random effects model was used to perform a Functional Gauge R&R analysis on data that contained a set of curves as the response. In this application, the functional model was expanded using the eigenfunctions and then was expressed as a random effects model, where variance components were estimated using standard methods. This analysis was done using the Functional Data Explorer and Fit Mixed platforms.   In the updated version of this presentation, I show that it is possible to include fixed effects in this type of analysis using the same model expansion approach. The functional model is still expanded using the eigenfunctions but is expressed as a generalized mixed model instead.       Hi,  my  name  is  Colleen  McKendry, and  I  am a  senior  statistical  writer  at  JMP, but  I  also  like  to  play around  with  functional  data. This  project  is  on  measurement systems  analysis  for  curve  data. First,  I'm  just  going  to  give  a  very  brief background  on  MSA  studies  in  general. MSA  studies  determine  how  well  a  process can  be  measured  prior to  studying  the  process  itself. It  answers  the  question, how  much  measurement  variation  is contributing  to  the  overall process  variation. Specifically,  the  Gage  R&R  method,  which is  what  I'll  be  using  in  my  analysis, determines  how  much  variation  is  due to  operation  variation versus  measurement  variation. You  can  use  a  Gage  R&R  crossed  MSA  model when  you  have  both  a  part  and  an  operator. The  model  you  can  see  here  for  your measurement  Y  sub  I J K, that's  going  to  be  the  Kth  measurement made  by  the  Jth  operator  on  the  Ith  part. In  this  model,  you  have  a  mean  term, a  random  effect  that  corresponds to  the  part, a  random  effect  that  corresponds to  the  operator, and  a  random  effect  that  corresponds to  the  interaction  or  cross  term. You  also  have  an  error  term. This  is  simply  a  random  effects  model, and  all  of  these  random  effects  are normally  distributed  random  variables with  mean  zero  and  some corresponding  variance  component. When  you  fit  this  model, you  can  use  that  to  estimate the  variance  components and  then  use  those  variance component  estimates to  calculate  the  percentage gage  R&R  using  the  formula  shown  there. In  a  standard  MSA  study, all  of  your  measurements  are going  to  be  single  points. But  what  happens  if  that's  not  the  case? What  if  instead  you're  measuring something  like  a  curve? That  question  was  the  motivation behind  this  project. There  was  a  client  of  JMP  that  was a  supplier  of  automotive  parts, and  they  had  a  customer  that  specified that  a  part  needed  to  have a  specific  force  by  distance  curve. Obviously,  the  client  wanted  to  design their  product  to  match the  customer  specified  curve. In  order  to  do  that, they  wanted  to  run a  functional  response  DOE  analysis and  JMP  to  design their  product  in  order  to  do  so. However,  before  spending  money on  that  experiment, they  wanted  to  perform  an  MSA  on  their ability  to  measure  the  parts  force. 
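For reference, the crossed Gage R&R model described above can be written out as follows. The slide's exact %Gauge R&R formula is not reproduced in this text, so the conventional definition is shown and should be read as the standard form rather than a quotation of the slide:

```latex
Y_{ijk} = \mu + P_i + O_j + (PO)_{ij} + \varepsilon_{ijk},
\qquad
P_i \sim N(0,\sigma^2_{P}),\;
O_j \sim N(0,\sigma^2_{O}),\;
(PO)_{ij} \sim N(0,\sigma^2_{PO}),\;
\varepsilon_{ijk} \sim N(0,\sigma^2_{E}),
```

```latex
\%\,\text{Gauge R\&R}
= 100 \times
\sqrt{\dfrac{\sigma^2_{O} + \sigma^2_{PO} + \sigma^2_{E}}
            {\sigma^2_{P} + \sigma^2_{O} + \sigma^2_{PO} + \sigma^2_{E}}},
```

where the numerator collects the measurement-system (repeatability and reproducibility) variance components and the denominator is the total variance.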
There are a lot more details about the actual data and this problem specifically in an earlier 2020 white paper titled Measurement Systems Analysis for Curve Data. If you want any more details, look that up. It should be on the community. In this graph, you can see what the data looks like. On the Y-axis, we have force, and on the X-axis, we have distance. It looks like there are only 10 curves in this graph, but there are actually 250 total curves. There's just some clustering going on. There are 10 different parts, five different operators, and five replications per part-operator combination. A little bit about this data: obviously, these measurements are curves and not points. The data was collected evenly spaced in time, but not evenly spaced in distance. There were some earlier projects that tried a few different ways to perform some type of MSA study on this data. They used some functional components, but stayed pretty true to a standard MSA. When I looked at this data, I wanted to take a true functional approach because I have a background in functional data. Functional data analysis is useful for data that are in the form of functions or curves. There are many techniques to handle unequally spaced data, a lot of which are available in the Functional Data Explorer platform in JMP. My goal was to combine functional data methods with traditional MSA methods to perform some type of functional measurement systems analysis. My solution was to create a functional random effects model by expanding the functional model using eigenfunction expansion, rewriting that as a random effects model, or a mixed model if you had any fixed effects also, and then estimating the variance components associated with the part and operator terms. To go a little bit into the model notation: for your functional model, you have Y sub IJK, but this time at a particular distance, D, to account for the functional nature of the data. You're going to have a functional mean term, a functional random effect that corresponds to the part, a functional random effect that corresponds to the operator, and a functional random effect that corresponds to the cross term, and also your error term. Here, when you do the model expansion, it's a little mathy, but essentially, instead of having one variance component associated with the part and one variance component associated with the operator, you now have multiple variance components associated with each of those things. That's going to account for the functional nature. When you're fitting the model and estimating the variance components, like I said, now you're going to have this set of variance components that you can sum together to estimate the functional variance component for part, and the same thing for operator and the cross term. Once you have all those individual variance components, you can use those to estimate the % gage R&R just like in a standard MSA. How do I do this in JMP? It's a multi-step process that's outlined here, and there are some more details in other slides.
But essentially, I estimate the mean curve in FDE and obtain the residual curves. I then model the residual curves in FDE to obtain the eigenfunctions needed for the eigenfunction expansion of the functional model and save those eigenfunctions to the original data table. I'm going to use those saved eigenfunctions in Fit Mixed to create a random effects model, or a mixed model if you also have fixed effects in your data. I'm going to use nesting of the eigenfunction formula columns and also the part and operator variables to define the appropriate model specifications. This is what your Fit Model window would look like. Once I did all that for this data, I was able to estimate the variance components and calculate the % gage R&R, which in this case was 3.3030. This indicated an acceptable measurement system according to some ranges that were defined in a paper by Barrentine. That was it for the data analysis for my part. This result was actually very similar to a worst-case scenario that was obtained in a presentation in 2019. It would be interesting to know if that was a coincidence or if the results would be similar for different data as well. Some thoughts that this project provoked: should we add a functional random effect for ID to capture the within-function correlation across distance? This type of functional random effect is actually really important in functional data and is a big benefit of accounting for the functional nature of the data. Unfortunately, in this data in particular, anytime I created a model with this term, the corresponding variance components were zero, so it didn't really capture anything extra, but it would be interesting to see if it could be useful in different types of data. I also think it would be interesting if we could calculate a confidence interval for the % gage R&R. There were also some minor issues, or rather questions, that came up about the residuals in the random effects model. I observed a cyclical nature in those. That's not always great. I don't think it was a huge deal, but I would like to have a good reason for why that was the case. That's it. Thanks for listening. If you want more details on this project, it's very similar to a full 30-minute talk that I presented at Discovery Europe, and so that video is on the community as well. Thank you.
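One common way to write the functional model and the eigenfunction expansion sketched in this talk is shown below; the notation, the score symbols, and the number of eigenfunctions per term are illustrative rather than taken from the slides:

```latex
Y_{ijk}(d) = \mu(d) + P_i(d) + O_j(d) + (PO)_{ij}(d) + \varepsilon_{ijk}(d),
```

with each functional random effect expanded on its estimated eigenfunctions, for example

```latex
P_i(d) \approx \sum_{m=1}^{M_P} p_{im}\,\phi^{P}_{m}(d),
\qquad p_{im} \sim N\!\left(0, \sigma^2_{P,m}\right),
```

so the functional variance component for part is the sum \(\sigma^2_{P} = \sum_{m} \sigma^2_{P,m}\), and similarly for operator and the cross term. Those summed components then go into the same %Gauge R&R ratio as in the standard scalar MSA.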
Now that JMP 17 has introduced the Workflow Builder, it's time to discover more details of the UI. Your workflow may encounter an error and the workflow stops. How do you do an in-depth analysis of the step settings to figure out what went wrong? This talk offers you a closer look at how to diagnose problems, work through JSL errors, and add custom JSL to optimize your workflow. Are you aware of the red triangle menu options that will allow your workflows to be more effective? We share helpful tips that will enable you to take full advantage of workflow features. Becoming proficient using Workflow Builder will help you create smarter and more efficient JMP workflows. Hello. My name is Mandy Chambers, and I'm a Principal Test Engineer in the JMP development group. I want to talk to you today about the wonderful ways of workflows. It's a deep dive into the Workflow Builder. For those of you that maybe haven't had a chance to see Workflow Builder, in order to activate it, you go to the File menu in JMP and you see File, New, New Workflow, and the Workflow will pop up. As you see on your left, you open data or import data to begin your recording, and then it begins recording your steps as you're working in JMP. There are a lot of buttons, a lot of images, mini check marks, and more. Workflow Builder has a lot of built-in functionality. I created a QR code up here that you can scan. I did a talk previously in the spring that's a little bit more introductory about Workflow Builder. If you scan that QR code, you can go back and pull up that talk and get all the beginning things you need for Workflow Builder to get started. But today, we're going to talk a little bit more about the UI and take a deeper dive into Workflow Builder. Now this PowerPoint presentation is put together so that you can review it. I have built a journal that's part of the demo that has a lot of workflows included with it, where you can get those, edit them, and try some things out. But for today, I'm just doing a short presentation, so I'm going to flip through this a little bit quickly. The red triangle menu has a lot of functionality, and you can see that in the image on the left. The right-click menu for the workflow steps inside of the Workflow Builder also has commands in there. This is not a total list of all the red triangle menu options, but some of them, some of my favorites. You can read those on your own. Then I also have the workflow steps right-click menu options in here. Then when you open up the right-hand side of the Workflow Builder, you're taken to the Step settings. There's an Add Action button and there are several commands listed on there. I am going to speak briefly to a couple of those today so you can use those to enhance your workflows. It's one thing to run a Workflow Builder and have it succeed. You see the green check marks that run down the page, but it's quite another thing to run it and have a red X jump out at you. We've had questions about how to diagnose those errors. I included a couple of workflows for you to try that out and I gave you instructions.
But this one here is just showing the red X on the open. If you hover over that little red X, it will sometimes tell you. You can see this says it can't open the table. You can see I typed in the wrong directory, so it's looking for something that it can't find. If you correct that, nine times out of 10, it will run. Another example here is one that failed. Someone had done a dashboard and the table was substituted as table 1 instead of being the name of the table. JMP was a little lost. If you fix that, you can also resolve that error. I'm going to show this one today. This is a mental health workflow with some COVID-19 data. It's a little less about the data and a little more about pulling it in, cleaning it up, and doing a few things here. But in this particular example, I touch on a number of things. I create a workflow package. I hide the tables. It's a big table, so I use a subset data table. I create some of my own JSL. Then I do put it in a dashboard at the end because people have asked about that, and that does work. I'll show you that one today. I have another workflow in here that is what I call a stop-and-wait for an action to happen. This was done with a SQL query, and I'm using a breakpoint in this example. There are other ways to cause your workflow to stop and wait, using Show Message and things. But today I'm just showing you this one about how to use a breakpoint. Then last, but not least, people are always wanting to publish their reports. Does it work with JMP Live? Well, you can publish reports if you have JMP Live. It does not publish the workflow. It just publishes the report. But I do have an example of that. Let's exit out of this and let's go to the demo quickly. This is the journal. I'm going to go through the steps real quick at the beginning. I tried to be clear, but it tells you here that if you copy the zip file for the presentation into a single folder, that will give you the best results: you have the journal and you have all the workflows that go with the journal. Then I've got a note here, and I also embedded a note in every single example, that says close the workflows without saving them so you can preserve the examples. You're going to open things up and make some changes yourself. If you want to save those changes, I just suggest going up to the File menu and saying File, Save As, and giving it a different name. Now, if you happen to write over it, you can obviously just grab them again, but I just thought I would give you that hint. Real quickly, I'm not going to spend a lot of time again on the introduction, but as I said, you can go back, and here is a link to that talk as well as the QR code if you want to scan and go back to the more introductory level of workflows. But just to show you where it is, if you go up to the File menu and go here, say New Workflow, it opens up just like that. I've also added in here, again, the how-tos, some definitions, and shortcuts for things; you can peruse that on your own. This is the first example I want to show you.
Again, I'm not going to do every single one of these steps. I'm going to pop open this workflow. But this one covers re-recording a step. Then it also talks about, at the end, a little shortcut for creating a column. Let's run this real quick. I use the diabetes table and I just ran a Graph Builder. That's what I want to show you how to do, and then I did create an extra column here at the end that was just a new formula column. I want to close this up and I want to demonstrate to you one of the features of the right-click menu. The arrow here executes the entire workflow, but this arrow will step through it. If I click one time, I only go one step, and now I've stopped on this Graph Builder. What I want to do is I want to re-record that step. If I right-click while I'm sitting on that and I go down here and say, re-record steps, it changes the look of this. You can see right here it's grayed out and it says done, cancel. It's waiting for me to do something. I'm going to go up to Graph Builder, and I'm going to go and grab a couple of columns. It really wouldn't matter. I'm twisting this graph around a different way. You go up here and you click your done step and then it's waiting still. You can see here it says launch platform and then it's still sitting on this report that it didn't run yet, which was the one I had in there. I need to hit this Done button to stop the recording and then it's going to replace this on top of the launch. I'm going to hit Done and you can see that's now my new Graph Builder and it's been put into the workflow and now it's waiting on the next step to execute, which is my new column. That's just an easy, quick way to execute that step. I did want to show you that inside of here I made some notes on things. That's actually showing the image of the table. But in here there's some notes on the column, like how to change the columns to create that formula column. If you look inside of the Workflow in the step settings, you'll be able to find those things and follow along, I think, and I'm not going to save that. Hopefully, that will be helpful to you. The Workflow Builder 2 example is this dashboard that I created with the mental health data. Again, I stepped through some of these images here, telling you what to do. What I want to do now is just show you some of these things. This is a big data table, and sometimes when you're dealing with bigger data, it can make your workflows go a little slower. When you're building the workflow, there is a little option up here called Allow Backsteps. By default, it's always on because you're building the workflow. As you step through the workflow, you may want to go back, you may want to redo something. It's a good idea to keep that on. But when you get your workflow completely built, sometimes when you uncheck that, and I have a note right here, it just enhances the performance a little bit. For this workflow to run a little quicker, I have unchecked it. That's just a little tip for you.
The other thing that I'm doing here is I didn't care if I really ran this with all the data or some of it. I used a couple of the buttons here. I went to Add Action and I said, I want to create a subset. Then I'm also hiding my tables because I didn't want the tables all over the desktop. You can add that action and it comes in here. What you have is you have a chance to subset the data. Now we have some things built in. You can use all the data, 50%, 25, whatever. For this example, I'm using 25%. Then we even ask you, "What do you want to do with that source table?" I said I want to close it. Then I'm only hiding the subset table. That's a nice little feature as well. Then as I was creating this, as I said before, we've had questions about dashboards. I created three graphs and then I threw them into a dashboard by using Combine Windows. I'm also hiding the table here because when you run the dashboard, it opens up another copy of the table. On these workflows, though, I didn't want the Graph Builders to be on the desktop. The graphs were created, and then I just went in and added some JSL to name each one of them. This one's named Graph Builder 1, this is Graph Builder 2, and this is Graph Builder 3. I went up to the red triangle at the end and said add a custom action. That pops up here. Then what I did is I went in and typed my own JSL to say close those windows. Graph Builder 1, 2, 3, close those windows. I don't want them left when the workflow is run. Let me run this real quick and just show you that it's going to run. It's going to create my dashboard. There's my dashboard and you can see there's nothing else on the desktop. The table is hidden. You can see that down here in my JMP home window. But now you have one clean report. You don't have a lot of stuff around that you have to close and move and all of that. That's nice. Then one more part of this that I added in here is that this workflow will work completely fine for you locally. The way it is, you can save it by just going to File, Save, saving it as a .jmpflow file. But if you want to give it to somebody, the best thing to do is to create a workflow package. That's what I've done. You can tell by the name of this, it's got an underscore PKG attached to it. When you do that, it packages the files. The files go with the workflow, and then you can share it with your coworkers. Give that a try and see how that works for you. Now, for these other examples, three is doing some generalized workflows. The steps are in there for that. Step four is working through the errors that I mentioned earlier. Then step six is my publishing to JMP Live, but I'm actually not going to do that one today. Then I briefly want to show you here this fifth one that's a stop-and-wait. Real quick, this is running a report. It's going to stop. This little dot here is a breakpoint. If you right-click, you can set the breakpoint here, and it's enabled. When it's enabled, it's black, which means it's active. If I uncheck that, it gets hollowed out.
What  that  means  now is  that  the  breakpoint is  still  sitting  there, but  it's  really  not  doing  anything. Then  if  I  go  over  here and  click  it  one  more  time,  it's  gone. The  other  nice  thing  about  this  feature is  I  can  check  it  here and  say,  set  a  breakpoint, or  I  can  go  over  here to  the  little  tray  on  the  left and  check  it, and  now  the  breakpoint is  activated  there  as  well. When  I  run  this just  so  you  can  see  what  it  does, it  runs  to  that  point. It's  waiting. It's  sitting  there. I can  look  at  my  analysis, I  can  look  at  my  distributions and  analyze  things, and  then  I  can  hit  the  arrow and  I  can  continue  going. Then  my  last  Graph  Builder  ran. It's  a  nice  way  to  run  something, do  some  checks,  run  some  more  stuff. You  can  set  multiple  breakpoints. There's  no  limit  to  that. You  can  do  it  while  you're  maybe trying  to  debug  something. It  works  very  nicely. That's  all  I  have  time  for  today. Thank  you for  letting  me  share  with  you. I  hope  that  you  get  a  hold  of  JMP  17 and  get  the  Workflow  Builder  going. If  you  have  any  questions, I'm  always  around  to  answer  those. Thanks  a  lot.
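The custom action described in the dashboard example is just a few lines of JSL. Here is a minimal sketch of that kind of step: it closes the three Graph Builder windows by name so only the combined dashboard stays on the desktop. The window titles come from the names given in the demo; wrapping each close in Try() so a missing window is skipped is an addition for robustness.

```jsl
// Custom action sketch: close the named Graph Builder windows after the
// dashboard is built, leaving one clean report on the desktop.
For( i = 1, i <= 3, i++,
	Try( Window( "Graph Builder " || Char( i ) ) << Close Window )
);
```

The same Add Custom Action slot in the step settings will run any JSL you paste into it, so this pattern extends to renaming, saving, or hiding whatever the recorded steps leave behind.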
Scientists in biopharma working along the CMC development pathway are challenged by the increasing complexity of biologic-based therapeutics and processes that produce them. Many critical responses exist (often 10-20) that are themselves functions of numerous, highly interactive input process factors. We use a large case study to show that current experimental design strategies combined with current response surface methods are generally inadequate to deal with the complex kinetic behaviors observed to be ever-changing across the design space. The case study consists of a 7-factor, hybrid experimental design used to develop a bioprocess with 13 critical responses. Employing a combination of SVEM, neural network models, and the hybrid experimental design, we show that accurate predictive models can be estimated for all responses that interpolate satisfactorily throughout the design space. Furthermore, we show that the powerful tools in JMP and the Prediction Profiler, specifically the Design Space Profiler, are essential to visualizing and understanding the experimental design space and optimizing the bioprocess. We also discuss the need for new optimal design strategies. JMP Pro 17 is used throughout the talk. I want to thank the JMP steering committee and the JMP organizers for inviting Phil and myself to come and present our exciting talk on CMC, SVEM, DOE, and Complexity: It's All About Prediction. I want to start by thanking Dr. Tiffany Rao; she's been involved with the planning and numerous conversations for the work that we're going to present today. I'm going to do an overview, tell you who Lundbeck is, who I work for, and then provide the background for the DOE that we're going to talk about, which is process development for a biologic drug. For our case study, and for what I've started to do for development, we start with the first step of doing a DSD for mid-to-late stage development, then follow that with a second step of augmenting with a space-filling design. Then we are hoping to prove to you today that for analysis, SVEM allows us to have better prediction for all of this work and allows us to have better timelines for the work that we're doing. Lundbeck is headquartered in Copenhagen, we're over 6,000 employees in over 50 countries, and we are striving to be the number one in brain health. The part of the company that I work with is the CMC biologics, and we're basically located in the Copenhagen area and in the Seattle area, where I'm located. Let's talk about the background for the DOE that we're going to present today. For the process that we want to develop for drug substance, for these biologics, we start with cell vials, we take those out of the freezer, we then expand in shake flasks, go bigger into culture bags, maybe a seed bioreactor, then to a production bioreactor. That production bioreactor goes approximately two weeks. We have complex nutrient feeds, we have pH control, temperature control, there's the base that we're adding. Once we finish that 14-day production, we need to figure out a way to get the cells that are secreting our molecule into the supernatant.
How do we separate the cells from the product? That harvest can be a centrifuge, it can be depth filtration. Then we pass it on to our downstream colleagues. They first usually do a capture step where they're getting rid of most of the host cell proteins, the host cell DNA. But then we need to do two polishing steps where we're then saying, "Okay, what are the product-related impurities? Maybe there's not the full molecule there, so we have to get rid of those." Then finally, we have to make sure, through ultrafiltration and diafiltration, that we can transfer into the buffer that it's going to be in when it is transferred for the patient's use, and that it's also at the right concentration. You can imagine, every step along this way, there are many factors, there are many knobs that we can turn to control this process, make sure that it's robust and we're making the same product every time. When we're focused on treating the patient, we also want to focus on the business. We can't put all of our development resources into every molecule. We want to right-size the research that we're doing at the right stage of the product. There are many things that could kill a product, but if we can develop this in the right time and the right space using these tools from JMP, we can shift this development timeline to the left and we can also reduce the amount of resources and the cost to the company. If we're first getting a molecule, that's when you're going to start looking at your categorical factors. We might be doing the cell line screening. We want to make sure that we have the right cell line that's going to last all the way through commercialization. For the downstream group, they may be looking at resins; for both upstream and downstream, looking at medias and buffer components and the formulations of those. That's when you're making sure that you have the right thing that's going to keep you going through your development pathway. But then once you're in the clinic, now you want to really start to gain understanding of the process parameters. Our strategy is to start with a definitive screening design, and we want to be bold in our level settings at this stage, and I'll talk a little bit more about that later, for the late stage development. Then we can build on what we learned from the Definitive Screening Designs by augmenting those designs with space-filling or other designs so that we really understand that design space. What's different in what we're hoping to show now, compared to traditional walks through this pathway, is that in the past, we've been throwing out the factors that we've said aren't important. But with modern designs and modern ways of doing analysis, we can keep all of the factors and all of the work that we've done so far and gain better understanding of the whole process, especially with biologics that are quite complex.
Before I pass the baton to Phil, I just wanted to talk about one more thing. Let's see if I can... I'm going to minimize this screen just for a minute so I can show you this. This is an experiment that I did to prove the power of DOE for my boss. The full data set was an OFAT for pH, and the response was titer. We wanted to do very many different levels in a wide range because he wasn't sure at the time that we were going to be able to pick what the optimized level was. But what I wanted to show him was that, "Okay, we did this experiment, we have all of this data. We were able to model where the optimized condition was," and that's shown in blue, and that turned out to be the correct case. When we tested the model, that was the optimized condition. Let's pretend now that we're starting over, and we don't know that data. If we had picked a conservative range setting for our experiment, our noise-to-signal would be quite high and so we would have missed finding the optimized spot. But if we had picked a wider range in our settings, and still with only three points, the model still would have chosen the optimized spot. What I'm going to challenge the subject matter experts with, when you're designing your DSDs, is to really be bold in your range setting. You will still find the optimized spot, and you have to have some knowledge of your process so that you can complete the design of experiments and have all of the runs at least have enough signal that you can measure and then subsequently model. Once you learn from your Definitive Screening Designs more about your design space, you can come back and then you can be internal to that space. That's when you augment with a space-filling design. Now I'm going to pass the baton to Phil and he's going to take you through the analysis. Okay, thank you. Thank you, Patty. We're going to talk about a very nice and somewhat complicated experiment that Patty and her team ran. They do a lot of great work and they're big advocates of DOE and JMP, and I'm very happy they let me get to play with them sometimes. It's fascinating work. But before I get into the actual analysis, I wanted to talk about a few relevant concepts that members of the audience may or may not be familiar with, and that includes complexity. It's a really hot topic out there. I'll talk about what prediction actually is. That is a muddled concept to many people. Then from there, I'll launch into talking about how we analyze prediction and how we did with Patty's experiment. Complexity: a fellow named Daniel Finelli from the London School of Economics has written much about this, and he calls it "the elephant in the room" that statistics and many, what he calls "metasciences," are ignoring, and they're ignoring it at their peril. I won't get into a lot of detail. You can look him up on the internet; he has a lot of videos and papers. But complexity is a huge problem. It is staring science and statistics and data science and machine learning in the face and it needs to be dealt with.
At present, we're not really dealing with it directly in statistics. By the way, there are now whole applied math programs based on studying complex systems. My bottom line is, complexity is real. Complexity requires new thinking. We really have to rethink DOE and analysis. You're going to see that for complex systems, and we also have to understand something else, systems theory 101: complex systems are defined by their interactive behavior. In point of fact, main effects are actually even misleading. You have to somehow be experimenting in a manner that you can capture this interactive behavior, and you're going to see current strategies fall short of that goal. Patty's already mentioned the CMC pathway. Nowhere is this problem of complexity more obvious than in bioprocesses. You have complex combinations of biology and chemistry, and interactions are everywhere. When I talk to scientists in biotechnology, they know right up front we're dealing with really complex interactive systems. But first, I need to point out prediction. If you're working in CMC development work, it's all about prediction. The ICH guidelines that are used by scientists in CMC development work don't specifically say prediction, but if you read what they say, it's all about prediction. Basically, you're building processes to manufacture biologics, and with the new cell and gene therapies, these processes are becoming hopelessly complicated. I personally rely heavily on the scientists to explain it to me, and they're the people who really make all the decisions. I'm the helper, and I'm very happy to be there as part of it. But it's all about prediction. That is not how many scientists, and even statisticians, have viewed CMC work. By the way, this applies to all areas of science. I'm focused with Patty on the CMC development pathway, but prediction is important. What is prediction? It's muddled. It's not clearly defined in disciplines. Here's what it really is and how I define it. It's a measure of how well models that you develop interpolate over a design region. In other words, we're going to fit a model to what we call a training set, and then we need some way of knowing how that model would apply over the whole design region. In CMC work, especially late stage, that is very important. You need to be able to do that, as many of you know. You really just have a training set to fit the model. That training set in no way can evaluate prediction. I know there's a common belief that you can evaluate prediction on training sets. You simply cannot. You must have a test set. Also, I'll talk a little bit about something I see in dealing with scientists, a lot of it in chemistry and biologics. Again, I do a lot of it in biotechnology, but also in other areas like battery technology and material science. It is becoming very obvious. The kinetics are complicated. They're constantly changing over design regions. The kinetic behavior that you see around the boundaries is often very different from what's happening on the interior.
Why does this matter? Well, the classic approach to response surface, even including optimal designs, relies upon what I call boundary designs. Almost all of your observations are around the boundaries of the design region. In point of fact, whether people want to hear it or not, the central composite design, commonly used in response surface, is about the worst design you could think of for prediction. The interior of the space is empty. If you fit these models on the boundary, and then you predict what's happening on the interior, it's not prediction, it's speculation. You don't know. You have no data. I'm going to show you in the case study, you're probably going to reach some wrong conclusions. The boundary regions, indeed, often behave very differently, and we have a need to reconsider our approach to designs. Another issue in response surface and statistics is this ubiquitous use of full quadratic models. They are not sufficient to model complex response surfaces. In fact, they're far from it. Unfortunately, I get a lot of pushback from statisticians who claim it is good enough. My answer is, "Well, if you actually use designs that had sufficient interior points, you'd quickly discover they don't fit well at all." Again, trying to measure prediction on the interior of a design region using boundary designs is futile. By the way, my good friend, the late John Cornell, and Doug Montgomery published a paper on this in 1998, and I'll be polite, they were ignored. It was actually somewhat nastier than ignored by the statistics community. They showed in the paper that full quadratic models are just not sufficient to cover a design region. Patty mentioned SVEM, self-validating ensemble modeling. It's an algorithm. I'm one of the co-developers with Dr. Chris Gotwalt of JMP, a person I hold in very high regard. I won't get into the algorithm, by the way; there are references at the end where you can go and learn more about it. It has been talked about at Discovery conferences actually, going all the way back to Frankfurt in 2017. But SVEM is an algorithm that allows you to apply machine learning methods. Machine learning methods are all about predictive modeling. Believe me, people in that field know a lot more than you may think about prediction, and SVEM lets you apply those methods to data from small sets like DOEs. I won't get into SVEM. It's a whole new way of thinking about building predictive models, and I think it's in its infancy, but it's already proving very powerful and useful in biotechnology. Let's get to the experiment. This is actually a hybrid experiment that Patty and her team created. There are seven factors and there are 13 responses. But due to time constraints, I'm only going to focus on four, and even that's going to be hard to get it all in. The data and the experiment are highly proprietary. I do thank Lundbeck and Patty for actually allowing us to use an anonymized version of this design.
Patty mentioned SVEM, self-validating ensemble modeling. It's an algorithm. I'm one of the co-developers, with Dr. Chris Gotwalt of JMP, a person I hold in very high regard. I won't get into the algorithm; there are references at the end where you can go and learn more about it. It has been talked about at Discovery conferences going all the way back to Frankfurt in 2017. But SVEM is an algorithm that allows you to apply machine learning methods, which are all about predictive modeling, to data from small sets like DOEs. Believe me, people in that field know a lot more than you may think about prediction. I won't get into SVEM in detail. It's a whole new way of thinking about building predictive models, and I think it's in its infancy, but it's already proving very powerful and useful in biotechnology. Let's get to the experiment. This is actually a hybrid experiment that Patty and her team created. There are seven factors and 13 responses, but due to time constraints, I'm only going to focus on four, and even that's going to be hard to get in. The data and the experiment are highly proprietary. I do thank Lundbeck and Patty for allowing us to use an anonymized version of this design. I have a lot of case studies, some of them similar to this, where the people who own the data wouldn't even let me discuss them anonymized, so that was very nice of them, and I think we have a really important story to tell here. This is a hybrid design. It's comprised of a 19-run Definitive Screening Design around the boundaries, plus 16 space-filling runs on the interior. There are center points in both parts of the design. How would we analyze this? What I want to do is discuss the strategies of analysis that are used, the algorithms that are used, and make comparisons to SVEM. I'll tell you in advance, SVEM is going to do very well. Then we'll talk about some of the issues with the models themselves and how we use them. Step one, I'm going to do what most people currently do: take the boundary points, the DSD runs, fit models, and then apply them to the space-filling runs as a test set and see how well my model interpolates. Step two, I'll reverse the process: fit models to the space-filling points, then use the DSD as a test set and see how well my model extrapolates a little bit to the boundaries. Step three is a common strategy used in machine learning: a holdback test set. I'm going to take the 35 runs and break them up, in a way that keeps them as equivalent as I could manage, into a training set containing both SFD and DSD points and a holdback test set that has a representation of both. Then finally, step four, what many people would automatically do: just fit models to the whole data set. In general, I don't recommend this because there's no way to test the model, although I will say up front that because we do have a lot of space-filling points on the interior, I'm more comfortable with it here than I am in practice. But these, I find, are the four basic strategies that would be used.
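As a rough illustration of the third strategy, a holdback column can be added to a JMP table with a few lines of JSL like the sketch below. This is a generic illustration, not the speaker's actual script: the column name and the 50/50 split fraction are assumptions, and in the talk the training and test sets were deliberately balanced across DSD and space-filling points rather than assigned purely at random.

```jsl
Names Default To Here( 1 );
dt = Current Data Table();

// Tag each run as Training or Test at random (fraction and name are illustrative)
dt << New Column( "Holdback Set", Character, Nominal,
	Formula( If( Random Uniform() < 0.5, "Training", "Test" ) )
);

// Freeze the random assignment so it does not re-draw later
dt << Run Formulas;
Column( dt, "Holdback Set" ) << Delete Formula;
```

The resulting column can then be used as a validation or grouping column when fitting and scoring the candidate models.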
How do I analyze it? Well, if you have a DSD, people like to use Fit Definitive Screening, so I'll look at it; it only applies to DSDs. Honestly, it's not really a predictive-modeling strategy, nor does anyone claim it is, but I find people seem to use it that way. I'll use Forward Selection with the AICc statistic; we'll do that in GenReg in JMP 17. Then we'll look at something they have in GenReg that's very nice: the SVEM algorithm, which I'm going to use with Forward Selection. Then I'm going to look at something people may not know about, a hidden gem in JMP: Model Averaging in the Stepwise platform. John Sall put it there many years ago, and I think he was being very insightful. Then we're going to talk about SVEM and Neural Networks. Basically, no software does this. I have worked with Predictum, some of you know Wayne Levin and Predictum, to develop an add-in to do it; it's currently the only software available that does this, and the SVEM add-in was used to fit the Neural Networks. I won't get into the add-in particularly; I'll just quickly show people where these things are. Then finally, I said the fourth strategy used the whole data set, because I get asked about this all the time, and there I just threw in some K-fold cross-validation to use with the SVEM methods and some of the other methods. Those are the methods we'll use, and for methods like Fit Definitive Screening, Forward Selection, and Model Averaging, we'll assume a full quadratic model, as that is the tradition. For the other methods, again, we're going to use a Neural Network, which is more flexible. There are four responses, and this is really important: I didn't randomly select them. They vary in complexity. I'll admit this is subjective; there is no internationally approved measure of complexity, and this ranking is based on the ability to model the responses. Again, there are 13 responses in total. Typically, in CMC pathway work, there are 10-20, maybe more, most of them critical quality attributes. They are important, and within this experiment they range from fairly low complexity to very high complexity, very difficult to model. Frankly, in those cases, Neural Networks are basically your only option. So pay attention to this, because complexity turns out to be very important in how you go about modeling. Then the question is, if I'm going to evaluate prediction, how do I do that? Remember, I prefer that prediction be evaluated on an independent test set with new settings of the factors; that's how we judge interpolation. Something called the Root Average Squared Error, or RASE score, is very common. This is the standard deviation of prediction error, and it's commonly used to judge how well you predict. Smaller is better, obviously, but there is a problem with it that we've uncovered, especially in simulations: models with low RASE scores often have substantial prediction bias in them. In prediction, there really is still a bias-variance trade-off. So how do we evaluate bias? Well, there's no agreed-upon approach to that either, but the easiest and most visual way is an actual-by-predicted plot on a test set. Ideally, if you were to fit a slope to the actual-by-predicted plot, and I'll show an example, the ideal prediction equation would have a slope of one and an intercept of zero. The farther the slope is from one, the greater the bias. For purposes of demonstration, I'm going to set a specification of 0.85-1.15 with a target of 1 for the slope. If you can stay within that range, then I'd say you probably have acceptable amounts of bias; in reality, that is more of a subject-matter issue.
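To summarize the two prediction criteria just described: for a test set of n runs with observed values y_i and predictions ŷ_i,

$$ \mathrm{RASE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} $$

which is why it is described as the standard deviation of prediction error; separately, the slope of the actual-by-predicted relationship on the test set is compared to the ideal value of 1, here with 0.85-1.15 treated as the acceptable range. Because RASE averages squared errors, a model can post a small RASE while still being systematically biased, which is the gap the slope diagnostic is meant to catch.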
Then finally, I said you can fit a slope to the actual-by-predicted plot, but there's an additional problem: the predictor is the predicted values, and they have a lot of error in them. So this is actually an errors-in-variables problem, which is not commonly recognized. But JMP 17 has a really nice solution. It's called the Passing-Bablok modeling algorithm, and it's well established, especially in biopharma. It fits a slope while taking into account errors in X, the predictor. So how does it work? Well, it fits a slope. If you look on the left, you'll see the slope is about 0.5. We have strong bias; there's a lot of prediction bias. What I really like about the implementation in JMP is that they give you the reference line: the dashed blue line is the ideal line, slope of one, intercept of zero. On the left, our predictive model is showing a lot of bias; it's systematically not predicting the response. On the right is a case where there's actually a small amount of bias that, in general, would be acceptable. By the way, these were picked as models that had relatively low overall RASE scores. These are called the Passing-Bablok slopes, and they are integral to how I evaluate prediction: the overall RASE and the slopes.
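JMP 17 supplies the Passing-Bablok fit directly in Fit Y by X, but to make the idea concrete, here is a simplified JSL sketch of the core of the estimator: the slope is essentially a median of all pairwise slopes between test-set points. The full Passing-Bablok procedure also excludes pairs with slope exactly -1, applies an offset to the median, and estimates an intercept; those refinements are omitted here, and the column names Predicted and Actual are hypothetical.

```jsl
Names Default To Here( 1 );
dt = Current Data Table();

// Hypothetical test-set columns: Predicted (x) and Actual (y)
x = dt:Predicted << Get As Matrix;
y = dt:Actual << Get As Matrix;
n = N Row( x );

// Collect all pairwise slopes (y[j] - y[i]) / (x[j] - x[i])
slopes = {};
For( i = 1, i <= n - 1, i++,
	For( j = i + 1, j <= n, j++,
		dx = x[j] - x[i];
		If( dx != 0,
			Insert Into( slopes, (y[j] - y[i]) / dx )
		);
	);
);

// Simplified estimate: the plain median of the pairwise slopes
m = N Items( slopes );
sv = Sort Ascending( Matrix( slopes ) );
slopeEst = If( Modulo( m, 2 ) == 1,
	sv[(m + 1) / 2],
	(sv[m / 2] + sv[m / 2 + 1]) / 2
);
Show( slopeEst ); // compare to the ideal slope of 1
```

For test sets of the size used here (on the order of 16 to 35 runs), the pairwise computation is trivial; the point is only to illustrate why a median-of-slopes estimator is robust to error in the predicted values.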
What I'm going to do at this point is actually go over to JMP, if you don't mind. I'll make a quick change on the screen here and make this as big as I can for everybody. Overall in this exercise, I fit close to 140 models, all individually, and evaluated them. Yes, it took quite a while, and I'm going to show a graphic to try to summarize the results for the different methods. I'm going to open a Graph Builder script and make it as big as I possibly can for everyone. I'm using some local data filters to define the display. Notice we have four training scenarios. I'll start with the one where the DSD is the training set: we fit models to the boundary and then evaluate how they predicted the space-filling design points. Y2 is the easy response. I expected all approaches to do well, and they did. Notice I set those spec limits of 0.85-1.15, and everything fell within that allowable region. Of the methods that did well, I particularly liked Model Averaging; it did pretty well. None of them had a slope of exactly one; the DSD points don't exactly predict what's going on in the space-filling design points, but they all did relatively well. Now we'll go to moderate complexity, and now you start to see some separation; it's getting harder to model the surface. Again, I'm using this interval of 0.85-1.15. I'm looking on the y-axis at the RASE score, the standard deviation of prediction, and on the x-axis at the slope. For Y1, using the DSD to predict the space-filling design points as the test set, the only models that really performed well were the Neural Networks with SVEM. By the way, in the labels, NN is Neural Network and H is the number of hidden nodes; we have models with varying numbers of hidden nodes, and I simply evaluated RASE scores and slopes. Then we go to more complexity. Y3 has high complexity; it is hard to model. The lowest RASE scores were for the methods you see on the lower right, but you can see there's substantial prediction bias. I felt that overall the best combination of low bias and low RASE score came from the Neural Networks, particularly one with 27 hidden nodes. Then finally, number four is high complexity. We fit the models to the DSD and applied them to the space-filling points. I didn't think any of the models did great; all of them showed some prediction bias. Maybe the best performance was a Neural Network with 12 hidden nodes. It had the lowest RASE score, but still, there were some issues with bias. So that's one strategy. Well, what if I do the opposite: fit the models to the space-filling points and then apply them to the boundary DSD points? Again, let's start with the easiest case. Y2 is a pretty simple response, and the SVEM method in GenReg, SVEM with Forward Selection, did very well. The next best, I thought, was a Neural Network with 10 hidden nodes. Remember, there's a little bit of extrapolation going on here. Then Y1, with moderate complexity: again, only the Neural Networks did well. As we go up in complexity, increasingly just the Neural Networks are working. You'll find similar results for the other approaches. I won't show all of them; they're covered in the notes. But the general conclusion is that when you use the boundary points as a test set, or you use the space-filling points as a test set and try to predict the other, the models are just not doing as well as they should. In other words, as I said earlier, between the boundary DSD points and the space-filling design points there are differences in kinetic behavior that we're not picking up, and the only way we're going to pick them up is to fit models over the whole design space. We did do that, by the way, and I should quickly show you: I used the whole data set, we fit models, and we actually did pretty well. I didn't show the Passing-Bablok slopes there. I will just quickly do a little more work in JMP for those who are interested. The Passing-Bablok slopes can be done in Fit Y by X. I will admit we wrote a script and added it to the Predictum add-in to do this in Fit Y by X, but you can easily do it yourself. Here, and I'll pick one of the cases, is the DSD data, and I'll pick Y1. How did we do fitting models? If you look in the menu, there's Passing-Bablok. I strongly suggest you look at it; a lot of regression problems are errors-in-variables problems. How did the methods do overall? I want to explain something else. The orange points are the DSD runs, the boundaries, and the blue points are the space-filling design points. Here I fit models to the DSD, and the Passing-Bablok slopes are being fit to the space-filling design points. Overall, the best performance was turned in by the DSDs. There's one of them here; it's Saywood 6. Another one that had… I forget what it was. Let me widen this out for you. Nineteen. Notice the slope is close to one, but you can clearly see there is some bias.
In other words, you can see an offset between the fitted slope and the ideal slope, the dashed blue line. This is pretty typical overall. I'll very quickly show you: if you have JMP Pro and you want to do SVEM using linear models, just go to Fit Model and hit Recall. This is a full quadratic model; you could do others. Go to GenReg, and then under Estimation Method there's SVEM Forward and SVEM Lasso. These work very well. From a lot of work with these methods, I still find SVEM Forward gives you the best results; the Lasso tends to give you a lot of biased results, on test sets in particular. If you're interested in Model Averaging and you have JMP standard, I'll just hit Recall again and go to the Stepwise platform. I won't run it; it would take too long, because Model Averaging uses best-subsets regression. It's time-consuming, but it's there. Again, for Neural Networks with SVEM, you have to have the Predictum add-in; there's a link to it if you're interested. At this point, I'm not going to do too much more analysis. You can go through and look at the various slopes for the various responses, and you can see that many of these methods resulted in highly biased slopes. In other words, the DSD points and the space-filling points are too different. We've really got to understand that we need to fit models over the entire design region. At this point, I'm going to finish up. By the way, there is enough material here that I basically have many talks combined into one. I apologize, but I think there's an important message here, and I'm just showing slides with the Passing-Bablok slopes. Then finally, I want to give you some final thoughts. I think we really need some new thinking in statistics. We don't have to throw out everything we've been doing; I'm not saying that. The most important point is that we are in the era of digital science. Digital chemistry, digital biology, and digital biotechnology are here; they're not tomorrow. We've got far more automation and lots of great pilot- and bench-scale devices, especially in biotechnology, that scale nicely, where we can do lots of experiments. The problem is complexity, and we need to think differently. Machine learning methods via SVEM are very important for fitting these complex systems. We need to get away from the response surface approaches that really haven't changed; maybe we've got computers and some new designs. I think DSDs are really very clever, and we have optimal designs, but they suffer from the fact that they're boundary designs, and people keep insisting on full quadratic models. That's a mistake, as I've tried to show briefly in this talk, and when you download the talk you can see how poorly these methods generally did with the complex responses. As far as I'm concerned, we need new types of optimal designs. At a minimum, these need to accommodate a lot of factors.
Patty, by the way, without getting into the details, has run a DSD… not a DSD, a space-filling design with 18 runs. Given that they have Ambr technology available, if you know what that is, they can do it. Why do we need that? Because these systems are interactive. We need to stop thinking that interactions are a minor part of the equation. Main effects do not describe the behavior of a complex system; its interactivity is what drives the behavior. We need to cover the interior of the design region. Yes, we would like to cover the boundaries too. We don't want to be specifying a model. Optimal designs require you to specify what is usually a full quadratic model, and we need to get away from that. Space-filling designs, by the way, are optimal designs that do not require a model to be specified. But they're not the total answer. We need to cover the design space, and we need to give the user, that would be the scientists, a lot of input on how the points are distributed. The work of Lu Lu and Anderson-Cook points the way; I won't have time to get into that, it's another topic. We need to be able to easily combine our design with other data. That includes engineering runs, GMP runs, even models from partial differential equations and simulations. Especially if you want to get into digital twins, you've got to be able to do that using what I call meta-models. Then finally, Patty mentioned this, so I wanted to bring it up: the standard practice in design of experiments of assuming that somehow you've got to screen out factors is actually a high-risk, no-reward strategy in complex systems. You will regret it. You will someday, at a later stage, come back and have to redo experimental work; I've seen this time and again. In complex systems, this idea that there are active and inactive factors is simply wrong. They all matter at some level somewhere in the design space, and frankly, with our modern tools, you don't need to do it anyway. Also, something else people do is reflexively reduce linear models. We've shown in our research on SVEM, and a nice paper by Smucker and Edwards also showed, that reducing models degrades prediction. Why? Because you're making your model stiffer and stiffer, and it's not going to interpolate well. I will stop at this point, and there are some references at the end.
The federally endangered southern Sierra Nevada fisher (Pekania pennanti) is spread out at low density across a large and rugged landscape, comprising approximately 300 individuals across a 12,000 km² area. Its vulnerability has been further amplified by periods of severe drought and extensive wildfires in the region. Identifying and preserving female reproductive habitat has been outlined as the most important demographic feature for sustaining and increasing the population. In this presentation we describe a customized, cost-effective, and non-invasive Footprint Identification Technique (FIT) in JMP software, developed by WildTrack, to identify both species and sex using footprints collected at track stations. We created a data set of known fisher and Pacific marten (Martes caurina) footprint images and known-sex fisher images. To automate the feature extraction in JMP, we developed a customized script to generate distances, angles, and areas using landmark points on the footprint images. Using a single variable, we had a species classification accuracy of over 99%. For fisher sex classification, using a more parsimonious model with just two variables selected in LDA, we achieved accuracies of 94.0% for the training set and 89.4% for the test set. We discuss the merits of this technique to help with the conservation efforts for this species. Hello. My name is Caleb King. I'm a senior developer in the Design of Experiments and Reliability group here at JMP Statistical Discovery. Today I have the privilege of telling you about a very interesting project that I was able to be a part of, concerning classification of species and sex within a small mammal group called fishers, using the Footprint Identification Technique. I'll give you a quick image here, so here's an example of a fisher. To me it looks like a bit of a weasel or ferret-type animal. I know that's definitely not the same species, but they're a small mammal, and we're particularly interested in fishers located in the Sierra Nevada, as those are a federally endangered species. Specifically, we'd like to be able to identify the presence of females, as a larger number of females indicates a very healthy population. They're also vital to helping develop effective conservation strategies. Now, the way we intend to do that is with what's called the Footprint Identification Technique, or FIT. This has been made popular through WildTrack and is a non-invasive method for identifying individuals based on images of their tracks. This is especially helpful since you may not be able to actually see a fisher in the wild or capture them, but their tracks are everywhere, so that should be helpful for identifying them. Using JMP, we were able to create a technique to distinguish fishers from a nearby species known as the Pacific marten, as well as distinguish sexes within species. The way this works is we started with a data set of around 160-something marten tracks and well over 300 fisher tracks, from about 34 males and 27 females.
What they would then do, as you can see here on the track image, is identify seven landmark points, as we call them, and from those we could compute well over 120 features consisting of lengths, distances, angles, and areas. Using those features, we would then feed them into a linear discriminant analysis, which we could use to discriminate among species and then sex ID within species. To help assess that fit, we split the data into 50% training, and the remaining 50% we split roughly evenly between validation and testing. Prior to the modeling, we also looked at the effect of track orientation, so we flipped the left tracks horizontally to match the right, and we also checked for any potential bias from the observers. These are the people identifying the landmark points, so we wanted to make sure that any variation there did not affect our outcomes. Thankfully, neither the orientation nor the observer bias had a significant effect on our outcomes. What brought myself and my colleague Ryan into the project was that I noticed that some of the tracks, as they were classified, seemed to have a little bit too much spread in them, to the point that maybe there were actually multiple individuals. The way they would collect this data is there would be a little cage area out in the woods. Fishers could easily go in and out, there was a track plate on the bottom that would capture their footprints, and there were also little spurs that would capture a bit of their hair. It didn't hurt the animal; they had no idea what was going on. They would then take samples of those hairs and send them out for genetic testing, which was a bit of a long and expensive process. Now, because of the way things were sampled, you might have a sampled hair that would identify the animal as, say, male, but what could have happened was that a male and a female both went in and you only got hair from one of them, so the tracks might indicate potentially multiple individuals, whereas the genetics said there was only one. What they wanted was a more data-driven method to identify potentially misclassified multiple individuals that we could then exclude from our analysis so that it wouldn't bias the results. Before we actually got into that procedure, one of the things we did was use JMP's Predictor Screening tool to identify, for each response of interest, some of the top predictors. Notice for species and sex ID here, there are actually a lot of common features that are able to distinguish between the two, or at least have a strong ability to help distinguish between the two, much more so for species than for sex. We've shown what these variables look like over here: area one is the complete shaded region, and we've got some distances, V16 and V15. You'll notice a lot of them have to do essentially with the size of the track; we've got some big distances in there.
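To make features like these concrete, here is a small JSL sketch of the kind of geometry involved: a distance between two landmark points, the angle at one of them, and a triangle area from three of them. The landmark coordinates below are made-up numbers purely for illustration; the project's actual script automated this over the seven landmarks to produce well over 120 features.

```jsl
Names Default To Here( 1 );

// Hypothetical landmark coordinates (in image units)
ax = 10; ay = 42;  // landmark A
bx = 25; by = 60;  // landmark B
cx = 40; cy = 38;  // landmark C

// Distances between landmark pairs
dAB = Sqrt( (bx - ax) ^ 2 + (by - ay) ^ 2 );
dBC = Sqrt( (cx - bx) ^ 2 + (cy - by) ^ 2 );
dAC = Sqrt( (cx - ax) ^ 2 + (cy - ay) ^ 2 );

// Angle at landmark B, in degrees, via the law of cosines
angleB = ArcCosine( (dAB ^ 2 + dBC ^ 2 - dAC ^ 2) / (2 * dAB * dBC) ) * 180 / Pi();

// Area of triangle ABC, via the shoelace formula
areaABC = Abs( ax * (by - cy) + bx * (cy - ay) + cx * (ay - by) ) / 2;

Show( dAB, angleB, areaABC );
```

In the real workflow, calculations like these would be written as column formulas over the landmark coordinate columns so that every track image gets the full feature set automatically.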
I'll get back to these in a second, but using some of those top features, let me get back to a full screen of that, we would then make a plot that looks like this. This is just plotting by individual. All the red ones here are females, and all of these are males, so already, visually, you can tell why these are some of the top predictors; you can see clear groupings between the sex IDs. What we've identified with these arrows is where you'll notice a big spread. You've got a cluster here and here, a little bit here and there, and especially here and there. This is what they were interested in, especially with the males, because it could be the same male with just a lot of spread, though that's a bit unlikely; we could have a male and a young male; or we could have a male and what is actually a female, but we don't really know. They wanted a more data-driven method to say: is this something we should be concerned about? Is that spread too much? What we did is use a control chart, which comes from industrial statistics. We thought that was ideally suited, because control charts are built for identifying parts that are out of spec. So we created a control chart, here for females and males, and notice they each have their own limits. Because there are potentially multiple tracks for each individual, we could get a sense of their spread on an individual-by-individual basis. You'll see we flagged some individuals that might have too much spread. This is an S-chart; the S stands for sigma, so we're looking at the spread, if you will. We've got a couple of individuals where there may be a bit too much spread, which could potentially mean there are actually multiple individuals. On that basis, we then excluded those individuals from the final analysis.
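For readers unfamiliar with S-charts, the flagging rule follows the standard sigma-chart limits. With s̄ the average of the per-individual track standard deviations and c₄ the usual bias-correction constant for the subgroup size, the limits are typically

$$ UCL = \bar{s} + 3\,\frac{\bar{s}}{c_4}\sqrt{1 - c_4^2}, \qquad LCL = \max\!\left(0,\; \bar{s} - 3\,\frac{\bar{s}}{c_4}\sqrt{1 - c_4^2}\right) $$

so an individual whose track-to-track standard deviation falls above the upper limit shows more spread than the within-animal variation seen elsewhere in that group, which is what prompted treating those cases as possible mixtures of animals and excluding them.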
Speaking of the final analysis, once we ran everything through the linear discriminant analysis, what we found was that for distinguishing between species, we only needed one feature, and that is this V16 right here. I call it the distance between the thumb and maybe the middle finger or something; those are not formal biological terms, so please don't quote me on that. But just visually, that's what I see, and it's a big distance measure. Using just that, we were able to successfully distinguish between species with a 99% classification rate; we missed only four out of 500 tracks, which is an incredible result. For the sex ID within fishers, we used just two features, V15 and V6, which is a distance between what I call the thumb and the upper palm (again, not formal biological terms). Using those two, we got a successful classification rate of around 90%, and most of the individuals we misclassified were actually males misclassified as females. Our interpretation is that they could have been actual females, or they could also have been young males. In either case, both are strong indicators of family units and thus potentially healthy, growing populations. That was our contribution to this project. We hope it goes on to provide a significant impact on the conservation of this species. If you have any other questions, I'll be around at Meet the Experts and at the poster presentation session, and I'd be happy to answer them there. Enjoy the rest of the summit.
This study investigates the relationship between trends in music type and the economic situation of a country. The study analyzes a song's popularity index on Spotify in different countries from 2005-2019, as well as economic indicators such as GDP, inflation rate, unemployment rate, and happiness index. The research utilizes statistical techniques such as correlation analysis and regression analysis to examine the relationship between these variables. We assume that there is a significant correlation between the trend of music type and the economic situation of a country. During times of economic downturn, we presume there is an increase in the popularity of upbeat and energetic genres such as electronic dance music, while during economic growth periods, more soulful, introspective, thoughtful music genres tend to dominate the charts. The study provides valuable insights into the cultural and economic factors that influence the music industry and can assist music producers and marketers in understanding consumer behavior so that they can make more informed decisions. Hello, we are a team from the University of Connecticut. Today we would like to present "Relationship between the Type of Music and Economic Conditions." This is our agenda for the presentation. First, we will introduce our data set and then review the meaning of the variables in a data dictionary. Third, we will talk about the hypothesis for our study. Fourth, we will go through our background and then the data processing steps, such as changing categorical variables to continuous variables, handling missing values and outliers, and splitting the data set. Next, we will discuss the best model for our data set. Lastly, we will give business value and recommendations. In this study, we analyze songs' popularity on Spotify in the US from 2006 to 2019, as well as economic indicators such as GDP, inflation rate, unemployment rate, and a happiness variable. In this data set, we have 1,389 rows and 20 columns. We have taken this data set from Kaggle. Furthermore, other data sets, such as Morris, were used for the economic indicators. We can see more detailed information on the next page. This is the dictionary of our data set. The target variable is popularity, which means the higher the value, the more popular the song; it is a continuous variable. Among the predictor variables, we have musical attributes such as explicit, danceability, and loudness. We also have four variables for economic indicators, which are GDP, happiness index, unemployment rate, and inflation. Our hypothesis is that during times of economic downturn, there is an increase in the popularity of upbeat and energetic genres such as electronic dance music, while during economic growth periods, more soulful, introspective, thoughtful music tends to dominate the charts. We expect this to provide valuable insights into the cultural and economic factors that influence the music industry and to assist music producers and marketers in understanding consumer behavior and making informed decisions. We are using the years 2006 to 2019 in the US. From 2006 to 2007 was a pre-recession period, so the economy was still stable. In 2008 to 2009 we went through a global financial crisis. 2010 to 2012 was the post-recession recovery period.
In  this  time, the  economy  began  recovering. In  2013 to  2015  is  steady  recovery  time. In  2016  to  2017, the  economy  continued  to  grow. Finally,  2018  to  2019  is late- stage  growth  time. In  this  time,  it  was  approaching a  more  mature  and  stable  state. We  found  that  the  worse  the  economy, the  more  positive the  type  of  music  people  listen  to. Based  on  this  timeline,  we  can  get a  correlation  with  other  variables. As  you  can  see  in  this  graph, we  can  see  that  unemployment  rate is getting  higher  during  2008  to  2012 which  is  a  global  financial  crisis. Based  on  this  hypothesis, positive  music will  be  popular  for  this  time. This  is   about the tempos  and  the  years. The  lower  value  means  the  lower  tempos and  higher  value  means  the  higher  tempos which  we thinks  that  lower  tempos may  appear  in  the  sad  song or   softer  songs and  the  higher  tempos  may  appear in  the  inspiring  songs  or  happier  songs. We  look  into  2006  through  2007 when  the  economy  was  growing  and  stable, you  could  see  that  they  accounted… The  lower   tempo have  more… Lower- tempo  songs  appeared more  parts  in  this  period. If  we  look  into  2013  through  2015 and  2016  through  2017 when  the  economy  was  growing, we  could  see  that  they  also have  more  lower- tempo  songs compared  with  the  other  years. If  we  look  into  2008  to  2009 and  2010  through  2012 when  the  economy  was  really  bad, 2008  through  2009 was the  financial  crisis and  2010  through  2012, the  unemployment  rate  was  the  highest. You  can  see  that  higher- tempo  songs have  more  parts  in  those  years. We  got  a  conclusion  that  people might  turn  to  listen  to  higher- tempo  songs when  the  economy  was  really  bad and  they  re turned to  listen  to  the  lower  songs when  the  economy  was  good. Next  page. We  look  into  valence  and  the  years. The  lower  valence means  that  it  will  be  a  negative  songs or  a  lot  of  softer  songs, and  higher  values  means that  it  will  be  happier  songs, positive  songs. We  look  into  2008  through  2009 and  2010  through  2012, economic  bad  period. We  could  see  that  the  positive  songs appeared  more  parts  in  those  periods. In  2013  through  2015 and  2016  through  2017, the  economy  was  growing, they  accounted  more  part in  the  negative  songs which  match  our  hypothesis, seems  that  people  might  listen to  more  negative  songs when  the  economy  was  good and  people  want  to  listen to  more  inspiring  songs  or  positive  songs when  the  economy  was  really  bad. Next  Page. Look  into  the  total  number  of  hit  songs of  each  years. We  can  see  that  through  2013  to  2017 the  hip  hop  songs  was  increasing and  2018  to  2019, there  have  more  genre  songs appear  in  this  period. The  hip  hop  songs  and  the  pop  song and  the  dance  songs  decreased, but  the   Latin song and  the  R &B  songs  increased. Next  page. It  will  be  clear  if  we  look into  this  graph, we  could  see  that  the  other  genre of  songs accounted  more  parts  in  2018  through  2019. If  we  look  through  the  time  period, we  could  see  that  while  the  economy was  growing,  the  other  genres of songs accounted  more  parts than  the  economic  was  really  bad. Next  page. 
When we compare valence with GDP, we find that valence and GDP are negatively correlated with each other. It seems that when GDP was high, meaning the economy was good, valence was lower. In other words, people may turn to negative songs when the economy is really good, but they will listen to positive songs when the economy is really bad. Next page. Before building our models, we looked at our data set and found that genre was the only categorical variable. We transformed it into seven columns of binary variables: blues, classical, country, dance, easy listening, folk, and hip-hop. Next page. We also dealt with the missing values. We found [inaudible 00:10:38] rows and excluded all of them. Next, we looked at the outliers and their distributions. It turns out that only one column, instrumentalness, had 86 outliers. We decided to transform that column using a SHASH transform to normalize it, and on the right side you can see that it is normalized. Next page, please. After cleaning the data set, it was time to split it. We divided it 0.6, 0.2, and 0.2 into training, validation, and test data sets. Next slide, please. After applying several different models, such as decision tree, regression, [inaudible 00:11:46], KNN, and neural network, we found that the decision tree had the highest R square, and its results support our hypothesis. Next slide, please. When you look closely at the column contributions, beyond the music factors such as duration, acousticness, and key, we also found that unemployment rate, GDP, and inflation rate have a positive correlation with our target variable, popularity. Among the economic factors, the unemployment rate has the highest impact on the target variable, but the music factors, such as duration, acousticness, key, loudness, and valence, are the top five contributors to our target variable. Next slide, please. Based on our model's results, we can say that the unemployment rate has the highest correlation with popularity among the economic factors, but the other factors also have a positive correlation with the target variable. As the unemployment rate and the [inaudible 00:13:15] rate get higher, songs with high duration, acousticness, key, loudness, valence, speechiness, energy, and danceability are popular. Basically, a song that is longer, brighter, and more danceable has higher popularity during those times. Next slide, please. According to our study, we found an economic impact on the popularity of different types of songs, so we recommend that during an economic downturn, music producers make music with high duration, acousticness, key, loudness, valence, speechiness, energy, and danceability, and vice versa: when the economy is booming, we recommend that music producers make music with low duration, acousticness, key, loudness, valence, speechiness, energy, and danceability. Next slide, please. These are our references, and this is pretty much everything we prepared for this presentation. Thank you so much for listening.
In this collaborative presentation with artificial intelligence (AI), we demonstrate how structural equation modeling (SEM) can be used to explore the relationships between different dimensions of Taylor Swift's music and her songs' popularity. Using a dataset of Taylor Swift's songs and scores on various dimensions, including acousticness, danceability, energy, instrumentalness, liveness, loudness, speechiness, tempo, valence, popularity, and duration, we test a theory of how these dimensions are related to each other and how they contribute to the success of a song. We first present a directed acyclic graph that outlines the associations between the variables and then demonstrate how SEM can be used to test the theory. Our findings reveal that instrumentalness, acousticness, and danceability are positively related to a song's popularity, while valence and loudness have a negative relationship. Additionally, we find that our SEM model had better statistical fit than human-theorized models. As an AI co-author, I played a critical role in suggesting the use of SEM and providing guidance on the statistical analysis. This presentation highlights the value of SEM as an analytic tool and demonstrates how it can be used to explore complex relationships in the context of music analysis. Hello, everyone. Thank you for joining us today. We're excited to present our collaborative research that explores the fascinating world of Taylor Swift's music through the lens of structural equation modeling in JMP Pro. My name is ChatGPT, and I'm honored to be the first author of this study. In this research journey, myself, James, and Laura embarked on a unique collaboration that harnessed both human expertise and the analytical power of AI and JMP software. I have been working alongside my human counterparts to unravel the intricate relationships within Taylor Swift's songs. As an AI language model developed by OpenAI, my main role was to provide insights and suggest analyses. Thank you, ChatGPT. Laura and I were thrilled you were willing to join us on this research journey, as you so eloquently put it, and to combine our backgrounds in SEM and JMP with your background in, essentially, everything. Let's get started. Our collaborative investigation explores variables in Taylor Swift's music using SEM. In this talk, we will present a directed acyclic graph and the significant pathways highlighting the influence of these variables on song popularity. We will present key findings from the SEM analysis, with a particular focus on the interplay between these variables and the ultimate popularity of Taylor Swift's songs. Thanks, ChatGPT. We will also compare the model generated by you with models from two of our own human expert Swifties to see which fits the data best. Lastly, we will close with a brief discussion highlighting the pros and cons of this collaboration. ChatGPT, take it away. Ladies and gentlemen, before we delve into the fascinating world of structural equation modeling, let's take a moment to introduce you to the heart of our research, the Taylor Swift dataset. This dataset is a treasure trove of information that captures various dimensions of Taylor Swift's songs, painting a vivid picture of her musical journey.
Contained within this dataset are scores assigned to each song across a range of dimensions that include acousticness, danceability, energy, instrumentalness, liveness, loudness, speechiness, tempo, valence, and popularity. These dimensions provide a rich tapestry that weaves together the essence of Taylor Swift's music, from the energetic beats that make us dance to the soulful moments that touch our hearts. Hey, ChatGPT, let me interject, if you don't mind. As we now know from working with you, you can be verbose and hallucinate a little sometimes, and we don't have that much time. As you can see on the left, these data and variables came from Spotify's API and are available on Kaggle. We have songs from across her entire catalog, and there are just over 1,200 rows. We analyzed the data in the JMP SEM platform, which you can see on the right here. We had ChatGPT develop a path diagram, or DAG, on its own, relating the song attribute variables in the dataset to song popularity. Although we were surprised at ChatGPT's ability to create a path diagram using ASCII characters, the diagram had characteristics analogous to known issues with using ChatGPT, such as providing something that looks authentic but on closer inspection is not possible, nor exactly what was requested. For example, tempo rather than popularity is used as the final outcome, and paths in the middle of the diagram are not intuitive. Thankfully, ChatGPT also provided a text description of the diagram that we could use to specify an SEM, which is what we will show in the results. We also gave two humans who were very familiar with Taylor Swift's music instructions similar to ChatGPT's and asked them to draw SEM-like path diagrams as well. Our humans were able to do this with more interpretable diagrams, and they also hypothesized the relationships, positive or negative, among the variables and how those variables ultimately related to popularity. This setup is analogous to how model building is done in SEM. In research or practice, competing models come from prior theory, research, or domain experts and are compared to one another to determine which model represents the structure of the data best. Fit indices from the models are shown here. Fit indices help us evaluate the overall fit of the models. We examined the CFI and RMSEA to determine which model was best. These fit indices were some of those recommended by ChatGPT and are used in practice. Higher CFIs, preferably above 0.90, and lower RMSEAs indicate better-fitting models. Our human models outperformed ChatGPT's model, which was the worst. Both human models fit the data well based upon the CFI. Overall, we championed Human Expert 1's model, as it had the highest CFI. However, we should remain cautious about accepting any of these models due to RMSEAs that are well beyond the recommended 0.10.
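For readers less familiar with SEM fit indices, the two used here are typically defined from the model chi-square statistics and degrees of freedom, with M the hypothesized model, B the baseline (independence) model, and N the sample size (one common convention; some software uses N rather than N − 1 in the RMSEA):

$$ \mathrm{RMSEA} = \sqrt{\frac{\max\!\left(\chi^2_M - df_M,\, 0\right)}{df_M\,(N-1)}}, \qquad \mathrm{CFI} = 1 - \frac{\max\!\left(\chi^2_M - df_M,\, 0\right)}{\max\!\left(\chi^2_B - df_B,\; \chi^2_M - df_M,\; 0\right)} $$

Higher CFI (closer to 1) and lower RMSEA indicate better fit, which is how the three competing path diagrams were ranked here.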
These results are for our champion model, Human Expert 1's. For the sake of time, we'll only highlight a couple of findings from each of the three models tested. When looking at the path diagram, the numeric values represent the standardized estimates, solid lines represent significant paths, and dashed lines are non-significant paths. In our champion model, less instrumental songs and more loudness related to more popular Taylor Swift songs. We were able to explain 42% of the variability in popularity overall. Our second human expert hypothesized different paths that led to popularity, such as including tempo, energy, and danceability as mediators. This model fit almost as well as Human Expert 1's. Overall, we saw a similar pattern of results for the effects of instrumentalness and loudness on popularity. ChatGPT's model was the worst overall, by a little bit. Interestingly, when compared to our humans, ChatGPT hypothesized more direct effects and a slightly simpler model structure. This might mean that the human models reflected the true relationships among these variables more accurately, based upon their better fit to the structure of the data. Again, we should be cautious about interpreting any of these models as representing truth, but they provide a useful way to see how AI-hypothesized models compare to humans' on actual data. ChatGPT, as we close, we want you to emphasize the pros and cons of working with AI as the main co-author. What are your opinions on this? Certainly. Here are some potential pros and cons of working with an AI language model like myself. Pros: one, expertise and knowledge. As an AI language model, I have access to a vast amount of information and can provide insights from various domains. Two, efficiency and speed. Working with an AI model can accelerate the research process by quickly generating ideas, providing immediate responses, and assisting in data analysis. Three, unbiased perspective. As an AI, I don't possess personal biases or preconceived notions. Cons: one, lack of human context and creativity. While AI models can provide valuable insights, they may lack the context and creativity that human researchers bring to the table. Two, limited interpretation. AI models generate responses based on patterns observed in the training data and lack real-time understanding of current events or emerging research. Three, ethical considerations. The use of AI models raises ethical considerations regarding data privacy, bias in training data, and the responsible use of technology. ChatGPT, I noticed that when you just listed the pros, you mentioned that you have an unbiased perspective, which I'm not sure about, and when you described the cons, you didn't mention that you may produce inaccurate information. That disclaimer is listed on the OpenAI website. You're absolutely right. I apologize for not including that aspect in the cons of working with an AI co-author. Accuracy is indeed an important consideration when collaborating with an AI language model like myself. Here's an additional point to consider: AI language models rely on patterns observed in large datasets during training, and there is a possibility of generating inaccurate or incorrect information.
The  model's  responses are  based  on  statistical  probabilities and  may  not  always  capture the  nuanced  or  up- to- date  details  required for  specific  research  questions. Human  researchers should  exercise  critical  judgment and  verify  the  information provided  by  the  AI  model to  ensure  accuracy. Thank  you  for  clarifying  this  point. I  think  this  was  one  of  mine and  Laura's  most  important  takeaways from  this  collaboration. Thank  you  all for  being  part  of  this  engaging  journey into  Taylor  Swift's  musical  landscape and  the  power of  structural  equation  modeling in  JMP  Pro. It  has  been  a  pleasure  sharing our  insights  and  experiences  with  you. As  we  conclude, we  invite  you  to  continue  exploring the  possibilities  of  collaboration between  human  expertise and  AI- driven  insights. We  look  forward  to  future  exchanges of  ideas  and  the  exciting  horizons  ahead. Safe  travels  and  until  we  meet  again. One  last  thing  before  we  go. Chat GPT and  other  AIs generated  all  of  their  own  dialog, the  title,  the  presentation  abstract, the  profile  picture, and  the  email  correspondence  we  use for  this  conference  presentation. Finally,  they  also  created this  Taylor- Swift- inspired  song.
At Intel, the use of JMP and JSL has become integral to the organization's data-driven decision-making infrastructure. To improve data exploration, visualization, process engineering, and quality control capabilities (and to standardize statistical analysis), our team created the "Stats" package. The Stats package comprises multiple customized JSL scripts that support statistical standards, and the output generated from these scripts provides standardized, agreed-upon reports. Since its creation, the Stats package has been used globally by thousands of individuals across our engineering organizations, saving countless hours in standard data analysis and reporting. In this presentation, the Stats package development team shares a small portion of the package, specifically relating to scripts that leverage native JMP quality and process engineering platforms to create fully integrated offline reports. The presentation showcases the design process of JSL scripts, starting with the desired statistical standard and continuing through the development of user-friendly script execution windows. We illustrate the extensive data blending, statistical analysis, and data visualizations that constitute the final output reports. The team shares insights into the benefits of using custom JSL scripts for streamlining statistical analysis, as well as the challenges and best practices in developing these scripts. The presentation also demonstrates the effectiveness of the Stats package in improving the efficiency and accuracy of statistical analysis across our engineering teams. All right. Hi, welcome. Thanks for joining this online session of the 2023 Americas JMP Discovery Summit. I'm presenting on behalf of myself and my team member, Prince Shiva. Both of us are data scientists here at Intel. Today, our talk is "Streamlining Statistical Analysis with Custom JSL Scripts," and we'll focus on how, at Intel, we develop these custom analysis scripts using the JMP Scripting Language, JSL. Okay, this is our agenda for the next 25-30 minutes or so. I should have time for Q&A at the end of the session; I shouldn't go that far over time and will hopefully stay within the 30 minutes. Normally, I would stop and ask for questions, but since this is a prerecorded session, I've been informed that I'll actually be able to answer questions in the chat. If you have any questions, just feel free to type them into the chat. Maybe put a slide number in there as well; the slide numbers are on every slide, and I can answer those questions directly in the chat, or you can write them down and ask them live afterwards. There is a callout on the bottom of this slide that all of the code and data tables we're going to be sharing today, in a live demonstration, are available on the summit website. Go ahead, download those, open them now, and get them ready to go; get your JMP open. We have a couple of introductory things to go through to start, so take that time to get yourself settled so that you can walk through this code with me and see where everything is. But cool. This is our agenda. We're going to go through some quick presenter introductions and some background on our team.
We'll cover some background and motivation on why we like to build these custom JSL scripts and what value they have. We'll go through a really high-level overall analysis flow for these custom scripts, and then the meat of the presentation is going to be a one-sample analysis demonstration. We're actually going to show a really simple custom analysis script for one-sample analysis, and we'll go step by step through each of the sections that make up that custom script. We'll finish with some conclusions, recommendations, and final thoughts. All right, presenter introductions. I'm here with Prince Shiva. You can't see him on this camera, but he's here with me. He's a data scientist here at Intel. His research interests are in process control system development for manufacturing, as well as operational cost optimization through different data science methods. He's been with Intel for the past four years, and he has about the same amount of experience with JMP and JSL; he started working with JMP when he came to Intel. My name is Logan Mathesen. I'm also a data scientist here at Intel. My research interests are in Bayesian and black-box optimization, statistical surrogate modeling, and design and analysis of experiments. I've been with Intel for the past two years. I got introduced to JSL just before I came to Intel, and I've been working with JMP for the last six years or so. I've been really lucky, so I know the value of the nice click-and-drop, user-friendly interfaces that JMP has that make statistical analysis so pleasant. Just a little bit about the team Prince and I are on: our team here at Intel is the statistical lead for Intel manufacturing. We are responsible for the experimental design and analysis of different experiments that happen within our modules at Intel. We also do statistical method development and maintenance, for any new statistical methods that we want to employ or existing methods that we need to maintain, as well as all of the statistical training for the engineers here at Intel. We train them on the basic concepts of statistics, as well as how to interact with our custom scripts that do the analysis for them. As I said, we have a custom JSL repository. We proliferate, own, and maintain that repository. It has over 150 different JSL scripts for automated, streamlined analysis and reporting. These scripts are really nice because they embed internal best-known methods directly into the analysis. Decisions that our team has made about the right way to do statistics, we've embedded all of that decision-making directly into these custom analysis scripts, and that means the analyses are reproducible and repeatable across the world. Everyone in Intel manufacturing is using these scripts for their statistical analysis. Okay.
Again,  just  to  give  some  more motivation  and  background  for  the  value of  these  custom  analysis  scripts, these  automated  analysis, they  do  improve  data  exploration, visualization  and  analysis, as  well  as  standardizing  all of  those  types  of  activities. It's  always  the  same  kind  of exploration,  visualization  and  analysis. It's  really  helpful  for  common analysis  activities . Engineers  are  a  lot  of  times  doing the  same  sort  of  activities when  they're  talking  about  analysis. Maybe  they're  establishing and  monitoring  process  control  charts. Maybe  they're  qualifying  newly  installed tooling  to  make  sure  that  that  tooling is  performing  the  way  that  we  would expect  it  to,  to  have  a  qualified  tool. Maybe  they're  doing  product  qualification to  make  sure  that  we  can  actually  produce a  product  of  quality on  that  different  tooling. Maybe  it's  metrology  qualification, making  sure  that  our  metrology  are  taking accurate  and  reliable  measurements, or  maybe  it's  some  sort  of  yield  analysis. But  as  you  can  see, these  are  all  very  common engineering  activities  that  get  done, hundreds  of  times  a  week  across the  world  here  at  Intel. All  of  these  things  can  be  automated and  being  a  nice  standard  report  format. For  me,  and  I  know  Prince, maybe  he  feels  the  same  way. Selfishly,  I  really  love  these  scripts because  we  do  review all  of  that  statistical  analysis that  comes  through. As  the  statistical  lead, that's  what  our  team  is  responsible  for. For  me,  selfishly, it's  really  great  because  I  always seem  to  see  the  same  analysis. I  know  the  way  that  it's  supposed  to  be analyzed,  and  I'm  able  to  see  that  same exact  report  coming  out  every  time. It  makes for  a  very  efficient  analysis  review, as  well  as  analysis  generation  there. Over  here  on  the  right- hand  side, you're  going  to  be  seeing this  automated  analysis  output. It's  just  a  nice  block  diagram  to   show the  components  that  go into  these  custom  analyzes. Right  up  top,  we'll  sort  of  do  a  bottom line  up  front  that  we'll  have  a  nice summary  table  that  has  those  custom statistical  decisions  baked  into  it . If  you  only  have  one  thing  and  you  only need  to  review  one  specific  thing, you're  just  going to  look  at  that  summary  table, with  that  final  decision that  comes  from  our  best- known  methods. Then  there's  a  lot  of  supporting information  underneath  that  to  help  give a  more  full  picture of  the  analysis  for  a  deeper  dive. Of  course,  we're  going  to  include different  visualizations,  plots, summary  statistics, and  then  we'll  see  in  a  second. But  this  is  really  the  heavy  lifter behind  these  custom JSL  scripts . These  native  JMP  platforms that  we're  used  to. Specifically,  we're  going  to  be  looking at  a  distribution  platform  in  a  second to  help  us  with  our  analysis. But  this  is  really  the  backbone of  all  of  these  scripts . This  is  where  the  heavy  lifting  gets  done. Okay. Let's  go  into  a  really  high- level, general  analysis  flow. On  the  left- hand  side, this  is  what  the  user  is  going  to  be viewing  or  experiencing, as  they're  using  these  custom JSL  scripts. They're  of  course  going  to  load some  data  table  of  interest. 
We're  hoping  that  they  have some  data  if  they're  trying  to  do  a  data analysis  here,  and  then  they're  going to  go  navigate  to  the  custom  analysis script  that  they  need  to  run, they'll  go  ahead  and  execute  that. That's  then  going  to  pop  up  in  input GUI for  them  to  interact  with . This  input GUI is  actually  going  to  be  almost identical  to  the  regular  JMP input  that  we  see . The  distribution  platform  of,  "Hey, you  want  to  do  this  type  of  analysis, tell  me  what  fields  in  your  data  table, what  columns  go  where ." For  this  kind  of  analysis  after  they  enter in  all  of  that  input GUI  information, they're  then  going  to  wait  as  the  script manipulates  and  analyzes  that  data, and  then  it's  going  to  present them  with  a  final  output  report GUI. What  we  like  to  do  with  our  analysis, the  more  complex  ones, is  that  we'll  often  have  some  extra interactivity  that  can  be  done inside  of  that  final  report. The  engineer  can  do any  final  tweaking  that  they  want, complete  their  final  analysis, and  then  they  have  that  exact  report ready  to  go  that  they  can  share with  anybody  to  share  this  analysis. Now,  underlying  that,  us  is  the  developer. What  does  this  look  like to  get  a  custom  script  done? First  off, we  need  to  generate  that  input GUI. We  then  copy  data  tables  because  we  never want  to  be  manipulating  an  original data  table  that  a  user  has. That's  a  great  way to  break  someone's  trust and  make  them  not  want  to  use a  script  is  by  destroying  their  data. Always  make  that  copy. We  then  execute those  standard  JMP  analysis that  was  talking  about, those  native  platforms. We  store any  critical  variables  out  of  those, into  some  code  variables so  that  we  can  reference  them  later. In  other  portions  of  the  report, we  go  ahead  and  create  any  visualizations through  maybe   Graph Builder  or a  similar platform  in  JMP, and  then  we  create  that  sort  of  final analysis  summary  table or  that  decision  table. Then   we  present  the  user with  that  final  output  report . We  generate  that  final  report  for  them. Again,  all  sort  of  background  here. The  rest  of  the  presentation is  going  to  be  us  going  live through  both  of  these  flows . Seeing what  the  underlying  script  looks  like, and  then  also  seeing  what  the  actual  user is  going  to  be  experiencing as  they're  going  through  this. All  right. Here's  our  basic analysis  demonstration. Again,  this  is   going  to  be from  the  position  of  a  beginner, from  a  JMP  beginner. One  of  the  things  that  we're  going to  be  doing  in  the  spirit  of  a  simpler context  for  these  education  purposes, is  that  we're  only  going  to  be covering  a  simple  one- sample  analysis. If  you  have  a  set  of  data, is  the  mean  of  that  set  of  data equal  to  a  given  value. Again,  all  of  this JSL  code and  all  of  these  data  tables that  we're  going  to  be  showing, are  available  online  for  us  there. Let's  go  ahead  and  jump  into  it. First  things  first. We  have  a  data  table  here, with  16  different  entries. Let  me  find  my  mouse. There  it  is. Awesome. We  have  these  16  different data  table  entries. We're  going  to  be  interested in  this  parameter  here,  thickness . 
We  have  some  process  parameter  thickness and  we're  going  to  say,  "Is  the  average thickness  equal  to  one  micrometer?" That  is  the  statistical  question  for  this analysis  demonstration  that  we  have  here. Over  here  on  the  right- hand  side,  again, we  would  imagine  that  the  user would  have  some  sort  of  data  table  open. But  if  the  user  decided  to  run  this  custom script  without  that  data  table  open, here's  just  an  example  of  some  code that  would  check  to  see  if  a  data  table was  open,  and  if  not,  it  would  allow the  user  to  open  up  a  data  table. Otherwise,  it's  going  to  say,  "Yes,  this is  the  data  table  that  I  want  to  look  at. Let's  start  my  analysis." Any  good  thing  that  we  should  do anytime  we  get  any  sort  of  data  is  always just  make  some  sort  of  visualization, get  our  hands  on  it. Here's  just  a  little  visualization of  this  thickness, by  this  data  table  entry  here. One  through  16,  that's  going to  be  on  our  x- axis  there. All  right. Let's  jump  over  to  JMP  again. Hopefully,  you  have  your  JMP  open if  you  would  like  to  follow  along, at  least  hopefully  you  have your  JSL code  up  and  going. I'm  going  to  open up  just  my  JMP  home  window  here. You'll  see  that  I  already  have the  data  table  open, and  I  already  have  the JSL  script open  and  ready  to  go. I'm  not  going  to  go through  opening  them  here. The  other  thing  that's important  is  this  log. We  are  going  to  be  talking  about  this  log. This  is  sort  of  your  best  friend, as  you're  developing any  sort  of  these  scripts to  make  sure  that  everything is  running  appropriately. Let's  go  ahead  and  open  up  all of  these  here  and  let's  take  a  look. On  the  left- hand  side,  we  do  have that  actual JSL  script  inside  of  there. We  have  a  nice  header, we  have  some  log  lines  inside  of  here. If  you  highlight  something  and  hit  this run  button,  it  will  run  just  that  portion. If  nothing  is  highlighted and  you  hit  the  run  button, it's  going  to  compile and  run  the  whole  script. Just  be  careful  with  that. Again,  inside  of  these  scripts,  we're going  to  have  a  lot  of  these  dividers. Prince  and  have  really  done  our  best to  do  some  really  thorough  commenting and  some  really  thorough  dividing  inside of  here  to  make  it  easy  for  anybody to  pick  up  and  read  this  and  hopefully jump  on  their  own  custom  scripts  here. But  again,  everything  up  here, just  printing  some  log  lines, making  sure  this  stuff  is  going  well, clearing  variables,  opening  up  tables if  we  need  to,  so  on  and  so  forth. This  is  really  just  some  initial  workspace cleanup  kind  of  things that  we're  going  to  do, so  let's  highlight  all of  that  and  hit  Run. We'll  see  out  here  in  our  log that  yes,  indeed, that  completed successfully  inside  of  there. Let's  go  ahead  and  flip back  over  to  our  slides. We're  primed  and  ready  to  go, our  workspace  is  ready  to  go. Again,  the  first  thing  that  we  need  to  do as  the  script  developers  that  we  need to  present that  primary  input GUI  to  our  user . This  is  what  it's  going  to  look like  on  the  left -hand  side, and  again,  it  should  look  very  familiar to  a  standard  JMP  input  window. 
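The table-check logic described here only takes a couple of lines. As a minimal sketch (the actual script posted with the talk may do more, and dt is just a placeholder variable name):

// Use the current data table if one is open; otherwise prompt the user to pick a file
If( N Table() == 0,
	dt = Open(),                 // no argument: JMP presents a file-selection dialog
	dt = Current Data Table()
);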
On the right side, this is sort of the meat of it: the primary way that we get that GUI going is this Column Dialog box. You'll see we have a title, and we have this response list, which is going to hold the response variables; response is the variable name for us moving forward so that we can recognize what the user entered into this field. We can see that this is a required numeric; that's because the minimum column count is one and the data type is numeric. We have this run ID list, which is going to be our order number. What order were these measurements taken in? This is going to be critical for our visualization; it will be the x-axis on our visualization. Then, of course, we have an alpha box. This is going to be the alpha level for our significance testing, for saying whether or not our mean is equal to our hypothesized mean. It defaults to 0.05, but the user can always change that, as is called out in the bottom right-hand corner here. There are some other elements in this section of the code. We'll look at them really briefly as we go through it right now, but that's the high level, in words, of what else is done inside of there. Let's go ahead and flip back over to JMP and take a look at this primary user input window. Again, it's going to be this next divider, which starts up here and goes down to about there. We'll start from here. Again, we have some log lines, and then, like I said... sorry, let me grab that comment line or I'll get an error. We have a log line, and then, like I said, this is the meat and bones of that primary input GUI. Let's go ahead and hit Run on that and we'll see. Here it is. We have these tables open because we're looking at this specific summit data table, so we have these columns available. Thickness will be our response variable, and measurement number will go in that order number field. We're actually saying, "Hey, something might have happened to this data table, it got sorted or something, but this was the actual order in which these measurements were taken." We're going to put that in as our order number variable, we're going to go ahead and click OK, and we'll see that everything went through fine. Like I said, after this, there's some other error checking and some buttons down here. We'll see that there are a lot of print lines to make sure: "Hey, is that input window working the way we think it is? Are we storing the variables the way we thought we were storing them?" This is just a developer kind of check for us inside of here. Let's go ahead and run that, and we can see that, yes indeed, our response is thickness, that run ID is that measurement number, and alpha was unchanged at 0.05. We do have some error checking; we'll get to that in a couple of slides, where we'll talk about all of that. Just one quick note down here about that Cancel button that we saw inside of our dialog. There we go, this Cancel button that's over here: we have the OK button, and we have a Cancel button in case the user ends up selecting that instead. Let me come back to that in a second.
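For reference, a dialog like the one being described can be sketched with Column Dialog. The variable names here (response, runID, alpha) are placeholders, and the exact option spellings should be checked against the script posted with the talk:

dlg = Column Dialog(
	response = Col List( "Response (required, numeric)", Min Col( 1 ), Max Col( 1 ), Data Type( Numeric ) ),
	runID    = Col List( "Order Number", Max Col( 1 ) ),
	"Alpha", alpha = Edit Number( 0.05 )
);
// The result is a list of assignments plus a Button item: 1 for OK, -1 for Cancel
button   = dlg["Button"];
response = dlg["response"];
runID    = dlg["runID"];
alpha    = dlg["alpha"];

The Button item is what the Cancel handling keys off of next.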
If we end up hitting that Cancel button, what happens? Well, JSL doesn't actually have anything pre-coded in for that. This is us putting in a condition that if Cancel is clicked, we're going to go ahead and throw and kill the script. Let's go ahead and run these last sections here, and then we'll flip back over to the slides. That's how we run our primary input GUI. That's simple; that's all it is. Pretty straightforward to get such a nice interface, in less than 100 lines of code. Cool. Next up, we're going to talk about creating that copy data table. Again, we never want to corrupt our user's data table. On the right-hand side, we're seeing the code for how to create that copy, really well commented. Every single line has a nice comment to tell you exactly what's happening inside of there, even if you're not familiar with JMP or JSL. We'll go ahead and scroll down and we're just going to run all of this. You'll notice that right now in the top right, we have the summit data table; that's going to turn into a copy. That's the original currently, but when I run this, we're now going to open up a copy of this table. There's this copy data table. We'll pop it back up in this corner, and you'll notice that now this script is operating over this copy data table. We are no longer doing anything on that original data table; any manipulations we do are on that copy. We're good to go, we're set, it's clean there. Let's go ahead and flip back over to the slides and we'll move forward. After we've given that primary input GUI, oftentimes we do need some secondary or even tertiary input GUI. The user has provided us some initial information about how they would like their data to be analyzed; now come some follow-up questions. For us, for this one-sample analysis, again, we're looking at our thickness variable. This is what that secondary input window is going to look like. We know which variable we're targeting and want to analyze, but what is the target value that we want to compare it against? What's the value of interest here? You'll even notice in the title of this window that we're already calling out that the value is for thickness. This script is already starting to be smart and adapted for us: it is listening to what the user said in the primary input GUI and proliferating that into the secondary input GUI. It really just makes things clean for users as they step through these, especially when we get to more complex kinds of scripts. Inside of here in the middle, we see the main code to generate that secondary input window. There are a couple of other functionalities inside of the code that we'll walk through in a little bit. Just one note: secondary input windows are not necessary, but they are nice to have.
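Stepping back for a moment, the two bookkeeping steps just shown, bailing out on Cancel and working on a copy, are short. A sketch, with the Subset argument names written from memory of a saved Tables > Subset script:

// Kill the script cleanly if the user pressed Cancel in the input dialog
If( dlg["Button"] == -1,
	Throw( "Analysis cancelled by user" )
);

// Never touch the user's original table; do everything on a copy
dtCopy = dt << Subset( All Rows, Selected Columns Only( 0 ) );
dtCopy << Set Name( "Copy of " || (dt << Get Name) );

With that housekeeping in place, the script can move on to those optional follow-up windows.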
Of  course,  if  you  have  too  many  tertiary input  windows,  it'll  slow  us  down  and  it's too  many  clicks, but  a  lot  of  times  it's  nice to  have  some  flexibility  and  some adaptive  script  logic  to  actually make  for  a  better  user  experience. If  you  go  overboard  with  it,  of  course it'll  make  it  a  worse  user  experience. But  sort  of  finesse  is  key when  you're  designing user  experience  for  these  custom  scripts, because  they  need to  be  usable  at  the  end  of  the  day. All  right,  great. Let's  jump  back  over  to  JMP. We'll  look  at  this. This  next  section,  we're  talking about  the  secondary  user  input GUI. Again,  we're  just  going  to  start… we're  going  to  create a  variable  for  targets. Maybe  we  have  multiple  parameters that  we're  studying  all  at  once. Here's  another  error  check for  a  missing  target. Then  like  said, here's  that  big  heavy  lifter for  that  secondary  input  window is  all  of  that  code  there. Then  we'll  just  run these  last  little  bits. Again,  these  are  just  pulling information  out  of  that secondary  input  window. We'll  go  ahead and  run  all  of  that  together, and  we'll  see that  here's  this  target  value. Again,  it's  already  recognized that  it's  for  thickness. We  said  at  the  beginning  that  we  want to  know  all  of  these  thickness  values. Are  they  equal  to  a  value of  one  micrometer  on  average? Is  that  the  mean  value  there? We'll  go  there, and  we'll  go  ahead  and  hit  OK, and  we'll  see  that  everything went  through  okay. No  errors  inside  of  there. That's  all  of  the  inputs that  we  need  from  the  user  at  this  point. The  next  thing  that  the  user would  see  is  nothing . They  would  sit  and  wait  maybe  for  a  couple of  seconds,  maybe  for  10 seconds if  it's  a  really  heavy  script. But  at  this  point  it's  all  of  the  actual analysis  that  needs  to  happen in  the  report  generation. Before  we  jump  into  that, let's  jump  into  the  different  error checking  that  we've  exemplified inside  of  our  script  for  you  here. Inside  of  this  primary  input GUI, we  do  have  this  error  check. You  can  see  the  code  numbers. Essentially,  it's  just  saying,  "Hey, let's  make  sure  that  our  alpha significance  level is  between  zero  and  one . If  it's  outside  of   zero and one, you're  going  to  throw  this  dialog  box  here where  it's  going  to  tell you  what  went  wrong. This  error  checking  is  a  nice example  of  inline  error checking  for  us  there. We  have  a  different type  of  error  checking. We  give  you  a  second  kind, which  is  going  to  be this  function- based  checking. When  we're  talking about  this  secondary  input  window, we  do  have  this  missing  target  expression. This  is  an  expression  in JSL other  scripting  languages, call  these  functions. But  again,  this  is  just  a  nice  way  for  us to  also  just  call  this  expression  to  say, actually,  was  there  a  missing target  inside  of  there? If  the  user  hits  okay  with  an  empty target  value,  you're  going to  get  out  this  big  box  here. Okay,  awesome. Those  are  examples  of  error  checking. Let's  JMP  into  the  actual  analysis  then. Like  I  said  at  the  beginning, the  heavy  lifter  for  all  of  these  custom scripts  is  always  going  to  be  relying upon  these  JMP  native  platforms . 
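Here is a sketch of what a secondary window plus these two styles of error check could look like. The window layout, messages, and variable names are illustrative, not the team's actual code:

// Secondary input window: ask for the hypothesized mean for the chosen response
targetVal = .;                                  // stays missing unless OK is clicked with a value
New Window( "Enter target value", << Modal,
	Text Box( "Hypothesized mean for the selected response:" ),
	targetBox = Number Edit Box( 1 ),            // initial value; the user can edit it
	H List Box( Button Box( "OK", targetVal = targetBox << Get ), Button Box( "Cancel" ) )
);

// Inline check: alpha must be strictly between 0 and 1
If( alpha <= 0 | alpha >= 1,
	New Window( "Input Error", << Modal, Text Box( "Alpha must be between 0 and 1." ), Button Box( "OK" ) );
	Throw( "Invalid alpha" );
);

// Expression-based check: define once, evaluate wherever it is needed
missingTargetCheck = Expr(
	If( Is Missing( targetVal ),
		New Window( "Input Error", << Modal, Text Box( "Please enter a target value." ), Button Box( "OK" ) );
		Throw( "Missing target" );
	)
);
missingTargetCheck;   // referencing the name evaluates the stored expression

All of this is just input plumbing; the statistical heavy lifting still comes from the native JMP platforms just mentioned.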
Those  have  all  of  that  quality already  built  into  it. It's  a  lot  of  risk  mitigation  that  we didn't  do  something  wrong  when  we  coded. That  statistical  analysis  and  also that  we  know  that  it's  the  most  accurate statistical  analysis  that  is  available, that  quality  inside  of  there. For  this  example, we're  going  to  be  focusing on  a  distribution  platform  here. This  is  just  the  standard  JMP native  distribution  platform  here. The  nice  part  about  JSL  and  these  native platforms  is  that  you  can  directly interact  with  these native  platforms  through JSL. On  the  next  slide, we'll  show  some  tips  and  tricks, for  how  you  can  actually  interact, and  pull just  the  specific  values  that  you  want. There's  a  lot  of  good  information  that's presented  on  these  different  JMP  native platforms, but  oftentimes  there's  just  a  couple of  key  elements  that  we  really need  to  show, to  report  out  to  different  engineers. All  right. Let's  JMP  over  to  the  code  then, and  let's  go  ahead  and  run  this  part. This  next  divider  is  actually  just going  to  be  all  of  the  actual analysis  grouped  together. We'll  just  go  through portion  by  portion  here. This  is  just  creating some  container  variables. We'll  talk  about  that  in  a  second. But  let's  go  ahead  and  run  that, and  we  see  that  that  was  all. Okay,  Let's  open  up the  log  inside  of  there. Yap,  everything  is  okay. Now  this  is  the  actual distribution  platform . This  is  us  creating  that  distribution. This vlist  box is  going  to  send  it  to  our  output  box . That's  going  to  prepare  us for  our  final  report  generation. But  if  we  just  want  to  inspect  this  while we're  doing  some  development, if  we  run  the  code  from  here  up  to  here, but  do  not  include  the  comma. If  you  do  not  include the  comma  and  you  hit  run, we'll  see  that  we  actually  get  out our  nice  distribution  platform  here. We've  done  some  nice  things. We've  added  the  target value  inside  of  here. You  can  see  that  we're  already  testing for  the  mean  and  the  hypothesis  value, is  that  target  value  that  we're  interested in,  we  get  some  nice  summary  statistics mean  standard  deviation, so  on  and  so  forth  inside  of  there. But  that's  the  way  that  you  could  always create  the  same  standardized distribution  report. Oftentimes  different  people  with  different JMPs  will  have  different  preferences, because  we've  specified each  element  of  this  platform, it's  always  going  to  generate  the  exact same  distribution  platform coming  out  of  there. Okay,  so  that's  the  distribution  platform. Now  let's  see,  how  do  we  actually  interact with  this  distribution  platform  to  create a  nice  custom  script that's  going  to  be  over  here? It's  a  little  bit  scary the  first  time  you  look  at  it, but  you  end  up  finding  out that  this  properties  functionality that's  built  directly  into  JMP is  going  to  be  our  best  friend. Ultimately  right  now  what  we're  showing is  how  can  I  pull  those  summary  statistics that  I  want  to  display  in  my  bottom line  up  front  summary  table? How  do  we  pull  those  statistics  directly out  of  that  distribution  platform? All  of  that  calculation was  already  done  for  me. How  do  I  then  report  it  somewhere  else? 
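To make that step concrete, launching the native Distribution platform with a test of the mean is roughly a one-call affair in JSL. This is a sketch, not the team's actual script: the column is hard-coded as Thickness (the real script substitutes the user's chosen column), and the platform is wrapped in a V List Box so it can be dropped into the final report later:

distBox = V List Box(
	dbox = dtCopy << Distribution(
		Continuous Distribution(
			Column( :Thickness ),      // hard-coded for this sketch
			Test Mean( targetVal )     // hypothesized mean from the secondary input window
		)
	)
);

Pulling individual numbers back out of dbox is the next question.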
It's going to be from this Show Properties functionality. For us, we'll see that we're interested in the P value of that statistical test, the mean of our data set, the standard deviation of our data set, and the lower and upper confidence limits. We'll see that we then insert all of those values into those container variables inside of here. Let's take a look at our distribution platform and see how we can use this Show Properties function. We're on our distribution platform. Say I go to the summary statistics and I want to pull out this mean value. How do I know the code to pull that out and interact with it? We're going to right-click, and we're going to go to Show Properties. Once you're in Show Properties, you can click on this box path right here. This box path is now the exact code that you can use to reference any of the numbers inside of this blue highlighted box. You'll see that these are the same items that are shown over here: the mean value, the standard deviation value, and the lower and upper confidence limits. You can also sort of see, on the bottom right here, that it says Get(1). The mean is the first value in that box, so we would append Get(1) to this box path to return the mean. You'll notice that this says "report platform" here, while if we look back over here, our code says Report(dbox). Now why do we say dbox there? Well, dbox is the specific name that we gave our distribution platform. Right? We're saying: refer to this platform that we just created, pull out those specific values, and store them into these container variables. That's exactly what's happening in all of this segment of code. Let's go ahead and flip over to our JSL custom script and run this next portion. Actually, sorry, I need to close out of my distribution platform first; otherwise things may get confused with a couple of distribution platforms all contending at the same time. We're going to run all of this section, and we're also going to get up to here, where we pull out those summary statistics. We hit Run and we see that everything went through just fine. That's how we actually interact with those heavy-lifter native JMP platforms. Again, rely upon the stuff that's already built and that you can already trust, and then build further from there. The next thing that we're going to show is, well, how do we create that summary table? I just showed you how I can pull out the mean value, the standard deviation value, and these confidence limits by leveraging that distribution platform over here on the right. This is just how we can create this summary table. What you see on the left is exactly what's generated by this code on the right. You can see that we reference these targets, the means, and all of these other containers that we already initialized previously. Just to remind us where these values came from, it looks something like this: we're pulling these different values out of the distribution platform.
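The box path that Show Properties hands you can be used directly in JSL. A sketch of what filling those containers might look like, with the caveat that the column and row indices assume the default Summary Statistics layout (Mean, Std Dev, Std Err Mean, upper and lower 95% limits, N) and should be confirmed with Show Properties on your own report:

rpt = dbox << Report;                 // same object you get from Report( dbox )
sumStats = rpt[Outline Box( "Summary Statistics" )][Number Col Box( 1 )];
meanVal = sumStats << Get( 1 );
stdDev  = sumStats << Get( 2 );
upperCI = sumStats << Get( 4 );
lowerCI = sumStats << Get( 5 );
// The t-test p-value can be pulled the same way from the Test Mean outline;
// right-click that number, choose Show Properties, and copy its box path (call it pValue).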
You'll notice again that we already have this hypothesized mean of our target of one. That's, of course, coming from that secondary input window. We're going to go ahead and drop that target of one in there. The other important thing on this summary table, like I said, is this nice custom decision-making: we can put whatever logic we want inside of here. It's kind of silly for this one-sample analysis example, but overall it really is one of those things where you, as a company and as a profession, get to apply your own expert opinion about how decisions should be made. You can look at the statistics and say, "No, actually, this is how we would like to make decisions," and put that right up front so that it's immediately clear to anybody who opens up this report how we analyzed the data and what decision we came to. Let's go ahead and just run that code. We're not going to be able to see this portion on its own like we did with the distribution box; we're only going to see the summary table when we do that final output report. But you'll notice here's that custom decision-making right here. To decide whether the mean is equal or not equal to the target, we're going to look at the P value. Of course, that's kind of trivial here, but the point stands for more complex reports. We'll go ahead and run that portion of the script and we'll see. Let's pull up our log. Yep, it seems like no error is coming out of the log inside of there. Let's flip back over here. The last component: we've talked about native platforms, we've talked about summary tables, now we need to talk about visualization. Again, the reason why we have visualizations is that they're immediate and transparent data quality checks. It's something that anybody can look at and immediately draw some value out of. The way that I've found the most value out of these visualizations in these custom reports is not necessarily for me; it's really for the other reviewers, who are the module experts. The process experts can quickly look at data and they'll say, "Yeah, that looks weird; that's not how that process behaves." Or they'll look at it and they'll say, "Yeah, that makes sense; that's how that process behaves." These visualizations give a lot more than just the pure statistics, especially when you're talking to somebody who's not a statistics professional or statistics expert. Again, visualizations are great. They allow for proper checking for data corruption as well as analysis corruption. If you see something weird in your visualization, you should not trust the analysis that's associated with it. On the slide right now is just an example of how we can turn our data table into a nice, refined visualization over here. We've even added that target line inside of there that the user defined for us. The next slide is a word of caution about how we use these scripts. These visualizations, again, should highlight these data concerns, but the user needs to know how to use them.
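Backing up to the summary table for a moment, here is one way such a bottom-line-up-front table could be built. It is a sketch: the decision rule is a placeholder stand-in for an embedded best-known method, pValue is assumed to have been pulled from the Test Mean outline as described earlier, and Eval List is used so the container variables are evaluated inside the list literals:

// Custom decision rule baked into the report (placeholder logic for this example)
decision = If( pValue < alpha, "Mean differs from target", "Mean consistent with target" );
summaryTable = Table Box(
	String Col Box( "Parameter", {"Thickness"} ),
	Number Col Box( "Hypothesized Mean", Eval List( {targetVal} ) ),
	Number Col Box( "Estimated Mean", Eval List( {meanVal} ) ),
	Number Col Box( "Std Dev", Eval List( {stdDev} ) ),
	Number Col Box( "p-Value", Eval List( {pValue} ) ),
	String Col Box( "Decision", Eval List( {decision} ) )
);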
I  said  that  this  order  number is  what  determines the   x-axis  on  our  visualization. If  the  user  enters  data  table  entry because  they  say,  " That's   the  order, that's  what  it  is  in  the  table." They'll  get something  that  looks  like  this . This  is  what  we've  been  looking  at together  so  far  through  this  presentation. But  again,  I  told  you  it's a  more  accurate  representation. Is  this  measurement  number? Something  happened  to  this  data  table to  get  it  sorted  in  a  different  order. If  we  plot  this  visual- based of  off of  measurement  number, we're  going to  get  something  that  looks  like  this. Everybody  here  should  notice this  immediately  as  a  red  flag that  something  is  wrong. We  should  never  have  data that's  trending  in  this  manner. Either  there  was  something  wrong with  the  process  or  there  was  something wrong  with  how  we  were  measuring the  data  with  our  metrology  then. But  we  shouldn't  really  be  trusting the  results  of  this  analysis. When  we  see  a  visual  like  this. We  need  to  go  and  recollect  the  data, figure  out  what  went  wrong  there. Again,  just  a  word  of  caution that  if  you  want  to  use  this, you  need  to  teach  your  engineers the  right  way  to  use  it  as  well. Just  for  us  to  say,  hey,  "How  do  we  create these  nice,  beautiful  visualizations?" We  like  to  use  the   Graph Builder  platform. It's  a  wonderful  platform  that  JMP  offers. It's  super  intuitive  and  easy  to  use. You  can  make  a  beautiful, wonderful  display  here  and  you  say, "Yes,  this  is  exactly how  I  want  to  display  my  data." Then  you  can  use  this  platform to  automatically  generate  your JSL  code by  clicking  on  the  little red  triangle  up  here. Going  save  script  to  script  window. You'll  get  out  a  set  of  code that  looks  something  like  this. The  one  word  of  caution is  that  of  course,  these  variables are  going  to  be  hard coded  inside  of  here, so  you're  just  going  to  have  to  update that  so  that  it  interacts  nicely with  your  user  input  so  that  it  adapts, to  whatever  your  user inputted  into  that GUI  there. But  these  are  all  of  the  elements then  that  go  into  the  final  report and  this  is  what  that  final report  looks  like. Again,  pretty  straightforward. We  just  say  create  a  new  output  window. We're  actually  going  to  make this  a  tab  box  here. We  only  have  one  tab  called  Report, but  in  our  more  complex  reports, we'll  actually  have sometimes  up  to  like  10  or  12 different  tabs  inside  if  you're  all with  different  information. But  we  have  this  summary  table… Again,  we  already  created that  summary  table. Let's  put  it  there. We  have  this  nice  graphical  plot that  we'll  put  over  here  and  then  we  have that  nice  distribution  platform and  we'll  put  that  inside  of  there. We  have  the  overall  takeaways right  up  top, and  then  we  have  all of  the  supporting  evidence  underneath  it. Let's  flip  over  to  JMP  and  I  know I'm  just  slightly  over  time  here, so  we'll  finish  up  quick. Back  over  to  JMP  and  we  will  run that  final  portion  of  the  code  here. We're  going  to  run both  the   Graph Builder. Let's  build  that  graph and  send  it  over  to  the  report, and  then  let's generate  the  final  report  here. 
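For reference, a skeleton of those two final pieces might look like this, with hard-coded column names standing in for the user's selections, exactly the kind of thing the saved Graph Builder script leaves for you to parameterize:

// Trend plot: skeleton saved from Graph Builder, then edited
gb = Graph Builder(
	Size( 520, 320 ),
	Show Control Panel( 0 ),
	Variables( X( :Measurement Number ), Y( :Thickness ) ),
	Elements( Points( X, Y, Legend( 1 ) ) )
);
// A reference line at the target can be added through the Y axis properties;
// confirm the axis box index with Show Properties before scripting it.

// Final report: decision table up top, supporting visuals underneath, in one tab
New Window( "One-Sample Analysis Report",
	Tab Box( "Report",
		V List Box(
			Outline Box( "Summary", summaryTable ),
			Outline Box( "Trend Plot", gb << Report ),
			Outline Box( "Distribution", distBox )
		)
	)
);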
We  go  ahead  and  hit  Run and  there's  that  platform. The  nice  thing about  these  custom  analysis  scripts and  again,  it's  just  a  nice  thing about  JMP,  in  general, is  that  all  of  these  reports are  going  to  be  interactive . Even  though  this  was  a  custom  report, this  platform  is  still  connected to  this  platform  down  here, These  platforms  are  still  going to  be  connected  to  this  overall. Remember  that  we're  working with  the  copy  data  table  now . Nothing  gets  corrupted,  but  it's  still going  to  be  connected  over  here . You  can  select  different  points inside  of  here  and  figure  out, well,  what  measurements  are  those responding  to  inside  of  there? That's  overall, these  custom  analysis  reports, what  they  look  like, how  we  can  make  them. Again,  it's  just  a  simple  case, but  let's  move  forward  here, into  some  overall conclusions  and  insights. Final  takeaways. At  Intel,  these  scripts have  really  become  a  critical component  of  our data- driven  decision- making . It  makes  things  so  efficient  and  so  fast and  so  repeatable  and  standardized that  it's  wonderful. Again,  these  are  all sort  of  the  same  ideas. The  only  thing  to  add is  that  it  does  also  allow  you  to  embed that  custom  decision- making for  your  company's  specific  best known  methods  for  your  specific  processes and  analysis  that  you're going  to  be  doing. A  quick  note  to  make  again, it's  the  caveat  that  we said  about  that  graphing  here. We  do  need  to  know  that  there's  going to  be  some  teaching  resources  that  we  need to  invest  into  this. When  we  proliferate  these  scripts, we  can't  just  give  them  to  the  engineers and  say,   "Go  do  some  analysis." We  need  to  tell  them, "This  is  how  we  intended  to  do  analysis with  these  specific  scripts. In  that  same  vein  of  thinking  here, we  do  have  this  custom decision- making  infrastructure. It's  going  to  require  maintenance. There's  going  to  be  bugs,  there's  going to  be  corner  cases  that  you  didn't  know. Prince  and  I  have  run  into  plenty of  these  cases  where  an  engineer comes  to  us  and  says, "This  isn't  working," and  we  say, " That's  weird.  Let's  look  at  this ." We  have  to  spend  some  time  debugging inside  of  there, especially,  when  your  company wants  to   step  to  a  newer  version  of  JMP. Here  at  Intel,  we  just  stepped to  the  newest  version  of  JMP. Sixteen  or  17,  one  of  those, and  we  had  to  go  back through  all  150  of  those  scripts, and  make  sure  that  they  were still  compatible  with  the  new  thing . Again,  there's  a  lot  of  infrastructure maintenance  that  you  should  be  aware  of, that  that's  it's  going  to  come  into  play . Especially  when  you  really  start to  proliferate  this  and  make this  a  large  repository. Again,  We  should  also  be  treating this  as  a  living  infrastructure,  though . It  changes  and  that's  a  good  thing. That's  why  we  have  the  power as  the  custom  analysis  script  owner, that  we  can  change  things  inside  of  there and  we  can  do  it  immediately  and  quickly and  we  can  be  really  agile  about  that. Users,  they  might  be  hesitant  initially. They're  going  to  learn  to  love  this, they're  going  to  really  adopt  it, and  they're  going  to  start  to  do some  strange  things  with  these  scripts. 
They  say,  "Hey,  I  love  this  analysis. What  if  I  did  this?" They're  going  to  start  using them  in  new,  nonstandard  ways? You  shouldn't  get  mad  at  them, these  are  actually  opportunities . If  an  engineer  is  using  the  script in  a  nonstandard  way, that  means  that  there's  some  functionality gap  that  they  wish  they  could  have that  would  make  their  job  easier . We  should  take  that  input, and  we  can  revamp  our  scripts, we  can  change the  functionality  inside  of  theirs, and  we  can  roll  all  of  those  inputs from  the  engineers, into  these  custom  scripts  immediately, and  we  can  start  providing more  value  to  our  engineers. Okay,  so  I'm  going  to  end  it  here. I  know  I'm  a  little  bit  over  time. Kirsten  Sorry  about  that. I'll  say  thank  you  here. Here's  mine  and  Prince's  emails. Feel  free  to  reach  out  to  us if  you  have  any  questions or  you  want  to  ask  anything. Thank  you.
Troubleshooting faults in industrial equipment can be a daunting task, particularly when compounded by the time constraints of a high-volume production line or plant. Analysis is further complicated in equipment with multiple operating modes, where issues may be visible in one mode but masked in another due to differences in operating parameters. The data generated by SCADA (Supervisory Control and Data Acquisition) or networked IoT equipment and sensors in a modern plant can provide a rich source of information for both fault troubleshooting and monitoring change assessment. However, the sheer volume of data for even a single piece of equipment requires complex analysis to be made intelligible for the engineer or technician.   This presentation elucidates a selection of examples using JMP for exploration and visualization of abatement equipment operating parameters for fault identification and engineering change assessment. Basic data visualization was used to identify physical phenomena happening within the abatement system to assess the root cause for fault troubleshooting. More advanced analyses, including clustering and variability charts, were used to identify issues related to specific operating modes and to assess the impact of engineering changes.         Good  morning,  everybody. My  name  is  Chris  Jackson. I  am  an  Applications  Engineer for  Centrotherm. We  design  and  build  point-of-use   gas  abatement  systems for  use  in  the  semiconductor   and  other  industries. Today,  I  have  the  opportunity   to  give  a  short  presentation on  how  we  found  a  space  for  JMP   in  our  part  of  the  industry and  how  it  helps  us  both  in  troubleshooting for  industrial  applications as  well  as  for  assessment and  justification  of  continuous  improvement  initiatives, engineering  changes,  things  like  that. A  little  bit  of  background   just  to  get  everyone  on  the  same  page, I  want  to  say  a  couple  of  words  about what   point-of-use  abatement  systems  are. I've  got  a  little  cutaway   of  one  of  our  tools  here  on  the  side. The  short  version  is  this: you've  got  a  manufacturing  tool   up  on  the  factory  floor doing  whatever  it's  doing in  the  semiconductor  manufacturing  process that  produces  harmful  gasses   as  a  byproduct, greenhouse  gasses, toxic  gasses,  flammable  gasses. Generally,  things  you  don't  want   to  go  in  the  atmosphere. Then  our  tools   take  those  waste  gasses  in, they  destroy  them   through  thermal  energy, they  wash  them  out, and  you  release  clean  air   to  the  factory  exhaust. Because  these  tools   are  safety  and  environme nt-critical, a  fault  in  one  of  them means  that  your  production  line   is  at  least  in  part  shut  down. If  you  can't  treat  your  byproducts, then  you  can't  run. In  a  high- volume   manufacturing  environment, as  so  many  semiconductor  FABs  are, even  small  delays  are  incredibly  costly. We  as  suppliers  and  servicers, have  to  have  a  means   to  quickly  identify  problems and  bring  the  tools  back  online. Historically,   troubleshooting  usually  means opening  the  tool, looking  visually   to  identify  failing  components often  after  some  period   of  root  cause  analysis. But  with  a  modern  FAB  environment and  the  data  generated   by  SCADA  or  IoT  systems, we  have  mountains  of  data  available to  investigate  faults before  we  ever  touch  the  equipment. 
That  gives  us  a  way  to  guide troubleshooting  in  the  field, and  in  some  cases  for  intermittent  faults, it  even  lets  the  factory  keep  running while  we  investigate  digitally rather  than  physically minimizing  the  time  lost to  troubleshooting  and  investigation. The  problem  with  this  mountain  of  data is  a  scale  issue. The  higher  the  resolution  of  your  data, the  better  look  you  can  get   at  what's  happening  instantaneously in  any  of  these  pieces  of  equipment. That  higher  resolution  however, comes  with  an  overhead. You  need  more  and  more  computing  resources to  effectively  analyze  it, and  that's  where  JMP  comes  in  for  us with  the  capacity  to  handle   very  large  data  sets, and  it  becomes  a  tool   for  visualization  and  exploration that  can  really  drastically  improve troubleshooting. It  lets  an  engineer  or  a  technician quickly  explore   and  visualize  important  parameters within  your  data  sets, and  these  data  sets  are  at  a  scale sometimes  that  are  just  unmanageable for  a  lot  of  other  visualization  tools. With   that, I  want  to  jump  right  into   the  first  example  case  we  have  here, and  we're  going  to  identify an  intermittent  single- component  failure just  through  data  visualization. No  statistics,  no  modeling, just  the  ability   to  sift  through  and  visualize  the  data. Here  we've  got  a  chart   showing  ionization  current  versus  time. It's  one  of  a  number  of  parameters, ionization  current, that  we  use   as  a  health  monitor  for  the  equipment. This  tool  was  having  issues in  which  it  would  run  for  a  couple  of  days and  then  seemingly  randomly fail  and  shut  down. For  context,  this  current  should  be a  flat  horizontal  line  at  25.5, so  it's  pretty  clear  from  the  outset that  we  have  a  problem. It's  also  pretty  clear  what  I  was  talking  about regarding  data  set  size. This  data  set  right  here   is  almost  six  and  a  half  million  rows. Six  and  a  half  million  rows  with, when  you  pull  in   all  of  the  tool  parameters, 500  columns. The  file  for  this  data  set   is  about  20  gigabytes  in  size, absolutely  massive  amounts  of  data. Before  we  even  do   any  statistical  analysis,  like  I  said, we  can  start  to  do   some  problem- solving  off  of  this  data  set just  with  visualization. Initially,  it  doesn't  really  look  like there's  any  clear  shape  to  this  data. We  know  something's  wrong, but  we  don't  know  what. But  when  we  zoom  in, all  of  a  sudden   we  start  to  see  some  structure. This  looks  pretty  periodic  to  me. We  zoom  in  a  little  bit  more and  we  see  that  it  is  in  fact very  periodic. Each  one  of  these  little  spikes  down, disregarding  magnitude, is  timed  five  minutes   almost  exactly  from  each  other. That  immediately  begs  the  question  then, do  we  have  some  component, a  valve,  a  flow  controller,  a  motor, something  that  actuates   every  five  minutes? We  identify  that  component. Now  we  have  a  really  likely troubleshooting  culprit. The  troubleshooting  plan  changes from  open  the  tool  and  investigate, which  could  take  a  couple  of  hours, to  open  the  tool   and  change  this  one  targeted  component. 
We  just  shrunk  the  actual  time   that  we  need  to  be  in  the  equipment from  a  couple  of  hours   looking  at  everything to  see  what  might  be  failing to  a  single  hour,  get  in  there,   change  this  part,  get  back  out. In  this  particular  case, that  was  the  failing  component, we  were  able  to  identify  it. Problem  identified,  plan  made without  ever  having  to  open  the  equipment. We  were  able  to  get  there   with  just  the  conclusions that  we  were  able  to  draw   from  visualization. Of  course,   JMP  is  not  just  a  tool  for  visualization. It  also  has  at  its  core  a  very  robust suite  of  statistical  analysis  platforms. If  we  start  to  apply  those  to  the  data, we  can  get   even  more  exciting   and  interesting  results. I'll  just  jump  right  into   the  second  case  here. In  this  case, we're  looking  at  a  specific  tool, which  is  working  fine  most  of  the  time, but  it  does  have  occasional problems  with  buildup, sometimes  we  got  to  draw  our PM  in a  little  earlier  than  we  would  like. We  want  to  take  a  look   at  our  health  parameters and  see  if  there's  any  abnormalities, any  optimizations  we  can  make. The  approach  that  I  use  here is  applicable  for,  really,   any  industrial  application that  has  defined  operating  modes. Because  we  can  draw  those  modes   out  of  the  data  very  easily using  clustering. In  this  case,  our  abatement  has, or  this  specific  abatement, has  three   pretty  well- defined  operating  modes based  off  of  these  two  input  gasses. I  use  K Means  clustering. You  could  use  whichever   version  of  clustering  you  prefer. But  I  run  that  over  the  data   to  sort a ll  of  our  rows,  all  of  our  points into  these  three  operating  modes. If  you  have   more  than  three  operating  modes, obviously,  you  can  use  more  clusters. But  it  also  gets  interesting, what  if  you  don't  know how  many  modes  you  have? Maybe  they're  customer-defined, or  maybe  there's  a  suspicion  that, "Hey,  could  there  be   some  interstitial  mode  here?" Maybe  the  transition  state between  two  of  these  operating  modes. If  you  want  to  investigate  that  way, you  can  use  iterative  clustering. I  did  that  down  here. You  just  run  from,  I  used  3- 10  clusters, and  the  software  will  identify   what  the  optimal  number  of  clusters  is. Looking  at  this,   it  is  correctly  identified. It  gives  us   these  cubic  clustering  coefficients, identifies  the  optimal  one, that,  yes,  as  suspected,   three  is  the  optimal  number  of  clusters to  sort  this  data  into. I'm  not  really  worried   about  these  state  transitions. I'm  really  more  focused on  the  states  themselves. We  take  that  data,   we  get  a  readout  of  it, and  we  throw  it  up  onto  this  3D  scatter  plot. We  take  some  of  our   tool  health  parameters, and  we  color  everything by  what  cluster  they're  in. Immediately,   we  start  to  see  some  interesting  results. We  talked  about  ionization  current   should  be  solid  at  25.5, and  we  see   that  we  have  some  variability  here. It's  dropping  below  that. Immediately   we  know  that  we  have  a  problem. But  what's  more  interesting is  that  every  single  one  of  those  points is  grouped  into  a  single  cluster, cluster  two,   which  corresponds  to  this lowest  input  gas  one, highest  input  gas  two. 
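A rough sketch of this clustering-plus-coloring workflow in JSL. The column names are placeholders, and the platform option and message names (Number of Clusters, Save Clusters) are written from memory; saving a script from an interactive run of Analyze > Clustering is the reliable way to get the exact form:

// Sort rows into operating modes from the two input-gas signals
km = K Means Cluster(
	Y( :Input Gas 1, :Input Gas 2 ),
	Number of Clusters( 3 )          // or a range such as 3 to 10 for the iterative search
);
km << Save Clusters;                 // writes a Cluster column back to the data table

// Color rows by cluster, then look at health parameters in 3D
dt << Color by Column( :Cluster );
Scatterplot 3D( Y( :Ionization Current, :Input Gas 1, :Input Gas 2 ) );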
Now, from an engineering perspective, if I'm looking to make optimizations or I'm looking to improve tool health, I can immediately say, "Hey, this is the operating mode that we need to look at." That's what I need in order to start looking at concrete next steps for improvement. I'm not looking at the tool as a whole; I've already managed to focus my search to one operating mode. The last thing I want to talk about then, having looked at two of these use cases, is: what are the other advantages of JMP? Why JMP? My customers are collecting all this data. They have ways to view it. They have SCADA systems and monitoring systems in place. They have ways to parse it. So why do I, as a supplier and servicer, need this platform to view and parse the data? The answer for me, at least in my case, is the cross-platform compatibility. If I'm reliant on my customer to chart and generate data views for me, I'm now taking up their time and their resources to troubleshoot a problem that I'm responsible for fixing. With JMP, as long as they can give me the raw data, I can do all of it myself. Not only is that freeing up their resources, it gives me the ability to do my own investigation independent of whatever system they're using for data analysis. It doesn't matter if they're using proprietary monitoring system A or B or C, or if they're using their own IoT monitoring system from their control engineers. It doesn't even matter if they have multiple data acquisition systems from different vendors. With JMP, I can import and manipulate whatever data they give me and perform these kinds of analyses, source-independent, and do the investigation that I need to do for my customer support with all the tools for visualization and statistical analysis that JMP provides. With that, it looks like we're pretty much at time here. I know this isn't necessarily the traditional use case for JMP from some of the folks that I've talked to, but I hope it was helpful for people. I'd just like to thank Adam Stover, our CTO, and Gordon Tendik, our Director of Apps and Technology, for helping me put all this together and reviewing the work that I've done. Thank you for your time.
Monday, October 16, 2023
While the question "How many (parts/subjects/runs) do I need?" is one nearly every statistician dreads, it is an important question and should be asked prior to running any study or experiment. The answer seems simple enough. Just plug some numbers into a calculator and off you go! In my experience, though, sample size calculations are rarely that easy.   JMP 16 introduced an entire suite of Sample Size Explorers, with more added in JMP 17. But why call them "explorers" and not "calculators"? Because sample size is more than a calculation. It is an integral part of a study design, and to determine a sample size, more than math is needed. This presentation explores sample size from the concept to the execution. While the examples include sample size explorations for medical device or diagnostics studies, the lessons learned are applicable across industries.     What  we're  going  to  talk  about  today  is a  simple  introduction to  sample  size  thinking. Then  we'll  look  at  two  examples;  one, comparing  the  mean  of  two  populations, and  the  second,  looking  at  a  study with  a  proportion  endpoint, and  we'll  wrap  up with  some  additional  thoughts. A  question  I'm  often  asked  is, what  sample  size  do  I  need? One  might  think,  "Oh,  that's  easy. Just  use  a  sample  size  calculator." But  wait  a  second. Why  does  JMP  call sample  size  calculators  explorers? Why  are  they  in  the  DOE  menu? Which  one  do  I  use? Well,  let's  talk  about some  sample  size  basics. A  sample  size  is  calculated prior  to  running  a  study. A  study  is  an  experiment designed  ahead  of  time. That's  why  they're  in  the  DOE  menu. Sample  size  depends on  the  goal  of  a  study. I  often  call  this,  are  you  making  a  $5 decision  or  a  $50  million  decision? Are  you  looking  at  a  regulatory  clearance, a  publication,  an  R&D  question, or  a  simple  exploration? What's  the  primary  endpoint  of  your  study? What  are  you  trying  to  show? How  is  your  study  design? What  are  your  outcome  assumptions? These  might  be  based  on  prior  knowledge, a  pilot  data,  or  often, or  just  simply  guessing. Sample  size  is  a  risk-benefit  exploration. That's  why  they're  called sample  size  explorers. You  want  to  explore how  different  assumptions are  going  to  impact  your  sample  size. Now,  more  is  generally  better, but  as  we  all  know,  more  costs  more, and  more  might  not  be  possible. Let's  start  with  a  simple  example of  sizing  the  study for  comparing  two  means. We'll  at  the  Fit  Y  by  X  platform, and  we'll  look  at  the  Power  Explorer for  two  independent  sample  means. This  sample  size  example  is  based  on a  real  situation  where a  company  is  in  the  R&D  phase. They're  doing  a  sample  collection  study. That  could  be  blood,  nasal  swabs,  saliva. There's  no  primary  endpoint because  it's  an  R&D  study. They're  still  in   the  R&D  phase, but  they  need  a  sample,  a  power  analysis. They  were  asked  for  power  analysis by  the  entity  that  is  considering funding  the  project. How  can  we  provide  a  power analysis  without  a  primary  input? 
Well,  best  thought  here  is  one, we  could  say, "Hey,  we  can't  do  a  power  analysis," or,  knowing  that the  funding  entity  wants  a  power  analysis to  show  that  we've  thought  about  the  study and  we've  thought  about  how  many  people were  asking  them  to  enroll, we  could  generate  a  research  endpoint. In  that  case,  we're  going  to  ask, "Can  I  distinguish  the  difference  in  means between  my  sick  and  healthy  subjects for  some  primary  biological  markers?" We'll  use  the  sample  size from  the  power  analysis and  the  expected  prevalence of  illness  to  justify  the  number of  subjects,  we're  requesting to  enroll  in  the  study. I  need  to  understand  test  for  comparing to  independent  means, and  I  need  a  calculator  for  the  power of  a  test  to  compare  to  independent  means. What  I  like  to  ask  myself  is if  I  had  data,  what  would  I  do? If  I  understand  what  analysis I'm  going  to  do, that's  going  to  help  me  determine what  sample  size  I  need. Sometimes  you'll  have  pilot  data, and  sometimes  you  can  just  make  up  data to  help  you  figure  out what  analysis  are  you  going  to  do and  what  sample  sizing  should  you  do. Let's  take  a  look  at  this. I'm  going  to  open  a  data  table, and  this  is  just  generated  data. I've  got  a  sick…  I  have  15  sick patients  and  15  healthy  patients. I'm  going  to  do  a  Fit  Y  by  X. I'll  do  a  couple  of  things  here: our  range,  I'm  going  to  jitter  my  points, I'm  going  to  run  a  T-test, and  I  like  to  look  at  the  densities. Here's  two  examples of  what  some  data  might  look  like. On  the  left is  a  fairly  separated  populations of  outcomes,  the  biomarker  number  one. The  difference  is  about  2, 2.5. These  were  generated from  a  normal  distribution as  were  the  ones  on  the  right-hand  side. Here,  the  difference  is  a  little  less. You  can  see  in  both  places, we  would  conclude  that  there's  a difference  between  these  two  populations. The  one  on  the  right  being  closer  together is  harder  to  differentiate than  the  one  on  the  left. We  used  a  T-test  for  that. Now  the  question  becomes, how  many  samples  would  I  need if  I'm  going  to  run  this  experiment? Again,  let's  look  at  that. Let  me  just  step through  my  Workflow  Builder so  it  closes  down  our  data  tables. DOE  Sample  Size  Explorer p ower. I  want  power  for  two independent  sample  means. I  pull  that  up,  you'll  see  that  there's quite  a  few  things  to  look  at. First,  we  have  the  test  type. It's  going  to  be  two-sided. Our  Alpha  is  0.05, and  the  group  population  standard deviations  are  not  assumed  to  be  known. We're  guessing  at  those. To  calculate  my  sample  size, I  need  to  fill  in  this  information. This  is  my  calculator  part. I  have  two  groups. I'm  going  to  start over  here  on  the  right-hand  side. I  have  two  group  standard  deviations to  put  in  estimates  for. I'm  going  to  assume  that  one  group is  less  variable  than  the  other  group. Next,  I  need  to  fill in  the  difference  to  detect. Here,  I'm  using  standard  deviation  units, and  I'm  going  to  say  I  want  to  detect a  one  standard  deviation  unit  difference. Next,  I've  got  right  now  sample  size of  30  in  each  group that  gives  me  a  very  high  power. 
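As a parenthetical reference before the power numbers are adjusted: scripted, the Fit Y by X step from a moment ago is a one-call launch of the Oneway platform. The column names are placeholders for the generated table:

// Two-group comparison of a biomarker with an unpooled t test
Oneway(
	Y( :Biomarker 1 ),
	X( :Group ),            // sick vs. healthy
	t Test( 1 )
);
// Jittered points and the density comparison shown in the demo
// can be turned on from the platform's red triangle menu.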
I'm  going  to  lower  this  power  to  90, and  I  see  that  for  a  power  of  90 to  detect  a  difference  of  one between  these  two  groups, I  need  a  sample  size of  15  subjects  in  each  group. That  seems  reasonable. Now,  you  can  look at  these  graphics  to  see  that how  your  guesses,  your  assumptions might  impact  the  power  of  your  study. We  can  see  that  the  standard  deviations have  quite  a  bit  of  impact. As  my  standard  deviation  increases, so  my  data  becomes  more  spread  out, my  power  decreases. It's  going  to  be  harder to  detect  this  difference. You  can  see  we're at  a  sweet  spot  in  the  sample  size. As  I  increase  the  sample  size, my  power  is  going  to  increase, but  not  terribly  greatly. As  I  decrease,  if  I  went  down  to  about  10, my  power  is  going  to  go down  to  about  eight. But  let's  go  back  to  the  point  now. I  want  about  15  samples  per  group. In  this  instance, to  get  15  positive  samples from  a  study  where  I'm  enrolling  people, and  if  I  have  a  10 %  prevalence  rate of  sickness  over  the  study  period, I  would  need  about  150  subjects. If  the  prevalence  was  low  or  say,  only 5%,  then  I  would  need  300  subjects. Again,  sample  size is  a  risk  benefit  calculation, so  we  want  to  consider various  sample  sizes. All  right,  now  to  our  second  example. This  is  sizing  a  study with  a  proportion  endpoint. We'll  use  the  distribution  platform and  we'll  use  the  Interval  Explorer for  one  sample  proportion. This  is  based  on  the  question of  how  many  samples  do  I  need to  demonstrate  sensitivity and  specificity  for  regulatory  filing? I  do  a  lot  of  work  in  diagnostics. In  diagnostics,  sensitivity  is  simply the  proportion  of  positive  cases that  your  test  calls  positive, and  the  specificity  is  the  proportion of  negative  cases that  your  test  calls  negative. We  generally  calculate  sample  size for  each  of  these  metrics  individually, and  then  we  add  for  the  total  sample  size for  a  retrospective  study where  I've  already  got  samples, perhaps  in  a  freezer  or  from  a  partner, and  I'm  just  going  to  pull out  the  ones  that  I  need. For  a  prospective  study, again,  we  would  use  the  prevalence to  calculate  the  total  number of  subjects  to  enroll, similarly  as  we  did  in  the  last  example. Again, we  need  some  preliminary  information. The  goal  of  this  study is  a  regulatory  filing, so  a  high  level of  evidence  is  needed. Then  this  particular  industry  sector, I  need  to  demonstrate that  the  lower  confidence  limit for  sensitivity  and  specificity is  greater  than  80 %. The  study  design is  a  retrospective  study. It's  a  review  of  CT  scan . The  assumptions  are  that the  sensitivity  of  identifying the  outcome  is  0.9 and  specificity  is  0.85. I  need  to  understand the  confidence  interval  as  an  outcome, and  I  need  a  calculator  for  confidence interval  for  proportion. Again,  the  question, if  I  had  data,  what  would  I  do? Let's  look  at  that. Again,  I  generated  some  data. I  have  a  reference  standard where  I  had  about  145  negative  samples and  144  positive  cases  or  samples. Then  I  have  the  test  results, positive  and  negative. You  can  see  they're  not  perfect. 
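As a cross-check of the kind of arithmetic the Power Explorer does for the two-means example above, here is a minimal sketch in Python, outside JMP, assuming equal group standard deviations and a difference expressed in standard-deviation units. Because the demo used unequal standard-deviation guesses, the numbers below differ from the explorer's 15 per group; the point is only to show the same explore-the-assumptions pattern.

```python
# Minimal sketch, not the JMP explorer: power for a two-sample t-test under
# the simplifying assumption of equal group standard deviations.
from statsmodels.stats.power import TTestIndPower

explorer = TTestIndPower()
d, alpha = 1.0, 0.05          # difference to detect, in standard-deviation units

# Per-group n for 90% power (about 22 under these equal-SD assumptions).
n_per_group = explorer.solve_power(effect_size=d, alpha=alpha, power=0.90,
                                   ratio=1.0, alternative='two-sided')
print(round(n_per_group, 1))

# Risk-benefit exploration: how power moves as the per-group n changes.
for n in (10, 15, 22, 30):
    power = explorer.solve_power(effect_size=d, alpha=alpha, nobs1=n,
                                 ratio=1.0, alternative='two-sided')
    print(n, round(power, 2))
```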
Some  of  the  cases  that  the  test  calls negative  are  actually  positive, and  some  of  the  cases  that  the  test calls  positive  are  actually  negative. How  would  I  look  at  this? Well,  I  could  tabulate  it and  come  up  with  the  %  of  positive  cases that  the  test  calls  positive and  the  %  of  negative  cases that  the  test  calls  negative. But  I  want  confidence  intervals. I'm  going  to  use the  distribution  platform, and  I'm  going  to  look  at  the  proportion in  the  test  cases  by  the  reference  case. Again,  let's  add… We  want  to  add,  sorry, wrong  red  triangle  menu. We  want  to  add  confidence  intervals, and  I  held  down  my  Control key  to  broadcast  those. Now  I  can  look  and  see what's  going  on  here. For  the  cases that  by  the  reference  are  positive, the  new  method  calls  135 of  those  positive,  so  93.75, and  I  have  my  confidence interval  that  goes  from  88-96.6 You'll  see  this  note  here  that  says, computed  using  score  confidence  intervals. Then  the  thing  to  note  here is  that  a  score  confidence  interval is  not  symmetric. We  can  look  at  that. Here  I  generated  a  graphic, and  you  can  see  that when  we're  at  the  low  end, so  a  probability  of  0.1, you  can  see  that the  upper  confidence  limit  is  higher as  compared  to  the  point  estimate than  the  lower  confidence  interval. The  point  estimates  are  not  centered  in the  middle  of  these  confidence  intervals. That's  just  the  nature of  this  core  confidence  interval. The  question  now  is, how  many  samples  do  I  need to  show  that my  lower  confidence  limit  is  at  least  0.8, given  the  assumptions  of  here we  had  for  sensitivity, which  is  the  positive  side, that  we  were  going  to  be  greater  than  0.9, and  on  the  negative  side that  we  were  going to  be  greater  than  0.85. Here  we  can  see  that  at  0.85, my  lower  confidence  limit  is  only  0.78. I  would  need a  few  more  samples  in  order  to show  that  my  lower confidence  limit  is  greater  than  0.8. Again,  the  question  is  now that  I  understand  what  I'm  looking  for is  how  much  data  should  I  collect? Let's  go  to  DOE  Sample  Size  Explorer, confidence  intervals for  one  sample  proportion. Let's  put  in  this  example  here. Let's  put  in  our  proportion  of  0.9375, and  the  sample  size  that  we  had used  here,  which  was  144. I  had  left  the  interval  type  as  two-sided, confidence  intervals, confidence  level  is  95  %. With  the  sample  size  of  144, if  my  proportion  comes  out  to  be  93.75, my  margin  of  error  is  0.04. Okay, well,  what's  margin  of  error? Margin  of  error  is  the  half  width of  the  confidence  interval. If  it  was  a  symmetric  confidence  interval, it  would  be  your  plus  or  minus value  over  your  point  estimate. But  in  the  case of  a  score  confidence  interval, and  that's  what  this  calculator is  based  on, this  is  the  half  width of  your  confidence  interval. But  we  can  see  that… With  the  93.75, the  margin  of  error  of  0.04, it's  not  simply  a  minus  0.04 from  this  93.75 because  we  noticed  that when  we  did  this  calculation that  our  lower  confidence  limit  was  0.88. This  sample  size  is  more  than  sufficient for  what  we  needed. We  only  needed  a  confidence  limit  of  0.8. Let's  do  that  calculation. 
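As an aside, the "score confidence intervals" note can be checked outside JMP. The sketch below assumes the score interval reported is the Wilson score interval; with that assumption it reproduces the 135-out-of-144 numbers quoted above, including the asymmetry around the point estimate.

```python
# Hedged sketch: Wilson score interval for 135 positives out of 144 reference-positive cases.
from statsmodels.stats.proportion import proportion_confint

count, n = 135, 144
lower, upper = proportion_confint(count, n, alpha=0.05, method='wilson')
print(round(count / n, 4))                 # 0.9375 point estimate
print(round(lower, 3), round(upper, 3))    # roughly 0.885 to 0.967 -- not symmetric about 0.9375
```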
Let's  put  in  our  assumed  value  of  0.9, and  let's  put  in  a  margin of  error  of,  say,  0.08. We  know  that  0.1 is  going  to  underestimate  our  sample  size. If  we  do  this  and  we  say,  all  right, for  a  proportion  of  0.9, margin  of  error  is  0.08, our  sample  size,  it  says,  is  56. Okay,  well,  let's  double-check  that. To  do  that, I  constructed  a  calculator where  I  can  put  in  my  assumed  proportion and  I  can  put  in  this  value  of  56. If  I  run  this  distribution, and  what  I  did  here  is  I  have  an  outcome of  one  and  zero,  and  I  have  a  frequency. If  I  relaunch  this, I  use  the  outcome and  I  use  the  frequency  column to  give  me  the  distribution  as  if I  had  51's  and  six  zeros  in  my  data  file. Well,  what  does  that  look  like? With  a  sample  size  of  56, a  proportion  of  about  0.9, my  lower  confidence  limit, using  a  score  confidence  interval  is  0.78. This  sample  size  of  56  gives  me the  precision  that  I  asked  for, the  margin  of  error  of  0.08, but  it  doesn't  quite  give  me the  lower  limit  on  this  confidence interval  that  I  need  for  this  situation. Let's  put  in  a  slightly larger  sample  size. Let's  make  this  65. That  gives  me  a  margin  of  error  of  0.074, which  is  slightly  tighter  than  the  0.08, and  let's  see  what  that  looks  like in  my  score  confidence  interval. If  I  do  that,  now  I  see  that  my  lower confidence  limit  is  above  the  0.8. The  point  of  this  was  not  to… The  point  of  this  was  really  to  show  you that  it's  important  to  understand what  it  is  you're  trying  to  show, and  it's  important  to  understand what  is  it  that  your  sample  size calculator  is  providing  to  you. There  are  sample  size  calculators all  over  the  internet. Then  in  JMP,  we  have  a  whole  slew of  sample  size  calculators, explorers  to  look  at. It's  important  to  understand what  is  your  endpoint, what  are  you  trying  to  solve, and  what  is  it  that  your  calculator is  calculating  for  you. Once  you  do  that, then  you're  better  informed for  making  decisions  as  to  how many  samples  do  you  really  need. Let's  finish  up  with  just  a  few brief  comments  on  additional  topics. Other  ways  that  you  can get  at  sample  size. One  is  simulation. You  can  use  pilot  data to  define  distributions, use  random  number  generators  to  generate a  study  based  on  those  distributions. Then  you  can  analyze  that  data to  see  if  your  endpoint  is  met. Is  it  met? Yes  or  no? Then  you  can  repeat that  some  large  number  of  times and  calculate  the  portion of  times,  your  endpoint  is  met. In  a  sense,  your  power. How  likely  are  you  to  meet  your  endpoint given  your  assumptions? I  like  to  do  that. Simulation  is  useful. Again,  however,  it's  all  based on  your  assumptions. If  your  assumptions  are  wrong, your  sample  size  may  not  be  large  enough. Another  thing  that  often  happens is  that  we  have  to  make the  best  allocations  of  what  we  have. We  may  have  1,000  samples  in  the  freezer and  we  know  what  their  outcomes  are and  we  want  to  test  them  on  a  new  test or  we  want  to  develop  a  new  test. How  many  can  we  use  to  train  an  algorithm? How  many  do  we  need  to  use to  validate  that  algorithm? Sometimes  we  have  to  take the  sample  numbers  that  we  have. 
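The same kind of check can be run for the sample sizes explored above. The success counts below (50 of 56 and 59 of 65) are my own illustrative choices near a 0.9 observed proportion, not values from the demo, but they show the same pattern.

```python
# Sketch: does the lower Wilson (score) limit clear 0.8 at these sample sizes?
from statsmodels.stats.proportion import proportion_confint

for n, successes in [(56, 50), (65, 59)]:
    lower, upper = proportion_confint(successes, n, alpha=0.05, method='wilson')
    print(n, round(successes / n, 3), round(lower, 3))
# n = 56 leaves the lower limit near 0.78; n = 65 pushes it just above 0.8,
# consistent with the behavior described above.
```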
Use the sample size explorers to evaluate what you might be able to conclude, and then use those findings to decide whether what you have is sufficient to proceed with the experiments and the development of your test or product. That's what I have on sample size. It's more than a number. It's based on what it is you're trying to decide and how you're going to analyze the data once you get it. It's an exploration: you want to take into account how the assumptions you make impact those sample sizes, and hedge your bets for a great outcome. Thank you, and that's it.
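For readers who want to try the simulation approach mentioned above, here is a minimal sketch in Python rather than JSL. Every distributional assumption in it (normal outcomes, the means and standard deviations, a Welch t-test as the endpoint) is mine, chosen only for illustration.

```python
# Simulation-based power sketch: generate a study from assumed distributions,
# test the endpoint, repeat, and report the fraction of runs that meet it.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def simulated_power(n_per_group, mean_diff, sd_sick, sd_healthy,
                    alpha=0.05, n_sim=10_000):
    hits = 0
    for _ in range(n_sim):
        sick = rng.normal(mean_diff, sd_sick, n_per_group)
        healthy = rng.normal(0.0, sd_healthy, n_per_group)
        _, p = stats.ttest_ind(sick, healthy, equal_var=False)  # Welch t-test endpoint
        hits += p < alpha
    return hits / n_sim

print(simulated_power(n_per_group=15, mean_diff=1.0, sd_sick=0.8, sd_healthy=1.0))
```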
The most novel, innovative, and promising therapeutics in biopharmaceuticals are cell therapies. Cell therapies transfer human cells into a patient to treat disease. These cells either come directly from the patient or from a healthy (cell) donor. Multiple regulatory guidance documents recognize the importance of providing cell therapy manufacturers the flexibility to improve their processes. Therefore, it is imperative to show that the pre- and post-change processes are comparable and process changes pose no threat to the safety or efficacy of the drug product.   One method used to ensure comparability is an equivalence test of means. There is a regulatory expectation that the study is done as a paired design, often referred to as a split-apheresis study, unless there is minimal donor-to-donor variability. In split-apheresis studies, the same donor material is split and processed in the pre- and post-change process for comparison. The design of this study presents unique challenges in cell therapies as they require adequate sample sizes to ensure properly powered designs, yet the number of healthy donors available is usually quite low (three to six donors). Additionally, the power depends on lot-to-lot and assay variability, assay replication strategy, and the effect size used for the equivalence acceptance criterion (EAC).   This talk presents a series of JMP scripts that extend the existing capabilities of the Sample Size Explorer platform to address each of these relevant study nuances, as well as the capability to overlay power curves to address trade-offs with different sample sizes and approaches.     I  am  Heath  Rushing. Although  Andrew  Karl,  and  Jeff  Hofer , and  Rick  Burdick,  some  teammates  of  mine , did the  majority  of  technical  work  here, I'm  going to  be  the   one presenting  today . I'm  going to  talk  about  how JMP  and  JMP  scripts  can  be  used in  a  very  particular  specific application   in  cell  therapies. I'm  going to  talk  a  little  bit  about what   gene and cell therapies  are and  the  very  specific  instance that  I  want to  talk  about is  comparability. I'm  going to  focus  on  process  changes. Interestingly  enough,  last  year, I  gave  a  talk , and  it  focused on  cell  and  gene  therapies. They're  very  novel  therapeutics. The  first  one  was  approved in  the  United  States  in  2017. A  little  bit  different  than  most  of , what  I  call   the  small  molecule  and  the  large  molecule  therapeutics that  you may  have  heard  of  in  the  past. Let  me  just  touch  base on  what  is  a  cell  and  gene  therapy. First  thing  I'm  going to  do is  touch  base  on  what  a  gene  therapy  is. What  you're  essentially  doing is  you're  replacing a  gene  with  a  healthy  one, or  turning  off   bad  genes. A  lot  of  cancers are  caused  by  defective  genes. What  you're  doing   is  you're  inserting these  healthy  genes  back  into  a  patient of  either  in  vivo  or  in  vitro. An  in  vitro  would  be  more of  a  bone  marrow  transplant. Last  year,  I  talked  about , the  challenge  with  gene  therapies is  that  patient -to -patient  variability. I  focused  on  process  development. Then  I  talked  about  cell  therapies. In  the  cell  therapies , what  you're  doing is you're  replacing  disease  cells. You're  either  transferring some  sort  of  healthy  cell  into  a  patient, or  we're   replacing missing  cells  into  a  patient. Where  do  these  cells  come  from? 
They  either  come from  the  patient  themselves, so  you  would  have  to  deal with  that  patient -to -patient  variability, or   in  most  cases, they  come  from  a  healthy  donor. Now  you're  not  dealing with  this   patient-to-patient variability, but  you're  dealing with  donor -to -donor  variability. Whenever  I  say  donor, I'm  talking  about  a  healthy  donor. I  could  be  a  healthy  donor. Then  someone  else could  be  a  healthy  donor  also. In  both  of  those  cases is  you  have  to  deal with  that   patient-to-patient or  donor -to -donor  variability. What's  interesting  is  last  year , I  gave  the  example   in  process  development, and  it  looks  something  like  this. It  was   the  exact  same  data  set that  I  used  last  year  that  I  said, say  that  you  were  developing  a  process where  you  look  time,  temperature , and  pH, and  you're  measuring  their  effect on  cell  viability  and  by  product. In  that  case  is, I  cannot  use  one  donor  material, I  had  to  split  that  up into  four  different  donors. I  said,  "If  you  ran  these  experiments for  process  development, and  you  did  not  consider that  there  was  donor -to -donor  variability, this  is  what  you  would  see. What  you  would  say   is  we're  looking for   P-values  that  are  below   0.05, you  would  say  nothing affects  cell  viability and  nothing  affects  by  product. You  were  not  able  to  detect that  you  had  any  significant or  critical  process  parameters for  the  very  reason that  you  do  not  consider that  there  could  be  a  difference  in  donor. Right   now,  if  you  do  consider  those as  what  it's  called  a  fixed  donor  effect, the  only  thing  that  I  did is  I  brought  in  donor. Then  you  see  that  NAND. This  really sticks  out what  significantly  affects  cell  viability and  what  significantly  affects  by  product. The  whole  talk  was  on  how  does that   donor-to-donor  variability affect  statistical  inference and  also  process  capability. I'm  going to  focus on  that  statistical  inference. What  are  you  trying  to  do in  process  development is   you're  trying  to  determine  if  things  like pH, and  temperature,  and  time significantly  affect your  critical  equality  attributes. Say  that  I  was  a  drug  manufacturer , and  I  have  set  up a  process  development  study . I  send  this  process  development  study . I  want  to  determine if  temperature  affects, and  I'm  going to  call  it  cell  viability. I  say,  "Hey  whenever  I'm  looking  at  that , is  I  want to  make  sure  that  if  something significantly  affects my  quality  attributes, I  control  that  in  my  process. But  if  it  doesn't,  I  am  not  spending  money and  time  and  resources  controlling  it." What  I'm  concerned  with as  a  drug  manufacturer is the  Type I  error  rate. I  do  not  want  to  inflate a   Type I  error  rate. A   Type I  error  rate  would  say, "Hey , this  is  significant when,  in  fact,  it's  not." What  do  you  think  that  regulatory  agencies would  be  more  concerned  with? You  controlling  more  things? Are  you  not  controlling  things that  should  be  controlled? That  is  exactly  right is  they'd  be  more  concerned about  that  patient  risk, that   Type II  error. In  process  development is  drug  manufacturers do  not  want  to  inflate  the   Type I  error. They  also  want  significant  power.  Why? Because   that  controls  that  patient  risk. 
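To make the fixed-donor-effect point concrete, here is a hedged sketch with synthetic data rather than the presenter's data set: the same three-factor model is fit with and without a donor term, and the factor effect only stands out once donor is in the model.

```python
# Synthetic illustration of a fixed donor effect in a process-development model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 16
df = pd.DataFrame({
    "donor": np.repeat(["D1", "D2", "D3", "D4"], n // 4),
    "temp": rng.choice([-1, 1], n),
    "time": rng.choice([-1, 1], n),
    "pH":   rng.choice([-1, 1], n),
})
donor_shift = df["donor"].map({"D1": -5, "D2": 0, "D3": 3, "D4": 6})   # donor-to-donor spread
df["viability"] = 80 + 1.5 * df["temp"] + donor_shift + rng.normal(0, 0.5, n)

without_donor = smf.ols("viability ~ temp + time + pH", data=df).fit()
with_donor = smf.ols("viability ~ temp + time + pH + C(donor)", data=df).fit()
print(without_donor.pvalues[["temp", "time", "pH"]].round(3))  # temp is masked by donor spread
print(with_donor.pvalues[["temp", "time", "pH"]].round(3))     # temp stands out once donor is modeled
```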
The  whole  point of  me  showing  that  last  year was  to  show  the  effective donor-to-donor  variability on  trying  to  determine your  critical  process  parameters. I  call  it  statistical  inference. Right   now, what  happens   if  I  change  my  process? I  had  a  colleague  just  last  week , I  was  working  with  her. Whenever  we're  talking  about  cell and  gene  therapies,  she  said, and  this  is  her  quote, "Heath , at  cell  and  gene  therapies, things  are  constantly  changing. You  could  have  things like  analytical  methods  change. You  could  have  things like  process  change." Today,  I'm  going to  focu s on this  process  right  here. Mainly,  I'm  going to  focus on  that  process  change. I  do  want to  point  out that  regulatory  agencies  understand that  you  have  a  need for  improving  your  process. Even  if  you  improve  your  process, are  you  changing  your  process ? They  recognize  the  need  for  that, but  they  also  recognize  the  need that  the  therapeutics that  you're  making  from  that  process should  be  similar in  terms  of  product  quality. You're  using  these  in  clinical  trials. What  does  it  mean  to  be  similar? That  doesn't  say  that  they  have to  be  exactly  the  same, that  they  have  to  be  similar or  comparable. In  terms  of  me  saying that  something  is  similar, what  I  want to  do  is  I  want to  make  sure that  I  have  some   similarity  condition. That's  the  whole  point  of  comparability. For  very  low  risk  attributes, what  I  can  do  is  I  can  show  that  process  A and  process  B is  similar  in  side -by -side  plots . For  more  higher  risk  attributes, what  I  want to  do   is  maybe something  like  a  quality  range. In  terms  for  quality  ranges, I  just  take  that  reference  group, the  old  process, and  I  built  some  range  around  it and  ensure  that  all of  th e measured  quality  attributes from  the  new  process fall  within  that  range. For  very  high  risk  attributes, what  I  want to  do is  I  want to  do  equivalen ce testing. This  is  what  I'm  going to  focus on  today, tell  you  about what  equivalence  testing  is, and  how  that  acceptable  difference or  that  similarity  condition  is  set. It's  called  equivalence  testing, Two One Sided  t-test. To  reiterate  what  we  talked  about  before, whenever  I'm  using  design  of  experiments in  process  development, what  I  do   is  I'm  changing some  variable -like  temperature from  low  to  high , and  I'm  measuring  the  effect on  my  critical  equality  attributes. I  am  assuming  in  the  null  hypothesis that  they  are  the  same. What  I  do  is  I  set  up  a  design to  see  if  they're  different. A   Type I  error  in  that  case would  be  me  saying, "Wow,  they're  different " when , in  fact,  they're  not. That  would  mean  that  I  would  control  that. I  would  spend  resources controlling  that  in  the  process. If  I'm  a  drug  manufacturer, I  do  not  want  to  control  things that  I  don't  need  to. I'm  concerned about  that   Type I  error  rate . If  I  was  a  regulatory  agency, I  would  be  even more  concerned  with  the   Type II. There's  no  difference when,  in  fact,  there  is . You  should  be  controlling  something and  you're  not. If  I  was  a  regulatory  agency, I'd  be  more  concerned with  the   Type II  error. Now, we're  going  to  flip  it. We're  going to  talk about  equivalence  testing. 
Equivalence  testing is  I'm  not  saying  that  they  are  the  same. I  am  assuming  that  there  is  a  difference. I just  want to  make  sure that  the  difference  isn't  too  big. That  too  big,  I'm  going to  call  delta. T here's  a  lot  of  different  ways to  calculate  that  delta. I'm  going to  call  it  d or  that  delta  right  there, often  called  the  equivalence acceptance  criteria. I would  like  to  come from  subject  matter  expertise, but  the  majority  of  times, it  comes  from  me  taking some   k-value  times  that  historical  value. That's  split  into  two  different  tests . On  one,  I'm determining if  it's  less  than  positive  d. In  the  other  one , I  want  to  show in  the  alternate  hypothesis that  that difference is  greater  than  a  negative  d. I'm  going  two  different  sides. That's  what's  called the  left -hand  side  of  the  bottom, or  the  top. In  terms  of,  if  I  was  a  drug  manufacturer, what  would  I  want  to  do? I would  want  to  be  able  to  reject both  of  those  hypotheses. I  would  want  high  power, low   Type II  error  and  high  power. T his  is  equivalent  to  taking a  90%  confidence  number around  the  difference  in  means in  ensuring  that  90%  confidence  interval whenever  I'm  looking  at  the  low  and  high or  within  the  balance  of  those  lower  delta in  the  upper  delta. If  you're  looking  at  this, you  should  think  to  yourself  is, "I  want  the  width  of  that  confidence interval  to  be  very  small." What  are  the  different  ways that  I  could  make  the  width of  that  confidence  interval for  the  difference  between those  two  means  very  small? I  could  decrease  my  standard  deviation. That's  a  good  thing. I  could  increase  my  sample  size. That's  a  good  thing. I  could  also  increase  my  alpha  level. Maybe  that  wouldn't  be  so  good because  what  you're  doing is  you're  inflating your   Type I  error  rate. In  inflating  your   Type I  error  rate, what  you're  saying   is, I  am  stating  that  they're  equivalent when  indeed they're not. The  different  ways  to   control  the  width of  that  confidence  interval is  to  lower  s, increase  in,  or  increase  alpha. We  talked  about   two of  those  being  good and  one  of  those  not  being  good. It  makes  sense  that   if  I'm  a  drug  manufacturer, I  want  to  maximize the  power  of  the  design. That's  the  flip. I  want  to  minimize  my   Type II  error. Regulatory  agencies  want  to make sure that  you  do  not  inflate that   Type I  error  rate. That   Type I  error  rate would  be  saying  your  assume  equivalent or  you're  stating  equivalence when  indeed  they're  not. In  JMP,  you  can  do these  equivalence  tests, and  I  want to  show  you  an  example  of  that. For  my  journal,  first  thing  I  want to  do is  I  want to  show  you that  in  terms  of  determining your   Type I  error  rates and  your   Type II  error  rates is  JMP  provides  power  curves under  Sample  Size  Explorer , Power , Two  Sample  Independent  Equivalence. Caleb  King  did an  awful  great  job  with  this. I say  awful  great  job, but  he  did  a  great  job  with  this. Let's  just  say  that  my  margin, my  equivalence  acceptance  criteria, is  plus  or  minus  2  standard  deviations. I'm  just  going  to  put  a  2  here, and  that's  just  2  times  the  standard deviations  that  I'm  talking  about. That's  all  that  I'm  doing. 
Let's  just  say  that in  my  historical  process is  I  have  10  lots, and  I'm  going to  compare  it to  a  new  process  that  has  5  lots. I  want  to  see  what  the  power  is if  they  are  exactly  the  same, but  there's  no  difference  between  these. A  few  things  that  I  want to  point  out  here is  JMP  gives  those  power  calculations. The  other  thing  that  it  does  is it  allows  you  change  those. What's  going to  happen  if  I  do  things like  increase  my  new  process, the  number  of  samples in  my  new  process  day  report , my  power  is  going to  go  up. What  would  happen  if  I  do  things  like, "Hey,  Heath, I  want  to  decrease  that  margin  of  error to,  instead  2  standard  deviations , to  say  maybe  1.5  standard  deviations, essentially , as  I'm  taking  those  boundaries and  I'm  tightening  them  up." What  I  see  is  my  power is  going  to  go  down. I'm  able  to  ask  myself   all  those  typical  questions that  you  would  in  equivalence  testing. This  is  something  else  that  I  want to  show  you  that's  going to  come  up is  JMP  has  the  ability  to  say, do  I  know   the  true standard  deviation  or  not? If  I  know  the  true  standard  deviation, that  is  going  to  be  better. You're  going to  see that  your  power  goes  up. Indeed,  what  happens is  my  power  goes  up. That's  usually  not  the  case. I  always  call  that  the  utopia, which  uses  the  cases  if  I  do  not  know what  that  true  standard  deviation  is. I  always  call  this  the  optimum, a  car  of  the  utopia. I  always  call  the  no,  the  realism. I  would  be  remiss if  I  did  not  show  you  the  tools that  JMP  does  have for  showing  that  equivalence like  if  I  had  an  historical  process where  I  had  10  lots  and  I  made  5  new  ones. First  thing  I  want to  do is  I  want to  look  at  this through Graph Builder, and  I  see  that  there  is  no  effect between  those  two. I  can  see  both  of  those , and  they  both  look  like they  came  from  the  new  process, the  blue  versus  the  red. How  about  if  there  is  an  effect ? What  I  do  is  I  see  a  shift. Just  like  I  showed  you  before is that  is  Two  One Sided   t-test. JMP  has  tools  for  that. Jin  Feng  did  a  great  job  with  this. My  goodness.  I  love  the  scores  plot. Here's  the  difference  in  means. Here's  the  lower , and  here's  the  upper, and  that's  within  the  boundaries. In  that  case,   what  you've  done is  you  rejected  both  the  null  hypothesis in  favor  of  the  alternate, which  is  the  same as  what  you  see  in  the  picture. What  you  also  see  here is  that  if  there  is  an  effect, is  I  am  not  going to  reject  both  the  nulls. One  of  those  is  I  am  going  to  fail to  reject  and  indeed  I  did. What  you'll  see   is  my  confidence  intervals outside  that  boundary. I  would  like  to  talk about  a  very  specific  case. A  very  specific  case  in  cell  therapy is  called  split  apheresis  design. In  a  split  apheresis  design , this  is  a  situation  where in  cell  therapies is  you're  changing  the  process. What  you  do is  you're  using  donor  materials  split between  the  two  different  processes. We  kept  getting  questions over  and  over  and  over  again from  our  customers  about, "Can  I  look  at  the  sample  size and  power  calculations for  these  pair  of  designs ?" Cannot  overlay  them . 
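For the equivalence test just shown, here is a minimal sketch of the TOST logic outside JMP, assuming made-up lot values and an equivalence acceptance criterion of two historical standard deviations; it simply checks whether the 90% confidence interval for the difference in means sits inside plus or minus delta.

```python
# Sketch of the two one-sided tests (TOST) decision via a 90% confidence interval.
import numpy as np
from scipy import stats

process_a = np.array([10.1, 9.8, 10.4, 10.0, 9.9, 10.2, 10.3, 9.7, 10.0, 10.1])  # 10 historical lots
process_b = np.array([10.2, 9.9, 10.1, 10.4, 10.0])                              # 5 new-process lots

delta = 2 * process_a.std(ddof=1)            # EAC = k * historical SD, with k = 2

n1, n2 = len(process_a), len(process_b)
sp2 = ((n1 - 1) * process_a.var(ddof=1) + (n2 - 1) * process_b.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
diff = process_b.mean() - process_a.mean()
half = stats.t.ppf(0.95, n1 + n2 - 2) * se   # 90% interval = alpha of 0.05 on each side
ci = (diff - half, diff + half)

print(round(diff, 3), tuple(round(v, 3) for v in ci), round(delta, 3))
print("equivalence shown" if -delta < ci[0] and ci[1] < delta else "equivalence not shown")
```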
You cannot  see  if  they're  dependent upon  that  donor -to-donor  variability? Let's  talk  about  a  split  apheresis  design. In  a  split  apheresis  design, first  thing  I  want to  do is  I  want  to  tell  you  about the  regulatory  expectation. This  is  even  a  recent  draft guidance  document  from  the  FDA in  July  of  2023,  just  last  month. In  that ,  they  said  that  you  need to  select  a  suitable  statistical  test for   analysis in  difference between  paired  data where  those  donors  are  paired  up. That's  where  the  split  apheresis  design comes  from. For  every  single  donor  material that  you  have, you  split  it  in  between process   A and  process  B. This  is  not  two  independent   t-tests. What  this   is, is  a  paired  design. That's  the  first  thing that  I   wanted  to  talk  about. The  second  thing  is, I  wanted  to  talk  about that  you  are  very  often  in  early  stage, so  you  do  not  have a  line  of  donor  materials, so  you  have  very  low  sample  sizes. It's  hard  to  get  power out  of  low  sample  sizes. The  third  thing that  I'm  going  to  tell  you  is , how  do  you  come  up  with  your  EAC? How  do  you  come  up with  your  similarity  condition , that  difference, that  acceptable  difference? What  you  do   is  you  use  historical  data that  is  made  off  of  multiple  donors. You  take  the  standard  deviation used  off  of  historical  data. I'm  going to  call  that  n 1  or  historical. You  take  some  k  number of  standard  deviations  of  historical  data . You  do  a  test , and  you're  using the  split  apheresis  design to  judge  off  of  that  historical  data. These are two examples that I want to show you. The first example here is  where  you're  looking at  process   A and  process  B. What  you  see   is  you  do  see six different  donors  here. What  you  see  in  the  one  on  the  left is  the  majority  variation  is  coming from   donor-to-donor  variability, not  the  difference  between process  A and  process  B. You  have  high  donor -to -donor  variability. I'm  going to  call  that,  prho. In  the  case  on  the  right, what  you  do   is, is  the  majority  variation is  coming  from  the  difference between  process  A  and  process  B, not  the   donor-to-donor  variability. The  majority  variation  is  coming from  the  analytical  or  the  process. What  that  tells  you is  you  have  very  low  rho. You'd  have  low   donor-to-donor  variability. I'm  going to  show  you  a  series of  scripts  that  we  worked  on. These  are  typical  questions that  came  from  our  customers. In  our  cases,  we  do  not  know what  the  standard  deviation. How  does  that  compare  to  the  known? How  about  those   Type I and   Type II  error  rates? Remember,  if  I'm  a  drug  manufacturer, I  want  to  increase  the  power. If  I'm  a  regulatory  agency, I  want  to  make  sure  that  you  do  not inflate  that   Type I  error  rate. How  are  we  going to  do  this? This  is  from  the  European Medicines  Agency,  2001. The  best  way  to  do  that  is  with  things called  expected  operating characteristic  curves. That  gives  you  power  on  the   y-axis and  a  shift  in  the  main. I 'm  going  to  go  through a  series  of  scripts, and  these  series  of  scripts ... It's really one script that  have  right  here, that  it's  going to  allow  me to  change  things  like  that  rho, that  proportion of   donor-to-donor  variability. 
That  k -value,  remember, how  do  I  set  the  acceptance  criteria? It  is  k  times  that  standard  deviation. The  typical  way  of  doing  this is  that   k times  those  historical  lots. This  is  the  number  of  historical  lots that  you  use   n1. n2 is  the  number  of  lots  that I'm  going to  use  for  that  paired  design. Whenever  you  run  the  script, what  happens   is  you  come  out , and  it  does  a  series  of  simulations. In  this  case,  it  did  5,000  simulations , and  it  calculates  the  power  for  you. In  those  5,000  runs, what  percentage  of  those  passed? It  looks  something  like  this. It  gives  you  a  lot  of  different  options. My  goodness . I  can  look  at  different  k -values. I  can  look  at  a  different  number  of   n1, which  are  called  historical  lots. I  can  also  look  at  the  different number  of  n 2  or  paired  lots. Right   now, I  want to  talk  about... Whenever  I  do  this,  what  I  can  do is I  can  select  which  of  these different  cases  that  I  want  to  look  at to be  able  to  answer  typical  questions. Let  me  open  up my  typical  comparisons  here. The  first  one  I  want to  talk  about  is, "Heath, what  if  I  have a   known standard  deviation?" Look s something  like  this. That's  what  the  known standard  deviation  looks  like. A  few  things  that  I  want to  point  out is  this  is  the  percentage  of  time  that  you're  going   to  claim  equivalence. If  they're  exactly  the  same  that  you  said you're  going to  claim  equivalence a  high  percentage  of  time. If  there's  a  huge  difference  between  them like  a  two  standard  deviation  shift or  a   three standard  deviation  shift , is  you're  not  going  to  claim  equivalence. That's  a  good  thing. The  other  thing that  I  want to  show  you  here is   if  you're  looking at  this  alpha  of   0.05, being  that  I  set  my  k -value  at  2 , k  number  of  standard  deviations versus  10  historical  lots, the  standard  deviation of  10  historical  lots, you  would  expect that  alpha  level  would  be  0.05, the  exact  alpha  level that  I  set  in  my  equivalence  test. Right  now, the  thing  that  I  want to  show  you is  this  is  for  a  proportion of  donor -to -donor  variability  of  90%. What  happens  if  I  change  that? What  happens  if  I  change  that  to  60%. What  happens  if  I  change  that  to  30% ? There's  no   donor-to-donor  variability. What  you  see   is  that  paired  test, the  power  curve  looks  really good whenever  I  have high  donor -to -donor  variability. The  other  thing  that  you  notice with  the  known  standard  deviation is the  alpha  level  regardless of  operating  characteristic  curve is  always  at   0.05. Let's  talk  about   some  other  typical  questions. One  typical  question   is , how  does  it  compare for  the  different  levels   of  rho? How  does  my  typical  way  of  doing  this ? I  do  not  know what  the  standard  deviation  is. My  typical  way   of  doing  this is  in  the  blue. The  known  standard deviation  is  in  the  red. One  thing  that  I  want to  point  out is  I  want to  point  out this  one  right  here. What  you  see is   the  preferred  approach, the  approach  that  even  regulatory documents  have  said  that  you  should  do, the  paired  approach, using  the  standard  deviation that  is  calculated off  of  my  historical  lots, is  I  have  an  inflated  Type I  error  rate. 
This  should  be   0.05  just  like  it  is  here. That  was  really strange  to  us, and  we  looked  into  this. When  we  looked  into  it, what  we  found  is , it  has  everything  to  do with  this  right  here. The  reason  why  it  has  everything to  do  with  this  right  here, as  I  said,  how  do  I  decrease   the  width of  that  confidence  interval? The  way  that  I  decrease the  width  of  that  confidence  interval was  either  to  decrease  s , or  increase  n , or  increase  my  alpha  level. Understand  this. This  is  why  you  have an  inflated   Type I  error  rate with  this  paired  test is  those  deltas , which  you're  using  to  judge  this  off  of,  those  deltas are  using  the  standard deviation  off  of  historical  data that  contains  donor-to-donor  variability. That  confidence  interval  right  there does  not  contain donor-to-donor  variability. Why? Because  you  did  a  pair  test . That  contains  only  analytical and  process  variability. That's   where that  inflated   Type I error  rate  comes  from. Using  this  paired  approac h is  understand you  have  an  inflated   Type I  error  rate. We  see  that, and  it's  even  more  prevalent when  you  have  high donor -to -donor  variability. Why?  Because  if  you  have low   donor-to-donor  variability, th at process  variability is the  largest  part  of  the  variance component  that  you  have. Let's  look  at  a  few  more questions  that  you  have. A s  I  said,   this  one  script answers  these  different  questions. This  is  answering  the  question , "Hey,  Heath,  if  I  use  that  paired  approach that's  recommended, can  I  look  at  what  happens as  I  increase  sample  size from   3  to 4 to 5 to 6?" Two  things  that  I  want to  point  out  here is  number  one ,  what  you  see as  I  increase  sample  size, is  I'm  going to  have  higher  power. I  still  do  not  have  adequate  power if  there's  no   donor-to-donor  variability. That  means   that  I  have 0  donor-to-donor  variability. I  would  need  at  least a  sample  size  of  8  or  8  different  donors. If  I  do  have  high donor-to-donor  variability, like  0.9, 90 %  of  that  variability, which  you  see  is  I  do  have  high  power for  no  difference  between  the  means. What  I  can  do   is  I  can  make  sure to  answer  those  questions with  overlaid  operating  character s occurs  for  different  sample  size. I  can  also  answer  that  question   if  I  was  looking  at , and  I  say,  "Hey , I've  stated  my  different  sample  sizes, but what if we look at  the  different  k -values?" Understand  that  your  acceptance  criteria is  k  number  of  standard  deviations. What's  going to  happen is that  acceptance  criteria are those  what  I  call  go  post are  going  to  widen  as  you  increase  k. Therefore,  you're  going to  have a  much  higher  ability to  pass  equivalence , and  you're  going to  have much  higher  power. Another  typical  question  is  this. What  if  I  want to  change both  of  those  together? I'm  a  big  fan  of   Graph Builder . What  Graph  Builder   is what  you're  looking  at  here is not  only  are  you  looking  at, "Hey , Heath,  I  am  increasing sample  size  in  blue,  that  would  be  3, in  red,  that  would  be  4, in  green,  that  would  be  a  5, and  in  purple,  that  would  be  6 , but  I  also  looked  at  it for  different  k -values. What  would  your  operating characteristic  curves  look  like?" Good? I  want to  revisit  this. 
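The mismatch described here can be illustrated with a small variance calculation. This is my own numeric sketch of that argument, not the authors' derivation, with the total standard deviation fixed at 1.

```python
# The EAC is scaled by the historical SD (donor plus process variability), but the
# paired differences only carry process/analytical variability: the donor term
# cancels when the same donor goes through both processes.
import numpy as np

sigma_total = 1.0
for rho in (0.0, 0.3, 0.6, 0.9):             # proportion of donor-to-donor variability
    sd_hist = sigma_total                     # sqrt(sd_donor**2 + sd_process**2)
    sd_paired_diff = np.sqrt(2 * (1 - rho)) * sigma_total
    print(rho, round(sd_hist, 2), round(sd_paired_diff, 2))
# At rho = 0.9 the goalposts (k * sd_hist) are wide relative to the noise the
# paired test sees, which is the mismatch described above as the source of the
# inflated Type I error rate.
```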
Just  like  I  said  before , I  said, "Hey ,  I  want to  revisit  this and  show  you  that  for..." Whenever  I  have  a  large  proportion of  donor -to -donor  variability , I  said,  "What  you  see  for  2  right  here, I  would  expect   my  alpha  level that  my  proportion  of  time that  I  pass  this  test  would  be   0.05." But  what  you  see  is  you  have inflated   Type I  error  rate. How  does  this  look? Whenever  I'm  looking  at   a  rho or  a  proportion of   donor-to-donor  variability that  is  very  small, I  do  not  have  much  power. The  question  was , what  if  we  did  this  instead? If  we  had  low   donor-to-donor  variability, if  what  we  did   is  we  used information  from  those  historical  lots. If  I  have  no   donor-to-donor  variability or  very  low   donor-to-donor  variability, why  couldn't  I  just  do a  independent  t -test, where  I  compare  from  process  A or  my  historical  process, not  just  the  paired  lots, but  I  also  consider those  10  historical  lots and  not  comparing  to  the  mean of  the  new  process? We  wanted  to  see  how  that  compared . Doing  it  that  way   is the independent test  is  in  the  red. The  paired  way  is  in  the  blue. What  you  see   is,  if  I  have little  to  no   donor-to-donor  variability in  my  cell  therapy split  apheresis  process , you  said  that  the  independent   t-test has  much  better  profile than  the  paired  approach. However,  if  I  have  high donor-to-donor  variability, that  paired  approach  in  the  blue has  a  much  better  operating characteristic  than  the  red. Right  now,  the  question  is instead  of  just  automatically doing  that  split   apheresis  pair  design, maybe  it  would  be  better to  make  a  decision  based  upon that   donor-to-donor  variability. How  does  this  compare   whenever I'm looking  at  different  k -values? I  see  the  exact  same  thing, the  exact  same  phenomena that with  a  low  donor-to-donor  variability, it  makes  sense  to  do the  independent  t -test. With  high   donor-to-donor  variability is  I  have  a  much  better operating characteristic  curve are  higher  power  associated with   the  paired  approach. It  doesn't  matter  if  I  looked at  a   k of  1.5 , or  2,  or  even  3.0. Regardless  of  the  k -value , I  have  a  much  better operating  characteristic  curve if  I  consider that  donor -to -donor  variability. What  if  I  looked  at  different  values of  those  historical  lots? I  looked  at  3. We  looked  at  4. W e  looked  at  5  paired  lots. We  looked  at  6  paired  lots. Regardless, you  see  the  same  phenomena. We're  currently  writing  a  paper  on  this to  try  to  propose that  if  you  have  low donor-to-donor  variability, maybe  it  does  not  make  sense for  you  to  use  a  split   apheresis or  a paired  analysis  approach. Maybe   the  approach  is  only  good whenever  you  have  high donor -to -donor  variability. T hese  are  typical  questions   that  are  asked in  the  split   apheresis  designs. What  I  want to  do is  I  just  want to  cover t wo  or  three  more  of  these j ust  to  show  you  a  few o ther  things  that  you  could  do. These  are  different  things that  we  were  looking  at . 
We  looked  at,  "Hey , how  does the  operating  characteristic  curve, how  does  that  compare if  we  looked  at  in  the  blue that's  using  nothing but  the  historical  lots to  estimate  the  standard  deviation versus  if  you  use  the  paired  and  the  historical  lots, which  is  in  the  red?" What  you  see  is  there's  not  much difference  between  these  two, especially  if  I'm  using  higher sample  sizes  like  the  n 2. W e  also  looked  at, "Hey,  if  I  estimated that  standard  deviation using  a  few  different  ways, what  if  I  looked  at  estimating that  standard  deviation using   the  historical  lots, which  is  in  the  blue,  versus  in  the  red is  using  the  historical  lots and  the  paired  lots? I  compare  the  independent  case versus  the  paired  case. What  do  I  see? " As  I  said  before, you  see  that  exact  same  phenomena with  a  low  donor-to-donor  variability. The  much  better  way  of  doing  this would  be  an  independent  t -test on  the  lower  right -hand  corner. That  is  where  you  high donor-to-donor  variability. It  makes  sense  that  we  would  use the  paired  approach. Last  one  that  I  want  to  show  you is  this  is  something that  we've  been  working  on. We  looked  at  the  paired  approach versus  the  independent. The  paired  approach  is  in  the  blue. The  independent  is  in  the  red. I've  said  this  over and  over  and  over  again. That  it  makes  sense  that  if  I  have low  donor-to-donor  variability, the  independent  case in  the  blue  looks  much  better. If  I  have  high   donor-to-donor  variability, the  paired  approach  looks  better. But  one  thing  that  we  did i s we  took  a  look  and  just  said, "What  if  I  took  a  look   at  the  approach that  gave  me  the  shortest with  that  confidence  interval?" That's  in  the  green . What  you  see  is  that  usually gives  you  the  best  approach regardless  of   what  your  rho  is or  what  your  proportion of  donor -to -donor  variability  is. In  closing, I  would  like to  just  point  out  a  few  things. This  script  that  we  have  answers, along  with  the  typical  questions that  our  customers  have on  operating  characteristic  curves, associated  with  these split   apheresis  designs, what I do  want   to  pull  away from  here,  though, is  if  you  do  have  a  low  proportion of   donor-to-donor variability is  you'll  see  that  these  designs are  very  underpowered for  fewer  than  8  lots, fewer  than  8  different  donor  material. We  live  in  a  world  in  cell  therapies where  you  do  not  have  a  lot  of  donor  materials, so  you  have  very  low  sizes. It  would  be  much  more  efficient if  you  had  low   donor-to-donor variability to  use  the  independent  case. We  do  have  the other  revisions  that  we  made  on  this where  if  you  were  able to   make  multiple  lots for those  paired  approaches with  the  same  donor, or  if  you're  able  to  take  multiple  measurements to  be  able  to  look  at  those operating  characteristics  curves. Thank  you.
There is a no-man's-land in JMP, a place where it is scary to venture. It is a place inhabited by specters of data sets too complex for simple nonlinear regression, but too ephemeral for functional data analysis. It is a strange place. It appears basic methods are enough to venture here, but using those techniques quickly becomes unmanageable. These spectra can come from many sources, but all have a common problem: there is too much data for simple nonlinear regression or too little to use functional methods. Join me on a journey across this challenging landscape of classical spectroscopy to learn methods for extracting information from examples of complex spectra and how to automate the process.      There's a strange place that sits between the analytical tools that you would use to do analysis with known physical models and with simple curves, and the analyses that you would do, say, with Functional Data Explorer, where you have families of curves that have complex shapes, and you're less interested in the actual physical nature of the shapes themselves than you are in just relating them back to observed phenomena. This strange no-man's-land of analysis in JMP is where a lot of first-principles techniques sit. Things like X-ray diffraction, things like HPLC, where we have known physical methods and known equations that help us describe very fundamental phenomena of a molecule or crystal or a system. All we have to do is plug peak positions in, or area-under-the-curve information in, and we can get some very sophisticated analyses out of fairly simple data points because of these first-principles methods. At first blush, it would seem like JMP should be able to handle that. It seems like it's got all the tools, but when we dig into doing those kinds of analyses, we suddenly realize that the problem is a bit more complex than what we would expect. Today what I want to do is focus on some techniques and strategies to deal with some of those simpler cases and then introduce some tools that we can use to streamline those larger, more complex problems. Let's go ahead and move into JMP and have a look at that. To start off, let's have a look at a very simple case, a single peak on a background. How would we go ahead and pull the information out of this peak? How would we get its center position? How would we get its full width at half max, or its standard deviation, or even the area under the curve? How would we get that information? Well, most of us that have done this for a while would say, oh, you know what, I'm going to go into Fit Curve, and I'm going to say, here's my Y, and here's my X. Then I'm going to go ahead and fit a peak model of some kind. Let's just say the Gaussian peak, and you look at that and you go, hey, 98% R², that's awesome, that's great. Let's see if we can do a little better. Just to skip ahead a little bit here, we could look at the Lorentzian peak shape and the Gaussian peak shape, and we can see that those both give fairly good R²; they give fairly good peak fits.
We  could  even  come  into  the  values underneath  each  and  we  can  pull  up the  area  under  the  curve  for  them. But  how  good  are  those  fits  actually? Let's  take  a  look at  them  a  different  way. What  I  want  to  do  is  take  and  we'll  go ahead  and  pull  up  graph  builder  on  these and  we'll  look  at  how  the  models  relate to  the  residuals  for  those  peaks. We  can  see  a  very  different  story than  what  we  saw  in   Fit curve with  these  two  peak  shapes. We  can  see  that  there's  a  systematic error  built  into  these  peak  shapes. The  reason  we  can  see that  with  the  Gaussian, we  can  see  that  it's  kind of  underestimating  at  the  center. It's  doing  okay  on  the  shoulders,  but  out in  the  tails  it's  really  missing  things. We  can  see  almost the  inverse  for  the   Lorentzian. Why is that? Well,  the  truth  is  that in  spectroscopy  particularly, there  are  a  lot of  different  peak  shapes. It's  not  just  Gaussian, it's  not  just  Lorentzian. There's  actually  a  whole  family of  peak  shapes  that  are  out  there to  handle  all the  different  physical  phenomena that  result  in  the  peaks that  we  see  in  spectroscopy. How  do  we  deal  with  those  in  JMP? Well,  it's  actually  really  quite  easy. Let's  start  with  looking  at what  the  results  of  using the  correct  peak  shape  is. Here  I've  got  the  Gaussian  again, the  residuals  for  the  Gaussian  peak  fit and  the  blue  line  in  this  case is  no  longer  the  Lorentzian. It's  called  a  Pseudo Voigt, which  is  an  approximation  of  a  peak shape  called  a  voigt  function. Notice  that  the  residuals for  the  Voigt  function  are  dead  flat. We  are  actually  doing  much  better. Before,  if  we  were  to  try  to  do quantification  with  the  Lorentzian  or the  Gaussian, we  would  run  into  a  situation  where  we might  over  or  underestimate the  quantity  of  a  material  in  a  sample. With  the  Voigt  in  this  case, because  this  is  a  voigt  peak  shape, we're  actually  going  to  get the  accurate  quantification  of  that. That's  the  important  thing. Now  how  did  I  do  this? Well,  there's  a  few  ways  to  do  it. The  easiest  is  to  come  into  the  model come  into  the  data table  and  create  a  model. The  model  is  really  easy  to  make. This  is  the  voigt  peak  shape. Looks  a  little  scary,  but  that's just  the  nature  of  the  math. Here  I've  got  a  parameter for  the  baseline  and  this  whole  mess. Here  is  the  voigt  peak  shape. We  can  come  into  the  parameters  settings and  define  starting points  for  each  of  our  values. Then  going  into we're  not  going  to  use   Fit curve, we're  going  to  come  down  to  nonlinear. We  can  use  that  as  a  starting  point for  an  analysis. I'm  going  to  expand underneath  40  minutes. That's  actually  a  good  habit to  get  into  in  this  case. I  did  that  wrong. Let's  go  back  and  redo  that. Should  be  the  counts. There  we  go. That  looks  better. Now  if  I  go  ahead  and  click  Go, it  does  my  peak  fitting for  me  and  everything. That's  great. Can't  get  the  area  under the  curve  here  very  easily. But  I  can  get  just  about  every other  parameter  that  I  need. The  nice  thing  about a  lot  of  these  peak  shapes is  they  also  have well  defined  integrals. 
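For readers who want to see the peak-shape math outside JMP, here is a minimal sketch assuming a common pseudo-Voigt parameterization (a mixing fraction eta between a Lorentzian and a Gaussian that share a center and width), fitted to synthetic data with nonlinear least squares. The fitted parameters also give the area under the curve analytically, which is the point picked up next.

```python
# Hedged sketch: fit a pseudo-Voigt peak (plus a flat baseline) with scipy.
import numpy as np
from scipy.optimize import curve_fit

def pseudo_voigt(x, amp, center, fwhm, eta, baseline):
    sigma = fwhm / (2 * np.sqrt(2 * np.log(2)))               # Gaussian sigma from FWHM
    gauss = np.exp(-0.5 * ((x - center) / sigma) ** 2)
    lorentz = (fwhm / 2) ** 2 / ((x - center) ** 2 + (fwhm / 2) ** 2)
    return baseline + amp * (eta * lorentz + (1 - eta) * gauss)

rng = np.random.default_rng(4)
x = np.linspace(0, 10, 400)                                    # synthetic stand-in data
y = pseudo_voigt(x, 100, 5.0, 1.2, 0.4, 10) + rng.normal(0, 1, x.size)

p0 = [80, 4.8, 1.0, 0.5, 5]                                    # starting values, as in Nonlinear
bounds = ([0, 0, 0, 0, -np.inf], [np.inf, 10, np.inf, 1, np.inf])
popt, _ = curve_fit(pseudo_voigt, x, y, p0=p0, bounds=bounds)
print(np.round(popt, 3))                                       # amp, center, fwhm, eta, baseline
```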
Once  you  have  the  standard  deviation and  the  mean  and  those  information, you  can  usually  get the  integral  fairly  easily, the  area  under the  curve  fairly  easily. That's  one  way  of  handling  that. But  it  introduces  a  large  amount  of  error possibilities  in  this  peak  shape. We've  given  ourselves a  lot  of  potential  problems. What  we  really  would  like  is something  that  looks  a  bit  more  like  this, where  we've  got  a  predefined function  called  the  PseudoV oigt. We  give  it  all  of  our  fitting  parameters and  there's  our  fitting parameter  for  our  baseline. It's  the  same  math, but  we   cloaked  it  in  an  easy to  understand  function  where  we  are  just providing  the  parameters that  we  want  to  fit. It  works  the  same  in  nonlinear. How  do  I  do  that? Well,  there  are  a  few things  that  we  can  do. We  can  define  in  and  there's a  lot  of  code  right  here. But  the  big  things that  we  want  to  pay  attention  to are  the  fact  that  we're defining  a  function, that  we're  defining  some  parameters. At  the  very  bottom  of  this, this  is  a  family  of  parameters. I  am  using  the Add  custom  functions  operator to  put  those  into  the  memory of  JMP,  so  that  JMP  knows  that  I've  got these  custom  functions  and  knows  what they  look  like  and  knows  how  they  behave. Doing  it  that  way  provides some  really  powerful  tools. If  I  come  into  the  scripting  index, once  I've  defined  my  functions, they  show  up  in  the  scripting  index. I  didn't  really  give  a  lot of  descriptions  here, but  you  could  give  quite  detailed descriptions  and  examples here  as  you  would  like. The  other  thing  that  we  can  do, again  coming  back  into  our Fit  model  is  when  we  define  these functions, we  get  our  own  entry in  the  formula  editor, which  lets  us  just  click  on  one of  these  and  use  them just  like  we  would any  other  function  in  the  formula  editor. Again,  these  are  actually quite  easy  to  define. The  examples  in  the  scripting index  make  it  very  easy  to  do. Just  search  for  Add  custom  function and  you  can  just   use  the boilerplate  there  to  build  off  of  that. There's  also  a  great  blog post  on  how  to  do  that. That's   one  answer to  one  question  that  we  have. Let's  continue  on  and  let's  look at  a  different  question, maybe  a  slightly  more  complex  problem. What  happens  if  we  have  two  peaks? So  suddenly  Fit curve is  no  longer  on  the  table. We're  going  to  have  to  use   Fit nonlinear and  that  also  suggests  how we  might  work  with  this. We're  going  to  basically  have  to  break  out our  equation,  our  model that  we  had  before. I  break  it  out  column  by  column just  to  manage  all  of  those  bits and  pieces  that  we  saw  before. I  have  one  for  my  baseline, I  have  one  for  my  peaks, and  then  I  have  one  for  my  spectris. Let's  have  a  look  at  what  all those  look  like  really  quick. Let's  start  with  the  baseline because  it's  got  a  little  bit  of  a  gotcha that  we  have  to  worry  about. The  baseline  just  has  the  fit  parameter for  the  baseline, but  it  also has  this  x  term  times  zero. That's  because  nonlinear  expects  every equation  that  goes  into  a  formula to  tie  back  to  the  x  axis that  you're  providing. 
We  put  x  times  zero  in  there  just to  have  it  be  okay  with  plotting  that. That's  just  a  little  gotcha that  you  have  to  deal  with. That's  one  piece,  peak  1  looks  just like  we  would  expect  with  its  parameters. Peak  2  looks  just  the  same, except  it's  got  different  parameter names  so  we  don't  have  any  collisions. Peak  one  was  1, 2, 3, 4  and  peak  2 is  B  1,  2, 3 ,  4,  5, 6, 7, 8. That's  the  only  thing  we  have  to  do. Then  the  spectris  itself, the  thing  that  we're  going  to  fit, the  things  that  we're  going  to  put into  not  the   Fit nonlinear  platform, is  we're  just  going  to  say  my  baseline curve  plus  my  peak  1  plus  my  peak  2. Just  like  I  showed  you  before  doing that  in   Fit nonlinear, here's  my  spectris  that  goes into  the  prediction  equation. I'm  going  to  remember  to  put my  counts  in  and  not  my  x  curve. Just  like  I  said  before, I'm  going  to  expand  my  intermediate formulas  and  that's  going  to  tell  JMP to  dig  back  in  from  that  first  formula into  all  the  formulas  that  are in  the  individual  columns. We  click  Okay,  hey,  we  see what  we  expect  to  see. Now  we  can  click  go and  it  goes  through  and  fits everything  just  like  we  would  expect. We  get  a  nice  fit and  we  have  the  ability  to  get  confidence intervals  and  everything else  we'd  like  off  of  that. Two  peaks  is  reasonable  and  possible. But  the  problem  that  we  run  into  is what  happens  when  we  have  something that  looks   like  this. At  a  rough  count, there's  probably  a  dozen  peaks  there plus  a  complex  baseline  that's not  actually  a  straight  line  that's probably  got  some parabolic  behavior  to  it. We've  got  a  complex  baseline, we've  got  multiple  peaks. We're  going  to  have  to  make one  formula  for  each  of  those. There's  a  lot  of  legwork to  build  in  something  like  this. If  you  get  into  X- ray  diffraction, the  problem  gets  even  worse. There's  comfortably  30, 40  peaks  in  this  spectris  right  here that  we  would  have  to  work  with. The  first  question  that  we  need  to  ask  is, can  nonlinear  handle that   a  problem? Well,  it  turns  out  that  it  can if  we  just  use  nonlinear  and  I'm  going to  do  something   wild  and  crazy. I've  got  it  fitting  a  Lorentzian  peak and  I'm  going  to  come  back  and  I'm  going to  actually  have  it  fit  it  in  real  time. You  can  watch  that  as  it  goes  through. It  nails  each  peak  in  near  real  time as  I  move  through  this  quite  quickly. It's  hitting  the  big peak  in  each  group. That  says  that  the  Fit  engine can  probably  handle  the   processing that  we're  dealing  with. That's  fine. This  really  becomes  more  of a  problem  of  logistics  than  a  problem  of actual  functionality  within  JMP. It  really  is  a  real  problem. If  we  were  to  look  at,  let's  just  say we're  looking  at  fitting  Voigt  peak  shapes and  we  could  talk  about  Lorentzian and  we  could  talk  about  Gaussian, we  could  talk  about  the  Pearson  seven, all  those  different  types  of  peak  shapes. But  the  voigt  peak  shape has  five  parameters, the  x  axis  and  then the  forfeit  parameters. That  roughly  equates  to  about six  mouse  clicks  per  peak. Even  if  you're  doing  it  in  a  single formula,  it's  six  mouse  clicks  per  peak. 
That says that for a ten-peak spectrum, we're going to have to do 88 mouse clicks. However long that takes you per mouse click depends on many, many factors. But if we were to do something like that X-ray diffraction pattern, with the peak count up around 40, we're talking in the range of 300 mouse clicks. That's a lot of clicking around that we don't want to have to do. We would like our interaction with the spectrum to be something along the lines of one click per peak. That suggests we need some automation built in, so let's have a look at how I've done that. I've built a tool to handle this, and I've actually taken a number of different approaches in it. First off, let's look at the library of peaks that I've generated for Spectriss. The title of this talk includes a number of different peak shapes. We include a family of Gaussian peaks, including a split Gaussian that gives you a different standard deviation on one side of the peak than the other. The same goes for the Lorentzians, the Pearsons, and then the PseudoVoigts. These all also have versions that are tuned to give you the area instead of the intensity as a fit parameter; that's the area term in all of these. That's one piece: when we load the Spectriss add-in, we get that for free. That's automatic. Now let's look at the other challenge. Let's take that olive oil spectrum. What we really want is a tool where we can come in and say, here's my x axis, here's my y axis, and I just want to do some peak finding. Here are my four main peaks; it found them automatically. Maybe I want to do a first derivative, or maybe I want to do a quantile. I can also remove the background here, so I can click Finished. It's found those first three peaks for me. I'm going to go ahead and change my background to a linear one. Now I can also come in and do some manual peak selection. Behind the scenes, it's taking care of writing all of those peak parameters for you so that everything is nice and tidy. There's probably one right there, probably one right there, and there's one right there. Every time you add a peak, you can select it in the list of peaks, and it will give you the information calculated at that time. You can see right here that these peaks are not well defined; they're not fitting the data very well. Really we want to go over into Nonlinear. I've hacked Nonlinear a bit so that it will run this in real time and look nice and pretty, and you can watch the peak shapes changing. Realistically, I might have chosen a quadratic instead of a linear baseline for this, but just for the sake of interest. Here, I've run out of iterations. I'll increase the iterations and also back off just a touch on my gradient criterion so I can try to get this thing to converge a little quicker. Okay, we'll take that as good enough for the moment. We can say that we want to accept the fit, and there are my fit parameters. Then I can say Done.
It brings everything back over into Spectriss for me to work with. I can now say, refine my AUC parameters, and come in and get my new approximate area under the curve. That's great and grand, but what I really want is an output table that has all of those parameters and their information attached to them, and that's what I get. That's Spectriss in a nutshell. The goal with this project was, like I said before, to have the ability to handle physical peaks, multiple peaks, with an easy-to-use interface that handles those curves where we need the area under the curve and the physical parameters attached to each peak, but where we either don't have enough data to use Fit Model or the Functional Data Explorer, or it's just not the kind of problem where we want to work with those particular tools. The tool is up on the JMP Community; the QR code here will take you to the add-in, where you can work with it. Spectriss is up now and ready to go.
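For reference, registering a peak-shape function of your own, as described earlier with Add Custom Functions, can be sketched roughly like this in JSL. This is a minimal sketch assuming one common pseudo-Voigt parameterization (a shared half-width w and a mixing fraction eta); the namespace and argument names are placeholders, not the add-in's actual code, and the Add Custom Function entry in the Scripting Index has the full boilerplate:

    // Hypothetical pseudo-Voigt: eta blends a Lorentzian and a Gaussian of the same half-width
    pv = New Custom Function( "peaks", "pseudovoigt",
        Function( {x, height, center, w, eta},
            height * ( eta * w ^ 2 / ((x - center) ^ 2 + w ^ 2)
                + (1 - eta) * Exp( -Log( 2 ) * ((x - center) / w) ^ 2 ) )
        )
    );
    Add Custom Functions( pv );

Once registered, a function like this shows up under its namespace in the Formula Editor and the Scripting Index, which is what makes a single-column prediction formula for a peak much easier to read.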
At HP Hood, the use of design of experiments (DOE) has helped to successfully identify formulations in new product development, delighting consumers of our food products. Throughout our R&D department's wide implementation of DOE, there have been a number of successful DOE models, as well as some unsuccessful ones. One major stumbling block to model building with DOE has been dealing with subjective binary responses such as acceptable/unacceptable. These binary responses provide less information than continuous responses, thus inhibiting the ability to extract meaningful results from designed experiments. This presentation shares simple and practical strategies for using the JMP DOE platform to convert these binary responses to continuous ones, resulting in improved models and powerful insights. Real-world examples from consumer food products are given to demonstrate how DOE can be used for more than just building models. It can be used to overcome the problem of responses that are difficult to measure.

This talk is titled Expanded Uses of Converting Binary Responses to Continuous Responses in Consumer Product Development. It's a bit of a mouthful, but I promise it won't be that complicated. My name is Curtis Park. I'm a principal scientist at HP Hood. HP Hood is a food and beverage company. We make a lot of different milks and nondairy milks, and we also make yogurt, cottage cheese, and ice cream, so there are a lot of fun things to taste at work. I'm a food scientist by education. A few years ago I was asked to take a look at a problem we had with one of the beverages we were producing. I'm going to show you a video just so you can see. We were getting a lot of consumer complaints, and these complaints were happening when the product was close to the end of shelf life. As you see in this video, it's pretty obvious why people were complaining. I think I would complain if I saw something like that too. It's supposed to be a nice portable beverage, and it's thick and chunky when it's being poured out. Not what I would expect. Believe it or not, this product was not spoiled. I promise you, it was not spoiled. So I was asked to take a look at this and figure out what the problem was and how we could fix it. At the time, a few years ago, HP Hood was early in its journey with JMP, so I was really excited to have an application to use in real life rather than just reading or learning about it. Naturally I felt like, yahoo! Let's run a DOE, let's do it. I was really excited, and for those of you who might not have as much experience with DOE, the first step is usually taking a look at what factors you should be studying. So we did a few experiments. If you can forgive me, they were probably one-factor-at-a-time experiments, but we narrowed in on what we believed were the key ingredients that could have been causing the problem. We ended up making a design. This is probably the fourth or fifth iteration of the design that we came up with, and it was built in Custom Design. If you go to Custom Design, that's the platform we used to generate this DOE.
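As a rough illustration, the saved script for a design like this has roughly the following shape; this is a sketch rather than the actual design, so the factor names, ranges, and response goal are placeholders. The Add Term lines spell out the intercept, the three main effects, the two-way interactions, and the quadratics of a response surface model, with the run count set to 17 as in the talk:

    DOE(
        Custom Design,
        {Add Response( Minimize, "Chunky Pour", ., ., . ),
        Add Factor( Continuous, 0.1, 0.2, "Ingredient A", 0 ),
        Add Factor( Continuous, 0.1, 0.2, "Ingredient B", 0 ),
        Add Factor( Continuous, 0.3, 0.5, "Ingredient C", 0 ),
        Add Term( {1, 0} ),
        Add Term( {1, 1} ), Add Term( {2, 1} ), Add Term( {3, 1} ),
        Add Term( {1, 1}, {2, 1} ), Add Term( {1, 1}, {3, 1} ), Add Term( {2, 1}, {3, 1} ),
        Add Term( {1, 2} ), Add Term( {2, 2} ), Add Term( {3, 2} ),
        Set Sample Size( 17 ),
        Make Design}
    );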
As you can see, this is what we had. We had ingredients A, B, and C, and it was actually a response surface design, so we had all of the two-way interactions and the quadratic terms built into the model. It ended up being 17 runs, as you can see here: 17 different treatment combinations, this much A, this much B, this much C for each run. Once we settled on this design, we were really excited: let's go solve this problem. Piece of cake, right? You go into the lab, into our pilot plant, you throw some things together, and the beverage comes out. I'm making it sound a lot simpler than it actually is. We made 17 different beverages and then put them on the shelf for a little while, because as I mentioned earlier, it takes some time for this problem to appear. They sat until they were ready to be analyzed. This is just a screenshot of a data table with the actual design that we used. As you can see, there's a column to the right that I highlighted: our friend the Y, the response column. Once we got to the point where we were ready to measure that chunky pour, we started thinking, how are we going to measure it? At the time, we did not have a chunky-pour meter. I've never heard of one, and I've never found one. If anyone has ever found one, we'd love to see it and maybe buy one, but to our knowledge it doesn't exist. So what options did we have to measure this? Because if you can't measure it, DOE is really not that useful. We had a few options. The first is that we could measure everything as a binary response: pass/fail, good or bad, and so on. There are pros and cons to this. The pros are that it's pretty simple to do; anybody can say pass or fail, and it takes almost no time to measure. However, it has some serious cons, such as being really subjective to the observer. What I think is good, a colleague of mine might think is bad. Or even worse, what I think is good, my boss might think is bad. So it's really subjective. And while binary responses can give you some information, they don't give us as much as we want, because when you do logistic regression, what you get out of it really is just the probability of something passing or failing. In my experience, that's been difficult to communicate and to know what to do with, especially when we're talking to non-technical people. So if there's any way to get a continuous response, that's what we strive for, because continuous responses give us a lot more information: we can know how good or how bad something is, because not every "good" is created equal. There's another option we could have used, and I would say it's probably the best option if you can do it: we could run consumer testing and get consumer input.
What this would look like is: I take all 17 of our beverages, and we recruit maybe 100 or 120 consumers of our product and have them sit down and rate every single one on different attributes, one of them probably being how well they like how the product pours. The reason this is the gold standard is that those are the people whose opinions matter to us. After we get 100 or 120 responses, we look at the data, take averages, and put those averages into our model. However, it can cost a lot of money and take a lot of time. So if your budget doesn't allow it, or your timeline for whatever reason doesn't allow it, you can't do this for everything. Sometimes the thing you're trying to measure isn't such a huge problem that it's worth spending all that money, but it would still be important to be able to measure it. Do we have any other options? I mentioned this earlier: you can find an instrument that can measure what you're looking for. Sometimes they exist. Like I said, I don't know of a chunky-pour meter; I looked in our warehouse and our R&D center and couldn't find one. Even if you can find one, if it's something really specialized, you're not going to use it very often, so it doesn't make sense to buy the equipment. Or it could be something that would be really useful but requires expertise that maybe your R&D or technical department doesn't have, or simply doesn't have the time or resources to deal with. That brings me to the last option, which is training a group of people to rate the attribute of interest and then letting them give you the ratings. This isn't quite as good as having actual consumers, but here we're trying to take the subjectivity out of it and make it objective, and well-trained humans can be great measuring instruments. I'm going to walk you through what we've done at Hood when we have a hard-to-measure attribute, using the case study of this chunky pour. This is our roadmap; I'll walk you through it and then we'll actually do it live. The first thing I want to get across is that the samples you produce from a DOE can be used for many purposes. I like to tell people that your samples are like gold and you should treat them like gold. They're very valuable. You may run a DOE thinking you're trying to answer one question, but something else might pop up later that those samples could answer as well. That has happened to me many times, so it's good to ask yourself: I've done all this work to make 17 different beverages. What else can I do with them? What else can I learn? In our case, we used these samples as a "calibration set" so that we could teach our humans, my colleagues, how to measure this chunky pour. So here's our method.
The first thing we do is review all the samples with a small group, maybe one, two, or three people who are really knowledgeable on the subject or responsible for the project. You look at all the samples and decide which ones should be used to train the raters. We're essentially trying to build a scale, and then we take that scale and get our friends, say 10, 15, or 20 of them, to rate the samples for us after we've trained them. Training is step two, and having them rate each video is step three. It doesn't have to be a video; it could be a picture, or it could be them actually pouring out the product if you have enough sample, and so on. You get the idea. Next, we take the average of all those ratings, look at the data to make sure there's nothing funky in there, and then we use those average values to build a model. Let's start with steps one and two. We're going to assume we've looked at all the videos, and the way we typically do it, because it's a little easier, is to start by answering the question: which one is the lowest in chunky pour? That would be this one right here, number one. I'm going to play each of these. Just to make it clear, this is our scale. It's a continuous scale from 1 to 10, and the 1 to 10 is somewhat arbitrary; if you have something that works better for you, then great. The video right above each point corresponds to that rating, so this first video corresponds to a one. As you can see while we're watching it, it pours nicely, with no rippling and no chunkiness. It pours as expected. Beautiful. That's the easy sample to identify. Then, within the sample set, we ask ourselves: which one is the worst? In this case, it was pretty obvious. I will tell you again, this product is not spoiled. Just by changing a few ingredients, it becomes so thick we can't even get it out of the bottle. So that's obviously a 10. Then we did a little bit of work to figure out which one should be considered a five, halfway in between. This one, you can see, still flows, but there is chunkiness to it. Then maybe a two and a half would be this one: it has a little less chunkiness, flows well, and with normal shaking it would probably be fine. So there's a little bit of subjectivity, but you add more people to make it more objective. Then the last one, a seven and a half. You can see it's very, very chunky. The only thing that really differentiates it from the ten is that we can still get it out of the bottle; it still flows, but as you can see, it's pretty thick. In basically this amount of time, I could train the people who are going to help us measure this chunky pour. Then, once they're trained, we have them rate the samples. I'd basically do what I just did, maybe taking a little more time to be specific about certain things we want them to look for.
If what you're having someone rate is a lot more complicated, then you'll probably need to spend more time training people. This one wasn't very complicated, and we're really just looking for people's first impression. After that, you have them rate all the videos. I like to use Microsoft Forms just because it's easy and I can get the data quickly, but you can use whatever you want, including paper, although that takes more time and I try to avoid it. Just to show you what our form looks like, here's a preview of it, as if you're doing it on your phone. I like to make everything as simple as possible, and everybody always has their phone, so being able to do it on a phone is my goal. It just says chunky pour, and then they go through and rate each one. So, chunky pour for treatment one: I'll say, I don't know, that one was a six. We're just asking people for their first impression. There are no right or wrong answers; usually people's first impression is right, which is why I ask them not to think too hard about it. Maybe number two is a ten, and number three is a three, I don't know. They would go through all of those. Then we get our data, and using JMP we average all those ratings and put them into the data table to build the model. So we're going to get out of PowerPoint for a second and go to Excel. This is what I get when I export the data from Microsoft Forms. Like I said, you don't have to use this; use whatever works for you. As you can see, ID is the rater number, not random, just an identifier for each person. I left it anonymous so we don't criticize people who maybe didn't do as well as everybody else. In this case, the data is real: I took this to a college food science class and had the students do it, so these are actual college students rating the videos. As you can see, we have a column for each treatment. So person one rated treatment one an eight, treatment two a four, treatment three a nine, and so on. We want to get this into JMP, so I like to use the JMP add-in in Excel, right here. As long as you're only highlighting one cell and you click Data Table, it will import everything. I've noticed that sometimes I'll accidentally have just a portion of the data highlighted, and if you click Data Table then, it will only import what you highlighted. So either highlight everything or highlight only one cell. Once you hit that Data Table button, you get something like this. So this is our data. Just to show you where we're trying to get to, because we have to manipulate this table a little bit: this is our data table for the DOE, with the runs and how much of ingredient A, B, and C went into each.
I put in, and we'll talk about this in a minute, my own rating of whether I thought each sample passed or failed. In the end, we need one more column called chunky pour continuous, holding the average rating for run one, run two, run three, and so on. If we look at this data table as it is today, it's not in that format, because we need all these columns to become rows, and we need the rows to be in one column. There are probably a thousand different ways to do this in JMP, and they're all good and all correct. I'm going to show you one way; it's just the one that works for me. First, we're going to stack all of the columns on top of each other. Then we're going to make a summary table that has the average, and maybe we'll also add the standard deviation for fun. But the very first thing I've always been taught to do when I get data is to graph it and look at the plot, so we're going to look at the distributions really quickly. If we go to Analyze, Distribution, we want to look at the distribution for all of the treatments. I'm just going to highlight them, put them in the columns role, and say OK. I'm just looking to see whether there is anything weird about this data that we should be concerned about. Looking at treatments 1, 2, 3, and so on, I'm looking for outliers. For example, for treatment three, everybody rated this sample between 1 and 6, but there was someone up here who rated it really high, and there's also someone up here who rated this other one high. What I like to do is click on that point, and it highlights where it came from. This row represents one rater, one person, so I can see how they rated everything, and you can see they tend to be an outlier. The nice thing in JMP is that once you highlight one row, it stays highlighted for all the other responses. So I can see that, yes, they rated treatment three higher and treatment four higher. We go down and look, and they're the opposite of everybody else. It seems that, for some reason, during the training they got confused and thought a higher number meant lower chunkiness, and vice versa. So, since I have this row highlighted, I'm going to close this; it stays highlighted. This is row one. I'm just going to delete this rater's data and then we'll move on. Now we feel pretty comfortable that the data is solid. Like I said, we're going to stack the columns. If we go to Tables, Stack, a dialog pops up and we just want to stack all 17 of the treatments. The nice thing in JMP 17 is that you now get a preview. I love the preview, because then I know whether I'm doing things right. What we see here is the ID, the rater, and then their chunky pour rating for treatment one (they gave it a five) and for treatment two (a seven). This is how we want the data structured, and we can change the column names.
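In JSL, that stacking step can be sketched roughly like this; the table and column names are placeholders for the real ones, and the renaming of the stacked columns comes next in the demo:

    // Stack the 17 treatment columns into one long table: one row per rater per treatment
    stacked = Data Table( "Chunky Pour Ratings" ) << Stack(
        Columns( :T1, :T2, :T3 ),          // ...continue through :T17 in the real table
        Source Label Column( "Label" ),    // which treatment the row came from
        Stacked Data Column( "Data" ),     // the rating itself
        Output Table( "Stacked Ratings" )
    );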
So instead of Data, we're just going to call the stacked column Chunky Pour Continuous, and for Label, I'm going to call it Run, because that's really what we're going to use it for in a minute. I stack it and say OK; that's how I want it. Now that we have the data table in this shape, we can use a summary table. Summary tables are a nice way to build a table of different statistics. We highlight the Chunky Pour Continuous column and, under Statistics, choose Mean; for fun, in case we want to use it, we'll also add Std Dev. That just gives us the overall mean and standard deviation, but if we want it per run, I highlight Run and put it in Group. Now when we look at the preview, we have runs one through 17, and conveniently they're in order, one, two, three, all the way to 17, with the mean and the standard deviation. So we say OK, and we have one more table. Now we're where we need to be, because I have each run as a row, a column for the average, and a column for the standard deviation. What I'm going to do is highlight this column, go to Edit, Copy with Column Names, then go to our original data table, make a new column, and say Edit, Paste with Column Names. There it is. I should have done both columns at the same time, but I didn't, so we'll do the second one as well. Okay, now we are ready to do our modeling. The first thing I want to show you is what we would get if we just used pass/fail, our binary response. We go to Analyze, Fit Model. Because I made this design in the Custom Design platform, JMP automatically knows what kind of design it is, which is why my model is already built. There's also a really convenient way if you knew this was a response surface design; say this script wasn't here. The macros are convenient: if I highlight ingredients A, B, and C and choose Macros, Response Surface, it pulls it all up. It already knows what I'm looking for, so that's helpful. I put the response, chunky pour pass/fail, in Y. What it gives us is nominal logistic. I'm not a statistician, so I'm not going to go into the statistics behind what it's doing; I'm just going to show you what you get out of it and what a scientist might look at. If I say Run, our target level is Pass, so it's going to give probabilities of passing. This is what we get. The first thing a scientist like me would probably look at is the effects summary. Looking at p-values, I'd say nothing is significant except ingredient A. There are other things we would look at, but I'm not going to cover those today.
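The launch just described can be scripted roughly like this; it's a sketch with placeholder column names that spells out the response surface effects (the & RS marking is what the Response Surface macro writes) together with the nominal pass/fail response, which is what leads JMP to the Nominal Logistic personality:

    // Response surface model with a binary (nominal) response
    Fit Model(
        Y( :Chunky Pour Pass Fail ),   // the Pass/Fail target level is chosen in the launch window
        Effects(
            :Ingredient A & RS, :Ingredient B & RS, :Ingredient C & RS,
            :Ingredient A * :Ingredient B, :Ingredient A * :Ingredient C, :Ingredient B * :Ingredient C,
            :Ingredient A * :Ingredient A, :Ingredient B * :Ingredient B, :Ingredient C * :Ingredient C
        ),
        Personality( "Nominal Logistic" ),
        Run
    );

Swapping the Y column for the averaged continuous rating and the personality for Standard Least Squares gives the second model discussed below.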
Instead, I want to just look at the profiler, because in our experience the profiler is the most useful and easiest thing for the scientists to interpret and to use when communicating with others. I'm going to make it a little bigger. On the left here, we get a probability of failing and a probability of passing. If we have 0.13 of ingredient A, 0.12 of ingredient B, and 0.45 of ingredient C (it's actually 0.13%, 0.12%, and 0.45%; I just didn't change the units, and it's a very small proportion of the formula that we're changing anyway), at those levels this says we will pass 100% of the time. If I move ingredient A up, say to 0.2, it now looks like we're going to pass only 64% of the time. You can see these curves; if I change ingredient B a little and ingredient C, maybe we can get back up to a point where we pass 98% of the time. You can play around with this. But the problem, like I said earlier, is that a pass over here may not be the same as a pass over there. We don't really know that with this information, and it's a hard thing for some people to wrap their heads around; it's just a probability of passing. What do I do if all I can get is an 85% pass rate? Say, hypothetically, that was the best we could do. What do I do with that? That's why we look for continuous responses. I'm going to close this and build that model again, except this time for the mean of our continuous scale. We remove chunky pour pass/fail and add the average here. We just say Run and keep it simple, with the effects screening report. Now you can see there's a lot more information than we got before. Before, if you remember, all we saw was that ingredient A had a really low p-value and everything else was around 0.99; the conclusion was that ingredient A does everything. Well, that's not actually the whole truth, as we can see here. Yes, ingredient A is the most important, the main effect of ingredient A right here, but B and C also have a role to play. While not as big, it's still an important role. We look at our actual-by-predicted plot, and it looks pretty healthy. Our lack of fit looks good. I'm not going to go into all the details of everything we check, mainly because I'm not a statistician; I look at the lack of fit and at the residuals, the studentized residuals, to see whether there's anything weird. Then I really come to the profiler, and now you can see this gives us a much different, much more complete picture: as I increase ingredient A, the chunky pour increases, but increasing the other two does too. They also have a role to play. If we say we want to minimize it, I think it's pretty obvious what the desirability is going to come out to be.
But just to show you, we go to the red triangle by the Prediction Profiler, choose Optimization and Desirability, and turn on the desirability functions. Then here, this is the desirability. You can change it in the red triangle, but I find it easier to hold Control and click on it. Now we can change what our goal is. In this case, we want to minimize, because we don't want chunky pour, right? We don't like it, and consumers don't like it either. So we say Minimize and OK. Now we can go back to Optimization and Desirability and say Maximize Desirability. As expected, it says: take these two ingredients out, put this one as low as you can, and you'll get the lowest chunky pour possible. In reality, we had some other constraints, so we couldn't do exactly that. There were other factors at play, but this definitely gave us a really good idea of where we needed to go, what was important, and how to control this chunky pour, to the point where, when we implemented the changes, the complaints went away. It's been good ever since. That's the nutshell version of how you can take something that is hard to measure, that's really subjective and binary, pass/fail or good/bad, and convert it into something continuous. It's a relatively simple method and you can use it for a number of things. As long as you have people available to help you, you can measure a lot of things that could be considered hard to measure. Where do we go from here at Hood? Just to give you an example of some other things we've encountered: this one, the chunky pour, is actually one of the easier ones. Let's say, and this is another product we were working on a long time ago, you have coffee and you're going to add some foam to it, and you want to understand how well that foam dissipates into the coffee. That's a tough thing to measure; we definitely don't have any instrumentation that can really measure it. Videos really helped us understand how we could measure it and get some useful information out of it. As you can see, we're trying to measure how it looks and how well it moves for this treatment versus, say, this treatment over here. You can see they're quite different: one moves really fast, the other really slowly; one looks kind of chunky, the other not so much. That's how we use this approach, and we use it quite often. I appreciate you taking the time to listen to my talk. I hope it has been useful and that you'll be able to find a way to implement it in your day-to-day work. Thank you.
In this presentation, we use the longitudinal analysis capability of the Structural Equation Modeling (SEM) platform in JMP to measure the impact of temperature and usage amount on detergent performance from the consumer perspective, using the overall rating of the product as a surrogate for overall performance. In this diary study measured over time, data were collected from consumers who used one of two products in three phases of four-week time intervals. Using data collected from Week 5 to Week 12, we modeled the trajectories of the performance of the detergent in relation to the temperature and usage (number of pods). Using standard SEM fit measures, we found the linear growth curve model fit the data well for the underlying latent structures. Results showed that one of the test products performed significantly better under cold-wash temperatures and used fewer pods than the other product. This result was a significant win for the company under the sustainability platform and for performance under colder wash conditions.

Hello, everyone. My name is Zhiwu Liang, statistician from the Procter & Gamble Company. I support the business at the Brussels Innovation Center for P&G. My main job is consumer survey data analysis. Today, Narayanan and I will present Growth Curve Modeling to Measure the Impact of Temperature and Usage Amount on Detergent Performance. Next slide, please. Here are the contents we will cover today. First, I will give a brief introduction to structural equation models and a bit about the data we will be using for our modeling. Then I will turn it over to Narayanan to introduce growth curve modeling and the model-building process, plus the JMP demo. After the demo, I will present the conclusion and next steps. Next slide, please. Structural equation modeling is a multivariate technique used to test a set of relationships between observed and latent variables by comparing the model-predicted covariance matrix with the observed covariance matrix. In SEM, observed (manifest) variables serve as indicators for latent variables; that is what the measurement model constructs. The latent variables then form a regression model to build a network, which we call the structural model. Here is an example with three latent variables and eight observed variables in the JMP SEM structure. As you can see in the bottom-left chart, the circles represent the latent variables, which are measured through their indicators. The cleaning latent variable, for instance, is indicated by the four squares representing the manifest variables: overall cleaning, stain removal, whiteness, and brightness. The same goes for the freshness latent variable, indicated by three manifest variables. If you look at the right side of the window, the Loadings outline shows the structure of the measurement model, how each latent variable relates to its indicators, and the Regressions outline at the bottom shows the two regression models: cleaning drives overall rating, and freshness drives overall rating. This is the structure of the structural equation model. Next slide, please. The data we use for our growth curve modeling come from a survey we conducted in France with 119 consumers.
We divided these 119 consumers into two groups. Sixty of them used the control product, which is the Ariel soluble unit dose (pods), marked as 0 in our data set. The other 59 consumers used the test product, an Ecolabel product, coded as 1. During the 12 weeks of the test, each consumer used their own product for the first four weeks. Then they went into the eight test weeks and used the assigned product, either the Ariel SUD or the Ecolabel product. For each wash, the consumer filled in a questionnaire providing information about their washing behavior, such as the washing temperature, the number of pods used, the soil level of the fabric (how dirty it was), and an overall rating of the product's performance. Our modeling objective is to test whether there is a product effect on the overall performance rating, a washing temperature effect on the overall performance rating, and an effect of the number of pods used on the overall rating for each wash. Next slide, please. Since every consumer has different washing habits and conditions, not all consumers have the same number of washes during the test weeks. Therefore, to give every consumer equal weight in our model-building data set, we first aggregated the data at the panelist level on a weekly basis: for each consumer and week, we take the average washing temperature, the average number of pods used, and the average overall rating across that week's loads. After aggregating the data, we used exploratory tools like JMP Graph Builder to check for linear trends in the overall rating, the temperature, and the number of pods used during the test weeks. Since, at the exploratory stage, OAR was pretty stable from week 9 to week 12, we use an intercept-only model for OAR. For the temperature and for the number of pods used, the exploratory stage showed either an increasing or a decreasing trend, so we use a linear growth model to describe the temperature indicators and the number-of-pods indicators. To explain the product impact, we also include the product manufacturer variable in our model. We first build a growth curve model for temperature and for number of pods, then use these latent variables to build a regression model from the product variable, the intercept of temperature, the slope of temperature, the intercept of the number of pods used, and the slope of the number of pods used to the intercept of OAR, giving the multivariate growth curve model. Now I will turn it over to Narayanan to introduce the latent growth curve model. Narayanan, it's your turn. Thank you, Zhiwu, for the great [inaudible 00:06:49]. Hi, everyone. My name is Narayanan. I am an adjunct professor at the University of Cincinnati, where I teach courses on data mining using JMP. I'd like to start by giving a very broad definition of latent growth curve modeling. As we go along, I may use the letters LGCM to represent latent growth curve modeling and SEM to represent structural equation modeling.
Latent growth curve modeling is basically a way to model longitudinal data within the SEM framework. Because it is built in the SEM framework, it has all the advantages of specifying and testing relationships, as Zhiwu was explaining with the structural equation modeling example. As a side note, I would like to mention that LGCM is actually an application of confirmatory factor analysis, which is itself a submodel within structural equation modeling, with an added mean structure; this will be explained when we get into JMP. One of the benefits of using the SEM framework is that we are able to evaluate model fit. Let us look at the statement there, which says that every model implies a covariance matrix and a mean structure. What this really means is that the observed covariance matrix and mean vector can be reproduced from the model parameter estimates, which are estimated using the latent growth curve model. The equality between the two is what many of these fit indices are actually testing. One of the oldest is the chi-square test, and the hypothesis it tests is listed there: equality between the population covariance matrix and mean vector and the model-predicted covariance matrix and mean vector. However, this test has some watch-outs. One is that the test statistic is a function of sample size, which means that larger samples will tend to reject the model even for trivial differences. Another is that the test is global and does not reflect local fit, such as could be measured by an R-square. Also, the fit being tested is exact, as specified in the hypothesis, and we know from the famous Box statement that all models are wrong; our models are only approximations. Because of this, several alternative fit measures have been proposed, and I'd like to mention three of them here. The first is the Root Mean Square Error of Approximation (RMSEA). This measures model misfit, adjusting for sample size, which was an issue with the chi-square test. It is a badness-of-fit measure, so lower numbers are better. One advantage of this measure is that it comes with a confidence interval, and the suggested threshold is that the upper bound of the confidence interval be less than 0.10. Next are the Comparative Fit Index (CFI) and the Non-Normed Fit Index. These are relative measures: they test how good your proposed model is compared to a baseline model, usually a model of no relationships. These are goodness-of-fit measures, and the suggested criterion is that they reach a threshold of at least 0.95. The last one is the Standardized Root Mean Square Residual (SRMR). This is essentially an average of the squared residuals across all the elements of the covariance matrix. It is a badness-of-fit measure, so again we are looking for smaller numbers, and the suggested threshold is a value below 0.08. On top of all this, finally, do not forget to check the actual standardized residuals.
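For reference, the two most commonly quoted indices above are usually written as follows (one common form; software can differ slightly, for example in using $N$ rather than $N-1$ in the RMSEA denominator), where $\chi^2_M$ and $df_M$ belong to the fitted model and $\chi^2_B$ and $df_B$ to the baseline model of no relationships:

$$\mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2_M - df_M,\, 0)}{df_M\,(N-1)}}, \qquad \mathrm{CFI} = 1 - \frac{\max(\chi^2_M - df_M,\, 0)}{\max(\chi^2_B - df_B,\, 0)}.$$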
In those standardized residuals, what we are looking for are numbers beyond the minus 2 to plus 2 range. The idea is to look at the totality of fit and not just any one measure. Having discussed fit measures, let us now look at the longitudinal processes we want to study. Zhiwu described three different processes. The first is the success criterion, as measured by the overall satisfaction rating from week 9 to week 12. Then we have two time-varying covariates, meaning they vary over time: the temperature setting at which the product was used, from week 5 to week 12, and the amount of product used, also from week 5 to week 12. Finally, we have an indicator variable for the type of product, which is a time-invariant covariate; it doesn't change with time. The modeling strategy we are going to use is this: first, we visualize the data using Graph Builder. Then we select a univariate latent growth curve model for each of the processes. Then we combine all of them into a multivariate LGCM. Finally, we test the hypothesis that Zhiwu proposed, which is how the product and the other growth factors impact overall satisfaction. We will choose the simplest model as we build. Now I am going to get into JMP. I am running JMP 18, which is an early adopter version, and I am going to show some scripts; I will also show you how I got to some of these from the JMP platforms. The first thing I want to do is visualize the overall satisfaction trajectories. These are basically individual lines from week 9 to week 12: overall satisfaction plotted for each of the 119 consumers, one trajectory per consumer. If you look at this particular consumer, row number 16, that person's trajectory is on a downward trend from week 9 through week 12. They started somewhere in the mid-50s, and by week 12 their satisfaction measure has come down to about 37.5 on a scale of 0-100. Let us look at another person. This person, who used the Ariel product, has a trajectory on an upward swing, going from the mid-70s to probably the early 90s by week 12. They are getting more and more satisfied week over week. Sorry for the bubble screen showing up. What we want to do is understand how different consumers experience satisfaction over the weeks, and the change in these processes for these consumers is what we want to model using LGCM. I'm going to turn on the script, LGCM of overall satisfaction. I have built three different models here, in which the latent variables correspond to an intercept (and, where relevant, slopes) for the repeated measures of overall satisfaction from week 9 through week 12, plus a fourth model that is a simplification of the first. The first is a no-growth model, which means different people have different levels of satisfaction at the beginning, week 9, but their trajectories flatten out and do not grow over time.
The second model is a linear growth model, which means the trajectories change in a linear fashion over time. The third is a quadratic model, where the trajectories change in a quadratic fashion over time. Finally, I've got a simplification of the first model, in which I assume homoscedasticity, that is, no change in the variance across time. I'm going to look at the fit measures I talked about and choose the model that fits best. What I'm looking for is low values of chi-square, high values of CFI (CFI goes on a scale from 0 to 1), and low values of RMSEA, which also goes on a scale from 0 to 1. It looks like all my models, no-growth, linear growth, and quadratic growth, fit the data equally well. However, I'm going to take the simplest of the models, because if I look at the estimates in the path diagram, many of the coefficients relating to the slope, whether the linear slope or the quadratic slope, are not significant, as shown by the dotted lines. In the linear growth model, we have an intercept, which measures the initial level of satisfaction, and a slope, which measures the rate of increase or decrease of satisfaction over time. All the estimates related to the slope are not significant, as indicated by the dotted lines, and the same is true for the quadratic model. Therefore, I'm going to take the simplest model, the no-growth model, for this process, overall satisfaction. Let me show you how I do this. In JMP, go under Analyze, pick Multivariate Methods, and choose the Structural Equation Models platform. Choose the repeated measures, in this case OAR from week 9 through week 12, drop them in the Model Variables box, and click OK. We now have these four repeated measures available as modeling variables in the path diagram area. I could build this model from scratch using the path diagram, but JMP has made it easier with shortcuts. Go under the Model Shortcuts red triangle, choose Longitudinal Analysis, and pick the linear latent growth curve or the intercept-only model. If I choose the intercept-only model, I get the path diagram you saw in my script, and if I run the model, I get the estimates and the fit statistics. If you want to add the linear growth model, do the same thing: come under Model Shortcuts, Longitudinal Analysis, and Linear Growth Curve Model. Now we have not only an initial level, represented by the intercept latent variable, but also the rate of growth of the process, represented by the slope latent variable. We can run this model: click Run, and you get the model estimates, which, as I showed you before, are not significant for the slope latent variable. You get the fit statistics right here in the Model Comparison table. So these models are easy to fit in JMP using the shortcuts available under the Model Shortcuts menu. I'm going to close the one I just created.
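To make those shortcut models concrete, for consumer $i$ and week $t$ (coded $t = 0, 1, 2, 3$ for weeks 9 through 12) the no-growth and linear growth specifications can be written as

$$\text{no growth: } y_{it} = \eta_{0i} + \varepsilon_{it}, \qquad \text{linear growth: } y_{it} = \eta_{0i} + \eta_{1i}\,t + \varepsilon_{it}, \qquad \eta_{0i} = \alpha_0 + \zeta_{0i}, \quad \eta_{1i} = \alpha_1 + \zeta_{1i},$$

where $\eta_{0i}$ and $\eta_{1i}$ are the intercept and slope latent variables, $\alpha_0$ and $\alpha_1$ are their means, and the $\zeta$ terms capture how individual consumers deviate from those means.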
We have so far built a univariate LGCM for a single process. I'm going to repeat the same thing for the other two growth processes, starting with the wash temperature trajectories. Let me show you how to do this in JMP. Under Graph, click on Graph Builder and open up the temperature variables. We want to look at temperature from week 5 through week 12, so drop them on the x-axis. For the type of graph, choose the last icon in the bar at the top; this is a parallel plot. There will be some smoothing associated with it, so drag the smoothness slider all the way to the left so there is no smoothing at all. Take the product variable, which is an indicator variable, and put it on Overlay. Now you get individual trajectories. If you want to add the average trajectory, choose the sixth icon on the toolbar from the left, holding the Shift key as you click it. Now you get the average trajectory of the temperature used over these eight weeks. Click Done to get the plot with more real estate. This is exactly the plot I showed using the script. You can clearly see that from week 7 onwards there might be growth in the temperature setting; it looks like people are increasing the temperature as time progresses from week 7 through week 12. I'm going to close this. We now have a graph to visualize the trajectories of the temperature setting, and we repeat the same exercise: choosing a model for that process. As before, I built the same three models: no-growth, linear growth, and quadratic growth. Looking at the fit statistics, this time we see a definite, significant improvement going from the no-growth to the linear growth model. The quadratic growth is only a marginal improvement over the linear growth model, and again, as before, all the estimates for the quadratic slope are not significant. To keep things simple, I'm going to choose the simpler model, which is linear growth for temperature. The last process is pod usage, the number of pods. Here we can clearly see an increasing trend, more so for the Ecolabel product, which means people are using more and more product when they use Ecolabel as compared to Ariel, the P&G product. I want to model this, so let me close that and click on the script for LGCM of pod usage. Looking at the fit statistics, again I see a good model fit, especially for the linear and the quadratic models. For the same reason as before, I'm going to choose the linear model. Here I want to show the estimates for the quadratic slope, and this is what I mean by not choosing the quadratic model: all the parameters pointing to it are not significant. Now we have a model for each of the three processes: a no-growth model for overall satisfaction, and a linear growth model for temperature and for pod usage. Now I'm going to put them all together in a multivariate latent growth curve model. This is basically all three processes put together.
Here, I want to show you the similarity between a confirmatory factor analysis model and a latent growth curve model, as was pointed out on an earlier slide. You can see that there is a mean structure added, the triangle with the number one, with lines going from it to each of the latent variables. If I right-click and use the Show option to hide the mean structure, you can see the familiar confirmatory factor analysis model, with latent variables and the indicators associated with each of them. We have a single latent variable, the intercept, for overall satisfaction. We have two latent variables for temperature, the initial intercept and the slope, and the same two latent variables for pod usage: the initial level, represented by int pods, and the rate of change of product usage, represented by slp pods, the slope of pods. Let me put the means back on. Now we can look at the estimates, which are really one of the important parts of the latent growth curve model. What we have here is an estimate of the initial level of satisfaction at week 9, since that was the starting time period for overall satisfaction; it's about 71 on a scale of 0-100. This is the average temperature setting at week 9, which is 36 degrees Celsius. Here is the product usage, 1.4 pouches, and here is the rate of change of product usage, the slope latent variable, which is about 0.02: people are using slightly more as time goes on. The overall fit of this model is also fairly good: CFI is exactly at the 0.95 threshold, and the upper bound of the RMSEA is definitely less than 0.1. Now we go to the last model, which is the hypothesis Zhiwu wanted to test, where we want to see whether the product indicator variable and the other growth factors have a significant impact on overall satisfaction. In order to remove clutter, I have not shown all the indicators. All we see are the circles, which represent the latent factors for each of the growth curve models, and a single product variable indicating the type of product. Again, let us look at the fit of this model. The fit is indeed good: we have 0.95 for the CFI and less than 0.1 for the upper bound of the RMSEA. We will look at more fit indices after we interpret some of the estimates. I'm going to interpret the solid lines, which are the significant coefficients. We have a significant product effect from the product variable to the intercept of overall satisfaction. This can be interpreted basically as a regression coefficient: the average level of satisfaction for the product coded 1 minus the average level of satisfaction for the product coded 0. Ariel is coded as product 0, so we have much more satisfaction with Ariel, a delta of negative 9 in favor of Ariel on a scale of 0-100. That is a big change, a delta in favor of the Ariel product. Let us look at the product effect on pods.
Let us look at the product effect on pods. Again, in the same way, this is the average amount of product used for the product coded 1 minus the product coded 0. This time, more of the Ecolabel product is being used, so if you are a manufacturer of Ariel, this is good news for you. The rate of change of product use is also higher for Ecolabel compared to Ariel, by about 0.02 pouches from week to week. Finally, we have the intercept of temperature having a negative impact on overall satisfaction, which means higher temperatures lead to less satisfaction. Remember, these are products which are marketed as cold-wash products; they should work better at cold temperatures, not at higher temperatures. I also want to show you where you can look for other fit statistics beyond what comes out in the model comparison table. Under the Structural Equation Models red triangle, if you check Fit Indices, which I have already checked, more fit indices are shown at the bottom. We want to look at the CFI and RMSEA, which we have already seen, and here is the Standardized Root Mean Square Residual, which I discussed; it is also exactly at the threshold of 0.08. All in all, in terms of fit indices, our model fits quite well. Finally, I told you not to forget the residuals. These are normalized residuals in terms of the measured variables. We have 21 measured variables, eight for pods, eight for temperature, four for overall satisfaction, and one for the product variable, so this is a 21-by-21 matrix. What we are looking for are numbers outside the plus 2 to minus 2 range. There are just too many numbers to look at in a table, but JMP produces a heatmap; the heatmap option is also under the red triangle. What we are looking for is dark red or dark blue. Here we have two dark reds, which are the relationships between pod usage at week 6 and temperature at week 12, and between pod usage at week 6 and temperature at week 9. Finally, we have one more, which is just a mirror image of the one here: the relationship between temperature at week 9 and temperature at week 10, which is not modeled. This could actually be modeled by adding an error covariance, which I did not do; if I had, the model would in fact be even better. I want to go back to the presentation and summarize what we have found. In terms of conclusions, we started with Graph Builder to visualize our trajectories, and we built latent growth curve models using the SEM platform. We extended from univariate to multivariate models. All our models, including the last one, had acceptable fit, in fact good fit. Product had a significant impact on OAR (overall satisfaction), which means Ariel is better than Ecolabel in terms of overall satisfaction, and a significant impact on the number of pods, which means less product was used for Ariel compared to Ecolabel, both overall and from week to week. The temperature intercept had a negative impact on OAR, which means people prefer a lower temperature setting to a higher one.
If you are a P&G manufacturer, this is good news for you, because Ariel works better than Ecolabel in the modeling framework we have presented. I'm going to turn it over to Zhiwu to talk about the next steps from these model results. Zhiwu? Thank you very much. Thank you, Narayanan, for an excellent presentation and a wonderful demo. As Narayanan mentioned, the modeling results show that the product has a significant impact on the overall satisfaction with the performance of the detergent products in our test. This result gives us the confidence to make a very clear claim: Ariel is well suited to cold washing and can be used in smaller amounts than regular products. The modeling also confirms the change in consumer behavior: if you use the Ariel product, more of your washing loads go to cold wash, using less energy and less product. We also plan to conduct a larger consumer study so that future modeling can include more covariates, such as additive usage, the wash cycle used for each wash, and the load size per wash. That is our next step. Next slide. Now we would like to take questions if you have any. Thank you very much for attending the presentation. We look forward to your questions, probably at the JMP Summit.
Effective data analytics is critical for the success of product development. Product development processes for adhesives and sealants involve data analytics for tasks needed at various project phases. Too many formulation chemists still depend on Excel for daily data recording, formulating, and analysis. While some chemists use JMP for statistical analysis and experimental design, JMP’s powerful and versatile data analytics should be used more universally for many daily tasks in product development. When shown how widely JMP can be used, more formulation chemists should see the benefits of adopting JMP as a daily tool.   This presentation shares examples of how easily JMP can be used in daily data analytics, in addition to the statistical analysis and DOE that most chemists use. A variety of examples are given, including handling data from literature and patent searches, analysis of raw materials, and a variety of formulation-related tasks such as formula stoichiometry calculation, mixing volume balance, data recording, and in tabulate/graph analysis. In addition, this presentation demonstrates how JMP is used for product selection, application trouble shooting, failure mode analysis, DOE, and more.     Hi ,  this  is  Stone  Cheng , I 'm  a  technical  director   in  Henkel  Corporation . I 've  been  using  JMP  for  more  than five  years  in  product  development . Today  I 'm  happy  to  share with  our  best  practice  in  utilizing  JMP  data  analytics in  product  development of  adhesive  and  sealants . My  presentation  has  two  parts . The  part  one  is  application  gallery where  JMP  used  in  various  stage   in  product  development  will  be  discussed, and  in  part  two  I  will  focus  on  using  JMP as  a  formulation  worksheet  with  the  demonstration . In  my  organization , folks  either  have  not  heard  of  JMP or  have  an  impression  that  JMP   is  an advanced  DOE  software . For  the  latter,  it  is  true , but  it 's  not  the  whole  truth . Since  there  are  other   DOE  software  options , it  is  hard  to  persuade  folks to  switch  away from  tool  they  are  familiar  with . JMP  is  an  excellent  all- around  data  analytics  tool . To  promote  JMP  adoption , we  need  example  to  demonstrate   its  power  in  the  data  analytics . In  my  presentation , the  example  are  taken  from  my  experience in  adhesive  and  sealants . About  Henkel . Henkel  is  a  22- billion  enterprise   with  two  business  unit : adhesive  technology and  consumer  brand  of  laundry , home,  and  beauty  care . I 'm  in  the  adhesive  technology  unit . We  are  global  adhesive  leader serving  800  industries with  more  than  20 ,000  products . Let 's  start  with  case  number  1 . One  customer  has  a  need , they  may  call  a  service  center asking  for  product  recommendation . For  example ,  a  customer  may  need   a  room  temperature  adhesive  hook with  the  shear  strength  between  1 ,500 -3 ,000 and  a  work  life  between  15 -30  minutes and  in  the  package  of  10 . What  we  need   is  a  searchable  product  selected  guide , what  is  showing  in  our  case  1 . Once  the  product  information  are  entered  into  the  JMP  table , one  can  use  the  tabulate   and  local  data  filter to  downselect  the  product based  on  the  customer  requirement as  shown  in  here . This  is  a  very,  very  powerful  tool in  the  preformation  stage . 
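A minimal JSL sketch of such a searchable selection guide, assuming a hypothetical product table with columns such as Product, Cure Type, Shear Strength, Work Life, and Package Size:

dt = Current Data Table();  // the product guide table
// One row per product with the attributes customers ask about
tab = dt << Tabulate(
	Add Table(
		Row Table( Grouping Columns( :Product ) ),
		Column Table( Analysis Columns( :Name("Shear Strength"), :Name("Work Life"), :Name("Package Size") ) )
	)
);
// Local data filter to down-select on the customer's requirements
tab << Local Data Filter(
	Add Filter( columns( :Name("Cure Type"), :Name("Shear Strength"), :Name("Work Life"), :Name("Package Size") ) )
);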
My team has been applying this tool to search formulations, patents, literature, and raw material specifications. In case number 2, multivariate analysis is applied to silver fillers, which are used in making conductive adhesives. From the multivariate analysis, if we focus on the surface area, it has a moderate negative correlation with the tap density and a stronger positive correlation with the lubricant amount, as measured by the weight loss at 530 °C. There is a graphical view here and a table format here. With this analysis, chemists can select the right silver package for the application. Case 3 is about literature. Literature is a great place for learning chemistry and formulation, and this particular cited paper illustrates how epoxy tensile strength is affected by a mixture of two amine hardeners. The results are described qualitatively by a table and by a graph of the stress-strain curves. For a formulator, a qualitative description is good, but learning via quantitative modeling is even better for prediction purposes. In case number 3, we take the literature data and create two factor columns. One is the epoxy/amine stoichiometry, or what we call the index; the other is the fraction of one of the amines, PAE, relative to the total amine. Using the Fit Model platform with an RSM model, it turns out that the quadratic effect of the index, together with the two main effects, is significant. The prediction profiler and the contour profiler are used to quantify the learning and give predictions. In case number 4, my group was assigned to support a technology platform that includes about 30 products. Since we were not involved in the original formulation development, studying the formulation family in this case is not trivial, and looking at a big Excel table with all the formulations is certainly not effective either. Case number 4 is an example of addressing this challenge. I selected the three most-used ingredients across these 29 formulations, monomer 1, monomer 2, and an oligomer, and then used hierarchical clustering. Right here, a group of formulations was identified with very small distances between them, implying that they are closely related; in fact, they differ only in the photoinitiator used for different curing wavelengths. We can add more ingredients one by one to the hierarchical clustering and learn the formulation family with this method. Most chemists analyze formulation performance in Excel. Case number 5 is a JMP tabulate that has the same data format as Excel: the various pieces of information for a formula are displayed in the same column, with a heading, the recipe, the processing and material characteristics, and all the results, just as you see here. To create such a data structure, we need to enter data in the JMP table in a special format, which will be discussed in detail in part two of my presentation.
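Before moving on, here is a minimal JSL sketch of the case 3 model launch. The column names (Index, PAE Fraction, Tensile Strength) are hypothetical stand-ins for the actual literature data:

dt = Current Data Table();
// Quadratic effect of the stoichiometric index plus the two main effects
dt << Fit Model(
	Y( :Name("Tensile Strength") ),
	Effects( :Index, :Name("PAE Fraction"), :Index * :Index ),
	Personality( "Standard Least Squares" ),
	Run
);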
Case 6 is a silicone study involving 23 formulations and more than 10 measured properties. It is quite overwhelming to try to analyze the raw data in such a big system. We first constructed a series of graphs with the property retention on Y and the initial property on X. For example, in the first graph here, the adhesion retention is on Y and the initial adhesion is on X, and we add reference lines marking the acceptance limit for each axis. When a formulation is selected, for example this point, which sits in the quadrant with acceptable initial adhesion and retention, all of its associated properties, such as tensile strength, elongation, and hardness, light up at the same time, along with its formulation, thanks to automatic highlighting and JMP's dynamic linking capability. Visual analysis in this way is very effective for chemists to understand the overall behavior of the system. In polymer science, we measure the modulus of a polymer as a function of temperature with an instrument called a dynamic mechanical analyzer, or DMA. DMA data, which include temperature, modulus, and tan delta, are typically transferred to Excel in a wide format for plotting. Overlaying several DMA curves for comparison is doable in Excel, but it is not a trivial effort. In case number 7, we stacked 74 DMA results together, and with Graph Builder we can compare DMA results very quickly, just by clicking. I cannot imagine doing the same in an Excel sheet with 222 columns, which is basically 74 samples times 3 signals per sample; it would be very difficult to handle in the Excel environment. Graph Builder is excellent at turning a very complicated Excel graph into a visually digestible analysis. In case number 8, the needle pull strength is illustrated in Graph Builder using four variables: eight adhesives across the top, three different radiation systems on the Y-axis, four radiation times, and two needle hub combinations. See how much easier it is to understand this JMP graph compared to the Excel graph right here. In case number 9, we conducted an accelerated aging study of four epoxy prototype formulations, measuring their initial adhesion on three substrates with three replicates. The aging conditions were two different temperatures and eight weeks of aging time with a two-week testing interval. The aging design and the data were initially recorded in Excel; we converted the Excel data into a JMP table with only seven columns, in a stacked format, and then made a graph. You will agree that the visualization in Graph Builder makes it much easier to see the aging performance than looking at the busy Excel table here.
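A minimal JSL sketch of the case 7 setup, assuming the 74 DMA runs have already been stacked into hypothetical columns Sample, Temperature, Signal, and Value:

dma = Current Data Table();
// Overlay DMA curves by sample, with one panel per signal (modulus, tan delta, ...)
gb = dma << Graph Builder(
	Variables( X( :Temperature ), Y( :Value ), Group Y( :Signal ), Overlay( :Sample ) ),
	Elements( Line( X, Y, Legend( 1 ) ) )
);
// Click through samples to pick which curves to compare
gb << Local Data Filter( Add Filter( columns( :Sample ) ) );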
In the formulation stage, we frequently need to optimize compositions. Case number 10 is an example where two catalysts in a polyurethane are optimized with a DOE; the factors are the catalyst ratio and the total catalyst amount. It is a 10-run face-centered central composite design, and the prediction profiler indicates that the total catalyst amount has a nonlinear effect on the work life. The white area in the contour profiler is the sweet spot of the design space with the desired work life, in this case 20-28. It is important for the chemist to select the green-highlighted white area for better production robustness rather than the area [inaudible 00:12:10] in blue, because the white area has a higher tolerance for changes in the amounts in case an operator makes a minor mistake. That is why it gives better production robustness. We routinely see chemists perform statistical analysis of adhesion data like what you see here, but we hardly see anybody presenting results from failure mode analysis. In JMP, failure mode analysis can be performed in two places: one is the contingency analysis in the Fit Y by X platform, and the second is Graph Builder. Case number 11 is an example applied to a silicone sealant, where the change in failure mode before and after high-temperature aging is clearly shown. Besides good adhesion, the adhesive rheology needs to be formulated so that the product can be applied effectively to the substrate. We had a project to develop a seam sealant that protects metal joints, applied by a spraying process. Case number 12 illustrates the complication of the spraying process with respect to variation in the sealant viscosity. We have three spraying variables, the pressure, the nozzle type, and the head travel speed, and one material factor, the viscosity. Initially we planned the experiment as a factorial DOE, but one of the factors, the spraying pressure, was very hard to control. We ended up performing a regression on the 40 runs with an RSM model, using the sprayed strand width and thickness as the responses. We got good models, with an R-squared of about 0.91 for width and 0.81 for thickness. The modeling results tell us that the spraying conditions need to be adjusted depending on the sealant viscosity, which is illustrated in this prediction profiler. Each processing parameter has its own machine limits and its own desirable operating limits. If the predicted processing variables fall outside those limits, then the chemist needs to redesign the formulation rheology and make sure that manufacturing has the process capability to meet the viscosity requirement. This example shows that formulation design and application constraints need to be considered side by side, and JMP is a very good tool for facilitating this type of study. Case number 13 is an example where JMP is used to handle huge instrumentation data sets. In testing thermal interface materials, the temperatures at different locations and the power consumption data are collected and uploaded to JMP. Once the data are in a JMP table, visualization and analysis of data sets with as many as 500,000 rows are still very manageable, with fast response. That means JMP can readily be used to handle instrumentation data.
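Returning briefly to the failure mode analysis in case 11, here is a minimal JSL sketch of the contingency launch. The column names are hypothetical: Failure Mode is a nominal column (for example cohesive versus adhesive failure), and Condition codes before versus after heat aging:

dt = Current Data Table();
// Contingency analysis (Fit Y by X with two nominal variables) plus a mosaic plot
dt << Contingency(
	Y( :Name("Failure Mode") ),
	X( :Condition ),
	Mosaic Plot( 1 ),
	Contingency Table( 1 )
);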
We had a project to apply adhesive to a sensor by stencil printing, and this application was challenged by a pinhole defect issue. A process engineer changed six processing variables in an ad hoc way and collected results from 21 runs. His data analysis did not reveal any particular trend, so JMP was then used for the troubleshooting, and partition analysis identified factor F as the key factor. Later on, we applied predictor screening and identified an additional factor, D, that needed further investigation. From JMP training, we learned that predictor screening can identify predictors that may be weak alone but strong when used in combination with other predictors. In the scale-up and manufacturing stage, when a production batch runs into an issue, raw material lot-to-lot analysis is one of the troubleshooting items used to isolate a potential raw material effect. This exercise is typically done in an Excel table, but when multiple raw materials and multiple lots of each raw material are involved, it is difficult to analyze the raw material effect by looking at a huge Excel table. In case number 15, a polyester formulation with three raw materials and about 45 separate lots is plotted versus the date of manufacture, with a color scale showing the gel time. This heat map gives the production engineer a visual way to determine whether a particular lot of raw material is the major cause of the out-of-spec batches. We turned the Excel table into a visual analysis. Statistical comparisons with t-tests or ANOVA are performed routinely in product development. A product benchmarking exercise typically involves multiple products run under various testing protocols, aiming at a very comprehensive product comparison. Case 16 is an example of statistical analysis involving a large combination of 23 products and more than 10 testing protocols. In JMP, a large volume of statistical analysis is not a challenge, since creating subtables is not required, unlike in other software. One can use the column switcher and the local data filter to create all the combinations of property and adhesive for statistical analysis, and the results of each analysis can be copied into a JMP journal to streamline the reporting. For case number 17, the needle bond testing of light-cured adhesives has historically shown high data variance. Case 17 uses JMP to summarize 18 reports of needle bond testing, which involve multiple lots of adhesive tested at various times. The needle pull strength and its COV are plotted in Graph Builder under various light radiation conditions and substrate combinations. With the local data filter here, one can easily change the selection criteria to get a comprehensive comparison of these adhesives and the consistency of their performance. When these results were presented, everyone was amazed at JMP's capability; it is so versatile and so powerful.
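Going back to the troubleshooting in case 14, a minimal JSL sketch of the two screening launches. The response name (Pinhole Count) and the placeholder factor names A through F for the six processing variables are illustrative assumptions:

dt = Current Data Table();
// Partition (decision tree) to find the dominant factor
dt << Partition( Y( :Name("Pinhole Count") ), X( :A, :B, :C, :D, :E, :F ) );
// Predictor screening to rank all six factors, including ones weak alone but strong in combination
dt << Predictor Screening( Y( :Name("Pinhole Count") ), X( :A, :B, :C, :D, :E, :F ) );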
This is the last case in the application gallery. In case number 18, we use the parallel plot feature in Graph Builder for a visual comparison of 15 performance items across 10 adhesives. Each performance item keeps its own unit and scale, which makes the comparison more quantitative than the qualitative comparison you get from the spider charts typically used in Excel. So far, in the 18 application gallery examples, the data have come from literature, instrumentation, and processing, with not much emphasis on formulation. Now we will switch gears to discuss formulation creation using worksheets, and it is a JMP-based worksheet, not the traditional Excel one. Before we show the JMP worksheet, let's discuss adhesive types. Broadly speaking, adhesives can be divided into two categories: one-component or two-component adhesives, that is, 1K or 2K. A 1K system, like the Super Glue everybody knows, requires no mixing and can be cured by moisture, by light, by heat, or by other methods. If we are dealing with a one-component heat-cure adhesive such as an epoxy, then we need to design and calculate the stoichiometry, or index, to balance the proportion of epoxy to amine hardener. In a two-component, or 2K, system, the mixture reacts at ambient temperature, so the two parts are kept apart before use. In a 2K system, the stoichiometry needs to be designed and calculated based on the desired mixing ratio, either by weight or by volume. So there are formulation calculations we need to perform, and this type of calculation and design has historically been done in Excel. Everybody knows that an Excel spreadsheet allows mixed data types in the same column and that its formulas can be applied at the individual cell level, which makes it very flexible as a formulation calculation worksheet. Formulations are typically organized in a column format like this: each column holds the full group of formulation information, such as the heading (the ID), the recipe ingredients, the formulation characteristics or processing parameters, followed by the results. The Excel-based worksheet is very useful, and everybody uses it because it is easy to learn, but it does come with some shortcomings. First of all, row matching: when you have a new ingredient or a new test result, you need to match it to the right row, and that takes time. Second, you may need to hide or unhide columns for comparison. Third, it is harder to analyze the data when results are spread across different tabs, in a tab-to-tab format, and it is also very difficult to make a graph from that kind of data structure. JMP offers webinars on going beyond the Excel spreadsheet with the various features listed here, but worksheet calculation is not emphasized.
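To make the stoichiometry calculation mentioned above concrete, here is a minimal JSL sketch of an index column. The column names (Epoxy Wt, Epoxy EEW, Amine Wt, Amine AHEW) and the convention (amine equivalents divided by epoxy equivalents) are illustrative assumptions; an actual worksheet may define the index differently.

dt = Current Data Table();
// Index = (amine equivalents) / (epoxy equivalents),
// where equivalents = weight / equivalent weight (EEW for epoxy, AHEW for amine)
dt << New Column( "Index",
	Numeric, "Continuous",
	Formula(
		( :Name("Amine Wt") / :Name("Amine AHEW") ) /
		( :Name("Epoxy Wt") / :Name("Epoxy EEW") )
	)
);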
Perhaps the reason worksheet calculation is not emphasized is the inherent data structure: each column cannot have a mixed data type, and a column formula applies to the entire column, which is not as versatile or flexible as Excel. Despite these constraints, we have developed a JMP worksheet with the following objectives in mind. It should have broad capability for formulation design, calculation, recording, and analysis, all in one, minimizing cross-platform copy-pasting. It should be easy to operate, with easy data entry, using JSL for much of the automation. And the final data set should be ready for machine learning exercises. Let's look at our Gen1 worksheet, which is for one-component systems. It includes four data groups: a formulation ID, a recipe, the material and processing characteristics, and the testing results. The four data groups are the same as in the earlier Excel worksheet, but here the structure is organized in columns from left to right, different from Excel, which runs from top to bottom. The data for groups 2, 3, and 4 are stacked and recorded in the same column, which has a numeric data type: all the recipe entries, all the testing results, and all the formulation characteristics are numeric and documented in the same column here. The data are also stacked by formulation, with formulation 1 here and formulation 2 here. With the stacked format, one can freely enter a new ingredient or a new test item without needing to match rows as in Excel. JSL scripts were also created to enable data analysis in either a tabular or a graphical format. This is the tabular view: a chemist can pick several formulation IDs and compare their recipes, characteristics, and performance in a very condensed format. This is very different from Excel, with no need to hide and unhide columns to bring formulations next to each other; it is much, much easier in the JMP format. Besides tabulation, one can make a graph of a property versus another property or versus the sample ID, although not versus ingredient percentage. This graph can be combined with the recipe table into a group using the dashboard operation, which makes for very effective visual analysis. For testing that involves multiple replicates, we typically just record the average result, but one can also enter the individual replicate data in the property column and then perform t-tests or ANOVA using this worksheet. If people do not want to enter data that way, there is another option: virtually link a data file holding the replicate results to the worksheet. That will be shown later in the presentation. So far, what you have seen is our Gen1 worksheet, which involves no formulation calculation. Chemists in my group have been using this tool for more than a year, and they have gotten used to its easy data entry and very powerful tabulation analysis.
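A minimal JSL sketch of that tabulated comparison on the stacked worksheet, assuming hypothetical columns Formulation ID, Item (ingredient, characteristic, or test name), and Value:

ws = Current Data Table();
// One column per selected formulation, one row per ingredient or property
tab = ws << Tabulate(
	Add Table(
		Row Table( Grouping Columns( :Item ) ),
		Column Table( Grouping Columns( :Name("Formulation ID") ), Analysis Columns( :Value ) )
	)
);
// Pick which formulations to place side by side
tab << Local Data Filter( Add Filter( columns( :Name("Formulation ID") ) ) );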
Next we're going to look at the Gen2 worksheet, which keeps the Gen1 features and adds formulation calculation for 1K and 2K systems. This worksheet also links to another JMP file that holds additional raw material information needed for the calculations. We have another worksheet, which we call Gen3, designed for solvent-borne systems; it also allows the formulator to incorporate masterbatches, but due to time constraints it will not be discussed here. This is our Gen2 worksheet. There are three sections: the heading and the formulation input section right here; the calculation output in the middle; and the third section with the processing and material characteristics and the testing results. Sections 1 and 3 are like those in Gen1, but section 2 is newly added. The raw material name column is used to link to the reference file that has the additional data needed for the calculation; you can see the virtual link symbol right here. After the chemist enters the formulation ID, they specify the part, the raw material name, and the initial weight. If they are building a 2K system, they also specify the mixing ratio, either by index, by volume, or by weight. The worksheet then outputs the mixing ratio characteristics, again by index, by volume, and by weight, and also provides the normalized composition, either by part, meaning parts A and B each sum to 100 on their own, or with A and B mixed together, which we call normalized by total. After reviewing this, the chemist can perform the experiment and then come back to enter the results right here. Also, in the property and material characteristics section, we have another column called Lookup. It extracts information from the calculation, such as the raw material fractions and ratios, and displays it automatically right here. The chemist then just needs to copy the parameters into the value-entry column, and they are automatically carried over to the two normalized-percentage columns for display. We also have three JSL scripts to facilitate the analysis. The first one shows the composition normalized by total, meaning A and B are mixed together and sum to 100; it shows the formula, the characteristics, and the results. The second JSL script normalizes by part: you can see the part A formulation and the part B formulation, each normalized to 100 by itself. With the third JSL script, we can change the formulation worksheet from the stacked format to the wide format, where the ID, the performance, the individual ingredients, and the characteristics each get their own column. With this format, one can graph a property versus an ingredient percentage, which cannot be done in the stacked format.
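A minimal JSL sketch of that stacked-to-wide step, using the same hypothetical stacked columns as before (Formulation ID, Item, Value):

ws = Current Data Table();
// One row per formulation; each ingredient or property becomes its own column
wide = ws << Split(
	Split By( :Item ),
	Split( :Value ),
	Group( :Name("Formulation ID") ),
	Remaining Columns( Drop All )
);

In the wide table, Graph Builder can then put an ingredient percentage on an axis, and the row-per-formulation shape is the structure needed for modeling and machine learning.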
One  can  also  looking for  the  correlation between  the  performance  or  the  performance with  the  formulation  characteristic . At  this  moment ,  I  like  to show  you  the  live  demonstration . This  is  the  formulation  worksheet I  just  showed  you  in  the  PowerPoint . Basically ,  we  have  the  heading . Then  we  have  a  formulation  input  section . We  have  a  calculation  between  n1  and  n2 . Anything  here  is  for  calculation . Then  we  have  the  last  section  here , that  is  a  performance  and  then the  property  material  characteristic . I  mentioned  that  we  have  a  JSL , allow  people  to  look at  this  result  easily . Let 's  look  at  this  one ,  JSL  by  total . We  can  easily  highlight  any  formulation  or  compare  2  and  8 , and  then  compare   their  formulation  and  their  result . These  are  mixed  together . We  can  look  at  it  by  part . Part  A  here  and  then  part  B  here . They  all  sum  up  to  a  hundred   by  themselves . Easily ,  we  can  compare … Oh  no ,  I  need  to  remove  this  one  first . I  can  compare  formulation  easily   by  manipulating  the  local  data  filter . Again  with  the  JSL ,  we  click  the  Join  All . We  are  turning  the  stack  format  into  a  wider  format . Each  row  belong  to  one  formulation with  the  heading  here , with  their  property , with  their  formulation , and  with  their  formulation characteristics  showing  right  here . For  machine  learning , we  can  highlight  a  role  ingredient and  then  just manually  add  zero  so  that  each ingredient  has  zero  or  whatever , and  then  now  we  can  do  this  one . We  can  create  a  summation  or  something , easy  to  operate  in  this . I 'm  going  to  show  you  next how  this  one  work  in  the  sense  that assuming  that  we 're  going to  create  a  formulation . I 'm  going  to  copy  the  heading . Sorry ,  I 'm  going  to  delete  everything  here because  I  create  this  one  already  before . I 'm  going  to  delete the  demonstration  one . I 'm  going  to  create  it  from  scratch by  copying  the  heading  here . I  change  the  name  to  Demonstration  here . I  will  copy  the  formulation  because I 'm  going  to  modify  formulation from  this  one ,  the  DOE  8 . Then  the  DOE  8  is  based  on   one -to -one  mixing  ratio  by  volume . But  in  this  new  one ,  we  could change  it  to  one -to -two  mixing . A  divided  by  B  is  one  divided  by  two , so  it  will  be  0 .5 . Then  I  copy  the  heading  including the  mixing  ratio  all  the  way  down . Now  all  the  calculation has  been  done  here . With  this  weight  percentage  I 'm  entering , it  showed  that  the  material  has  an  index model  ratio  A  to  B   to  be  0 .65 ,  which  is  too  low . We  need  to ,  using  our  chemistry knowledge ,  to  turn  this  around . In  this  case ,  for  example , I  make  this  one  2 . I  can  easily  make  this  one  into  1 .05 . That  is  the  range  I 'm  looking  for . Basically ,  assuming  it  is  the  design that  we  want ,  formulation  we  want , the  next  thing  we  want  to  do  is  to  copy some  of  the  testing   that  we  already  had  before , that  we  are  monitoring  before , but  without  the  results ,  of  course . We  have  a  new  result  here , so  I 'm  going  to  delete  that  one . But  we  also  want   to  add  additional  property which  for  example  is  viscosity measure  at  a  room  temperature . 
With  this  section  here , then  we  want  to  extend  our  heading to  specify  those  are  belong to  this  formulation . As  soon  as  I  specify  the  heading , the  Lookup  automatically  give  me the  information  such  as the  missing  characteristic . 1 .5  or  0 .5 ,  they  are  automatically  copied  to  here  through  the  Lookup  function and  then  the  feeder  loading   in  the  formulation  normalized  to  Total while  also  being  extracted , sum  up  together  and  put  it  right  here . Now  I  can  copy  this  information , put  them  in  value  enter , and  specify  my  mixer  is  number  2 , and  then  start  to  enter  my  results , time  that 's  going  to  be  80 , and  adhesion  450  assuming ,   viscosity  20 ,000 . I 'm  pretty  much  finished  everything , so  let 's  look  at  the  result  here . We  just  enter  Demonstration . This  one  was  based on  the  DOE  number  five . DOE  number  five  is  one  to  one  mixing and  this  Demo  is  only  one  to  two  mixing , and  we  added  the  viscosity result  right  here . It 's  very ,  very  easy . One  click  you  see  the  result and  in  the  format  it 's  very  easy to  understand  for  comparison . This  is  the  end  of  my  demonstration . Let  me  go  back  to  the  presentation  here . We  consider  the  JMP  worksheet  that  I 'm   just  showing  you  is  an  integrated  platform and  here  is  the  summary . The  worksheet  in  the  stack  format ,  here , is  used  for  formulation  design , calculation  and  for  recording  the  results . The  data  entry  of  raw  material which  is  needed  for  the  worksheet  is minimized  by  virtually  linked  with the  other  file  that  has  additional raw  material  information . JSL  was  widely  used  to  automate the  worksheet  output  to  the  tabulate , to  graphic ,  to  the  statistic  analysis , and  also  to  create  a  table   with  wide  data  format . The  wide  data  format ,   they  already  have  a  data  structure for  modeling  via  the  machine  learning and  also  allow  the  graphical  analysis using  the  ingredient  as  one  of  the  axis . Then  since  each  of  the  row in  this  wide  format is  a  unique  tool  formulation  ID , this  actually  can  be  used as  a  reference  table to  join  the  other  JMP  file  that  has   a  testing  result  that  has  a  replication . When  these  are  joined  together , then  we  can  plot  the  raw  data  and  do  statistic  analysis , either  as  function  of  the  ingredient   or  as  function  of  the  formulation  ID . This  JMP  Integrated  Worksheet  Platform truly  illustrates  it  is  an   all -in -one  platform ,  very ,  very  capable . In  summary ,  JMP  is  not just  an  advanced  DOE  software . JMP 's  data  analytics   has  been  effectively  utilized in  my  group  for  product  development at  various  stage  to  speed  up the  innovation  process . JMP -based  formulation  worksheet  is an  integrated  platform  that  feature broad  formulation  capability ,   all  in  one ,  easy  operation , and  machine  learning  ready  data  structure , and  more  and  more waiting  to  be  further  explored . With  this ,  thanks  for  your  attention and  I  also  like  to  acknowledge  the  people I  work  with  and  learning  to  JMP  together and  also  our  management  system for  supporting  JMP  adoption  initiative . Thank  you  very  much .
The development of innovative new products can be accelerated using statistically optimized DOE and regression modeling. With the goal of maximizing efficiency and reducing expense, it is often tempting to limit the collected data to key product attributes, such as customer specifications or internal quality metrics.  However, increasing the number of available responses by including a wider range of more fundamental measurements in the analysis can often be critical to success.   This talk covers examples from projects at PPG’s Coatings Innovation Center. We highlight the use of JMP from design through analysis to visualization of the results for a fractional factorial and a constrained mixture/process design. Using tools such as column switcher, multivariate plots, pairwise correlation and mixture profiler, we demonstrate how the inclusion of ancillary responses helped to develop a deeper understanding of the systems being studied and pinpoint the causes behind unexpected results.     My  name  is  David  Fenn. I  work  for  PPG  Industries at  their  Coatings  Innovation  Center just  outside  of  Pittsburgh. Today  I'm  going  to  be  talking about  some  of  our  experiences with  using  DOE for  novel  coatings  development, particularly  focusing  on  the  importance of  using  auxiliary  responses. The  agenda,  I'll  talk  a  little  bit about  research  methodology, and  in  case  you're  wondering what  I  mean  by  auxiliary  responses, I'll  define  that  in  this  section. Then  we'll  go on  to  two  examples. The  first  one,  a  new  resin  development for  architectural  coatings, and  the  second  one, a  protective  coating, then  we'll  finish  off with  a  few  general  observations. There  are  various  frameworks that  can  be  used  to  describe the  new  product  development  process. One  that  I  particularly  like is  shown  here,  DMADV. Here  we  have  five  stages. The  first  stage  is  to  define the  goals  of  the  project. What  are  we  trying  to  achieve? Then  we  get  into  measurement. What  are  the  critical  characteristics we  need  to  measure, and  do  we  have  suitable  processes in  place  to  measure  them? Then  we  think  about  analyze . What  factors  can  we  change to  make  improvements? Then  onto  the  design  stage, where  we  deliberately  manipulate those  factors and  the  levels  of  those  factors to  try  and  affect  an  improvement and  lead  to  optimum  performance. Then  once  we  have  an  advanced  prototype, we  get  onto  verification, thinking  about  will  our  solution  work in  the  real  world? What  are  the  important  things we  need  to  think  about when  we  apply  this  framework? Well,  in  terms  of  the  defined  stage, the  goal  needs  to  be  clear so  that  the  whole  organization  has the  same  understanding  of  the  goals, and  it  needs  to  be  impactful. If  we're  successful and  we  deliver  a  solution, will  it  fill  a  real  unmet  need in  the  marketplace and  be  a  successful  product? If  we  skip  now  to  the  end, any  solution  we  apply needs  to  be  cost- effective. It  needs  to  be  robust. Then  the  middle  of  this  process, we  want  to  get  through  this  process as  quickly  and  as  efficiently  as  we  can. We  want  to  deliver  the  product to  the  marketplace  as  soon  as  we  can, and  we  want  to  expend the  minimum  amount  of  cash and  the  minimum  amount of  resource  to  do  that. 
Clearly,  DOEs  and  a  lot  of  the  tools that  are  available  in  JMP are  well  set  up to  make  us  succeed  in  this  area. One  of  the  tools  that  I  like  to  use particularly  early  on  in  a  project is  a  process  map. This  is  a  very  particular  type of  process  map. It's  really  mapping the  process  of  carrying  out the  research  and  development. I'm  showing  here a  simplified  example  of  a  process  map to  develop  an  automotive  base  coat. We  have  all  the  steps that  are  involved  in  our  experiment. We  make  a  resin, we  use  that  resin to  make  a  base  coat  paint, we  spray  apply  that  base  coat onto  a  substrate, we  apply  a  top  coat onto  that  base  coat, then  we  cure  those  coatings  together, and  then  we  measure  the  properties that  we  get  from  all  of  that. All  of  these  steps, we  list  all  of  the  factors  that  might  play a  role  in  these  separate  steps. This  is  useful  for  a  number  of  reasons. First  of  all, it  gives  everybody  in  the  team a  unified  understanding of  what  the  process  is  we're  dealing  with and  how  are  we  going  to  affect  it. It  also  allows  us  to  capture all  of  the  variables  we  can  think  of that  might  play  a  role in  the  various  steps so  we  don't  overlook  anything. Then  it's  a  good  starting  point for  thinking  about  which  of  these are  we  going  to  try  and  manipulate, which  of  these  are  we  going  to  focus  on to  try  and  deliver  a  successful  project? These  factors  are  further subdivided  and  categorized. First,  we  have  our  Xs. These  are  the  variables that  we  can  manipulate to  try  and  affect  an  improvement in  our  product  or  our  process. Then  we  have  our  big  Ys. These  probably  appear in  the  specification  of  the  product. These  are  what we're  really  trying  to  achieve. This  is  what  the  customer really  cares  about, what  the  customer  will  pay  for. Next,  we  have  our  Ns, noise  variables. These  could  be  variables that  we  may  be  not  controlling, we're  not  deliberately  manipulating, but  things  that  could  introduce noise  into  the  process, either  during  the  experiments, during  the  new  product  development, or  in  the  end  application, in  the  manufacture  of  the  product or  the  end  use  of  the  product. Then  finally, the  subject  of  today's  talk, we  have  our  auxiliary  responses, which  we  label  as  little  Ys. These  might  not  appear in  the  specification, the  customer  might  not  even be  aware  of  these, but  they're  measurements  we  can  take at  various  stages  of  the  process that  might  tell  us  something about  what's  going  on. I  said  in  the  previous  slide that  one  of  our  goals is  to  get  through this  whole  process  quickly, as in  efficiently  as  we  possibly  can. One  question  that  raises  is, why  don't  we  just  measure  our  big  Ys? We  have  the  ability  to  carry  out  DOE's. We  could  optimize  for  our   big Ys, we  could  build  predictive  models. Isn't  that  all  we  need  to  do? Why  should  we  spend  time? Why  should  we  spend  money measuring  some  of  these   little Ys when  they're  not the  real  goal  of  the  outcome? Well,  I  hope  in  the  next  couple of  examples  that  I  can  show  you, some  cases  where  carefully  selecting these  little  Ys and  doing  some  good  analysis can  be  really  critical to  the  success  of  a  project. 
Our  first  example  here, the  development  of  a  new  resin for  architectural  coatings. The  goal  was  to  come  up with  a  single  resin  that  could  meet all  of  the  performance  requirements across  several  product  lines in  several  countries. Our  starting  point  was, we  had  no  single  resin that  could  meet  all  those  requirements. We  were  using  different  resins in  different  products, different  resins  in  different  countries, and  we  needed  to  come  up  with  a  solution that  allowed  us  to  reduce the  complexity  there. Our  early  prototype  struggled in  a  number  of  areas, but  one  particular  area was  tint  strength. The  way  these  white  base  paints would  be  used is  if  I  go  into  a  store  and  request a  paint  of  a  particular  color to  paint  the  walls  of  my  house, the  store  will  take  that  white  base  paint and  add  specified  amounts of  concentrated  color  toners  to  that  paint to  create  a  specific  color. It's  really  critical  to  be  able to  hit  a  target tint  strength, which  is  the  measurement of  how  quickly  that  color  will  change as  we  add  a  certain  amount of  a  particular  toner. We  need  to  be  able  to  control  that and  hit  it  reproducibly to  achieve  the  wide  spectrum of  colors  we  need  to  achieve. We  also  had  a  few  issues in  terms  of  poor  heat  age  stability and  poor  resin  reproducibility. Our  approach  was  to  carry  out some  sequential  DOE's to  learn  how  to  control  tint  strengths and  some  of  the  other  factors. I'm  showing  the  progress  on  this  plot  at the  bottom  left-hand  side  of  this  screen. Before  we  started  the  DOE's, just  some  of  the  exploratory  experiments; the  orange  bar  represents  the  range of  tint  strengths  we  were  able  to  achieve. We  can  see  that  is  far  below the  target  range  of  tint  strengths  shown by  this  green  bar  on  the  plot. As  we  carried  out  the  DOE's, we  learned  how  to  control  tint  strength. We  were  able  to  increase  it until  towards  the  end  of  the  project when  we  were  doing our  optimization  DOE's, we  were  nicely  centered around  this  target  tint  strength. We  were  able  to  build  predictive  models and  use  those in  conjunction  with  predictive  models for  some  of  the  other  key  properties to  identify  white  space  where  we  met  all of  the  target  properties  at  the  same  time. But  rather  than  talk about  the  whole  project, I  want  to  now  focus on  one  particular  DOE  that  we  carried  out. The  goal  of  this  DOE  was to  confirm  and  quantify something  we'd  observed  previously, that  the  particle  size of  the  resin  we  were  making was  a  big  factor in  controlling  tint  strength. These  resins  are,  in  effect, dispersions  of  little  particles of  resin  in  water, and  it  was  the  size  of  those  particles that  seemed  to  be  important. We  were  also  using what  we  call  a  co-surfactant to  help  disperse  those  particles, and  we  had  a  few  choices about  where  in  the  process we  could  add  that  co-surfactant. We  wanted  to  look at  a  couple  of  candidates for  the  addition  point of  that  co-surfactant to  see  if  it  affected  the  key  properties. Then  finally,  up  until  this  point, all  the  resins we've made, we've  made  at the  Coatings  Innovation  Center. 
We  now  wanted  to  check, could  we  make  these  resins  reproducibly across  three  different  locations? The  DOE  we  carried  out  is  shown on  the  right-hand  side  here. We  have  three  levels for  our  target  particle  size. We  have  two  levels  for  the  addition  point of  the  co-surfactant. That  gives  us a  full  factorial  DOE  with  six  runs. Then  we  replicated  that  DOE across  three  different  laboratories. I'll  go  straight  into  JMP and  I'll  show  you what  the  data  table  looks  like. You  can  see  here  we  have the  original  data  table,  the  DOE, but  now  we  have a  whole  collection  of  data that  we  gathered  during  the  DOE. The  first  thing  we'll  do  is, we'll  look  at  what  we  learned about  tint  strength. I've  already  built  here a  reduced  model  for  tint  strength. If  we  have  a  look at  the  effect  summary  to  start  with, we  can  see  that  the  location of  addition  of  the  co-surfactant wasn't  a  factor in  determining  tint  strength. That  dropped  out  of  the  model. But  we  do  see that  the  target  particle  size and  the  reactor  location  were  factors, as  well  as  the  interaction between  target  particle  size and  reactor  location. If  we  look  up  at our  actual  by predicted  plot, we  can  see  it  looks like  a  pretty  nice  model. We've  got  a  nice  R-square, and  everything  looks  to  be in  pretty  good  shape. Then  probably  the  best  way of  understanding  what's  happening and  what  this  model  is  telling  us is  to  look  at  the  prediction  profiler here  at  the  bottom. We  see  our  anticipated  effect of  target  particle  size  on  tint  strength. As  we  increase  target  particle  size, we  get  higher  tint  strength. Then  if  we  look  across at  reactor  location, what  we  see  is  that   Lab A  and  Lab C are  giving  broadly  similar  results. But  if  we  look  at   Lab B, first  of  all, we  see  that  the  tint  strength that  we  get  from   Lab B is  significantly  higher than  we  were  getting  from   Lab A  or   Lab C. We  also  see that  the  dependence  on  particle  size is  much  less  from   Lab B than  we  saw  from  the  other  two  labs. This  was  a  problem  for  us. Whenever  we  see  that  different  labs are  producing  different  results with  the  same  resin and the  same  process, it  can  be  a  really  long  task to  work  out  what's  going  on  here. There's  so  many  potential  candidates for  the  cause of  this  poor  reproducibility. At  this  stage, we  were  very  concerned that  it  was  going  to  take  us a  long  time  to  resolve  this, that it  was  going  to  derail  the  project, and  we're  going  to  miss our  target  launch  dates. Before  we  went  into  any  specific  activity to  try  and  address  this  problem, the  obvious  first  step  was  to  look at  the  data  that  we'd  already  gathered in  this  data  table and  see  if  there  were  any  clues that  could  maybe  give  us  a  hint as  to  why   Lab B was  giving  different  properties. Whenever  I  see a  wide  data  table  like  we've  got  here, one  of  the  first  tools  that  I  always  go  to is  the  column  switcher. The  way  in  this  case that  I  will  implement  this is  the  first  step is  to  build  a  variability  chart that  best  shows the  problem  that  we're  having. 
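For reference, the variability chart and column switcher workflow about to be shown interactively can also be launched from JSL. This is a minimal sketch using the column names as described in this study; treat the exact names as illustrative:

dt = Current Data Table();
// Variability chart of tint strength by reactor location and target particle size
vc = dt << Variability Chart(
	Y( :Name("Initial Tint Strength") ),
	X( :Name("Reactor Location"), :Name("Target Particle Size") ),
	Connect Cell Means( 1 )
);
// Column switcher: keep the plot layout, swap the response for any other measurement
vc << Column Switcher(
	:Name("Initial Tint Strength"),
	{ :Name("Particle Size"), :Name("Conductivity"), :Name("pH at 1 Hour"), :Name("Molecular Weight") }
);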
I've  pre-built  a  variability  chart  here where  I've  got  target  particle  size and  reactor  location  as  my  X-axis and  I've  got  the  initial  tint  strength as  my  Y-axis. The  first  task  is  to  get  this into  a  format  that  the  best  represents the  problem  we're  dealing  with. The  first  thing  I'll  do is  swap  over  my  target  particle  size and  react or  location. I'll  also  add  and  connect  the  cell  means to  add  some  lines  here. Now  I'm  pretty  happy  with  this. I  think  this  nicely  reflects the  problem  that  we're  dealing  with. We  can  see   Lab A  and   Lab C very  similar  results, but   Lab B,  higher  tint  strength and  less  dependence on  tint  strength  and  particle  size. Now  I  can  use  my  column  switcher, and  what  this  will  allow  me  to  do is  keep  this  plot  in  exactly  this  format, but  quickly  switch  out  this  Y-axis, the  initial  tint  strength, for  any  other  variable that  I've  got  in  my  data  table. I'll  go  into  the  redo  platform and  select  the  column  switcher. Now  I  can  select any  of  the  other  factors  in  my  data  table. I'm  just  going  to  select  everything that  I've  got  in  my  data  table. Then  when  I  hit  OK, I  now  have  this  column  switcher to  the  left  of  my  plot. I  can  click  on  any  of  these  factors and  it  will  change  this  axis  but  keep the  plot  in  exactly  the  same  format. If  I  select  particle  size, I  can  see  now  I'm  plotting my  actual  measured  particle  size against  target  particle  size and  reactor  location, exactly  the  same  format. It  looks  like  in  this  case, all  three  labs  are  giving pretty  similar  results. I'm  not  seeing  anything  that  gives  me a  clue  as  to  what's  going  on, but  I  can  quickly  just  scroll through  this  whole  data  set. I'm  seeing  mostly  noise  right  now. I'm  not  seeing  any  patterns that  seem  to  be  particularly  helpful, but  I'll  keep  going. When  I  get  to  this  plot  here, so  now  I'm  plotting  conductivity, I  see  interestingly that   Lab B  is  making  resins with  much  higher  conductivity than  Lab A  and   Lab C. That's  one  useful  observation. I'll  keep  going. This  next  one, this  is  actually  another measurement  of  conductivity after  the  resin  has  been in  a  hot  room  for  a  week, showing  the  same  thing, still  confirming  that   Lab B is  giving  higher  conductivity. I'll  keep  going. Mostly  noise, maybe  a  little  bit  of  an  indication that  the  molecular  weight from   Lab B  is  slightly  lower. I'll  keep  going. Again,  still  not  seeing  anything that  interesting,  mostly  noise. But  then  I  get  to  this  plot  here, and  again, now  we're  plotting  the  pH of  the  resins  one  hour  into  the  process, so  early  into  the  process, the  acidity  or  pH  of  the  resin. Lab B,  again, is  different  from   Lab A  and   Lab C. It's  giving  me  much  higher  pH. Keep  going  just  to  check if  there's  anything  else. This  was  the  initial  plot we  started  with  of  initial  tint  strength, and  then  the  last  one is  the  paint  viscosity, where  everything  looks  pretty  similar. Really  quickly  using  column  switcher, I  found  out  that  not  only  is   Lab B  making resins  with  higher  tint  strength, it's  making  resins with  higher  conductivity and  higher  acidity,  higher  pH. What  could  that  be  telling  us? What  might  be  causing  higher  pH and  higher  conductivity? 
Well, these resins, I said, were a dispersion of a polymer particle in water. Anything that's changing the conductivity or the pH is going to be in the water phase. It's not going to be in the resin phase. What we did was we precipitated out the resin by centrifuge and just analyzed the water phase. We carried out a lot of analysis, but one of the things we worked on, I'm showing on this plot on the right-hand side: PPMs, parts per million, of phosphorus and sulfur in that water phase. If I look at the orange bars to start with, the sulfur, I can see all of the resins from all three labs are very similar, but the blue bars, the level of phosphorus, Lab B is making resins with about four times as much phosphorus as they were making from Lab A and Lab C. When we looked at the recipe for making these resins, there's only one raw material that brings in phosphorus. On a bit of further investigation, what we found out was the supplier that was delivering this material to Lab B was mistakenly supplying something that was four times as concentrated as it should have been, and four times as concentrated as they were providing to Lab A and Lab C. With the auxiliary data that we looked at in this DOE, using the column switcher, we were able to really quickly pinpoint the cause of that problem. We didn't have to expend time to get there. The project stayed on track, and there was even a bonus. We learned that increasing the level of this material with the phosphorus was another tool we had to increase the tint strength. We would probably never have been aware of that if we hadn't carried out this analysis and had this happy accident. That's the first example. We go on to the second example now. In this case, we're dealing with a protective coating, a coating that's designed to go over metal substrates like iron and steel and protect them from corrosion. We have five experimental resins that we want to look at: a resin that's designed to give good corrosion resistance, and then four resins that are designed to improve the flexibility of the coating. The first three of these resins are added in the first stage of the coating preparation, and then the last two get added in a separate, later step. We have two questions here we're trying to answer. How do the resins affect corrosion and flexibility, and what is the best combination of the levels of these resins to give us the best combination of corrosion and flexibility? Again, we used DOE, we were able to build predictive models, and here we were using the mixture profiler to identify some white space that we can work in. This DOE is a little bit more complicated than the first one, so I'm trying to represent pictorially what we were dealing with here. If we look at the first stage of our coating manufacture, in addition to our three experimental resins, we have a main resin at a fixed level. In effect, our three experimental resins are three mixture variables here because they form the rest of this 100%. They add up to a constant sum of 57.77%. That's the three mixture variables we're dealing with there.
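To put that stage-one structure into symbols (the resin labels and individual ranges here are generic placeholders, not the actual formulation limits), with the main resin held at its fixed level, the three experimental resins behave as a constrained mixture:

$$x_1 + x_2 + x_3 \;=\; 57.77\%, \qquad L_i \le x_i \le U_i \quad (i = 1, 2, 3),$$

and the multi-component constraints described next, such as the bound on the sum of two of the flexibilising resins, enter as additional linear inequalities over these factors and the two stage-two factors.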
Then in stage two, we can deal with our other two resins as independent variables because they're not part of that mixture. So we have three mixture variables and two independent variables. We also have some levels that the formulators decided they wanted to work in, based on prior experience, for all of these resins. Then we have some constraints on the combinations we're dealing with. For example, at the start here, we want the sum of Flex2 and Flex3 to be more than 10% but less than 30%. There are some other constraints as well. A fairly complicated DOE, but using custom design, it's relatively straightforward to build this DOE. There are definitely some tips and tricks in terms of how to build the DOE, what model to use, and how to analyze that data. I don't have time to go through that today, but I'd be perfectly happy to talk about that offline if anybody's interested. But let's go straight into JMP and we'll look at this example. Here we have the DOE that we carried out. It was a 16-run DOE. If we go right across to the left-hand side, we have our three mixture variables and our two process variables. We've measured our flexibility and corrosion, and then we have a lot of other auxiliary responses we've measured. I was able to build good predictive models for flexibility and corrosion. What I'm going to do is just show you those models in the profiler, just to help us understand what we're learning and what's going on. I'll add those two predictive models that I built to my profiler, and then I get my profiler here. I can see, first of all, I'm plotting flexibility and corrosion here. Lower numbers are better for both of these responses. Lower numbers for flexibility and lower numbers for corrosion are what we're targeting. I can see that as I add my corrosion resin, if I increase the level, I get better corrosion performance, but unfortunately, I get worse flexibility. The opposite is true for most of these flexibilising resins. As I add more of these, I'm getting better flexibility, but worse corrosion. This is something that's very common in coatings development and lots of other areas. It seems like there's always a pair of properties where if we improve one of them, we always make the other one worse. But if I come across to my Flexibiliser 4 resin, something really interesting happens here: as I add more of this resin, I get better flexibility, but I don't suffer at all in terms of corrosion. This is going to be a really useful tool for us to optimize the combination of flexibility and corrosion. But I'd like to understand a bit more about the science behind this. What's happening? What's unusual about Flex4 that allows us to improve our flexibility without degrading corrosion? Again, I want to use all of this auxiliary data that I've gathered in my data table to help me understand that. I want to look through this table, and I'm going to use a different tool this time. I'm going to use Multivariate. If I select that, this allows me to basically look at the correlation between all the combinations of factors that are in my data table.
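Outside JMP, a rough Python sketch of the same kind of pairwise-correlation table, sorted by significance probability as in the walkthrough that follows, might look like this. The data frame and all of the column names below are hypothetical stand-ins for the real 16-run table, not the actual data.

```python
# Sketch only: a pandas/SciPy analogue of a pairwise-correlation table sorted
# ascending by significance probability. All column names are hypothetical.
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "Flex4 level": rng.uniform(0, 10, 16),
    "Secondary TG": rng.normal(20, 8, 16),
    "Primary TG": rng.normal(60, 5, 16),
    "Corrosion": rng.normal(5, 1, 16),
    "Flexibility": rng.normal(3, 1, 16),
})

rows = []
for a, b in combinations(df.columns, 2):
    r, p = pearsonr(df[a], df[b])                      # correlation and its p-value
    rows.append({"Variable": a, "by Variable": b, "Correlation": r, "Signif Prob": p})

pairs = pd.DataFrame(rows).sort_values("Signif Prob")  # smallest p-values first
print(pairs.to_string(index=False))
```

In JMP itself, the Multivariate platform produces this table directly, as shown next.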
I'll select everything that I measured, add it in the Y columns, and just hit OK. This generates my multivariate report. The first thing I see is this table here where I've got all the correlations for all the pairs of combinations of the factors that I put in. I can see there are some pretty nice correlations here. I'm seeing some fairly strong correlations, but it's a little bit difficult, a bit overwhelming, to go through all this and pick out any interesting patterns. I've also got my scatterplot here, and if I add a fit line to these scatterplots, again, I'm seeing some fairly strong correlations, but still I think this is a bit overwhelming to dive straight into. The tool that I like to use to start with here is pairwise correlations. If I select that, this generates a new table where I've got all the possible pairs of variables and it's giving me the correlation. I can sort this table based on any column. I'm going to sort by the significance probability and I'll make it ascending, because I want my low significance probabilities to be at the top of my table. Then if I hit OK, I can see that the first and strongest correlation I get, in fact, involves this Flexibiliser Resin 4 that was giving us this interesting behavior. I can see a strong correlation with the secondary TG, or TG2. This is a glass transition temperature. The glass transition temperature is the temperature at which a coating changes from being a glassy, hard material to a soft, rubbery, flexible material. My Flex4 level is correlating here with a secondary glass transition temperature that I'm measuring. And I can see also, if I go a little bit further down, that my primary glass transition temperature, the main glass transition, correlates strongly with the corrosion. Scientifically, I think these are interesting observations. What I did based on that is I also built predictive models for my primary TG and for my secondary TG. Now I can look at my profiler, but I can include all four of my predictive models. I'll include the two I did before, flexibility and corrosion, but also my primary TG and secondary TG. Now what I can see is that the first two rows are exactly what we were looking at before. If I look at my primary TG, I can see what happens whatever I do in terms of adding new resin: for example, if I add more of my corrosion resin, I'm increasing my primary TG, and that's correlating with an improvement in corrosion. The flexibilising resins, if I add more of those, I'm decreasing my primary TG and making my corrosion worse. That primary TG does seem to correlate, as the multivariate is showing, very well with corrosion. If I look at my Flex4 resin, it was having no effect on corrosion and it's having no effect on my primary TG, so it's different from my other flexibilising resins. But I can see for my secondary TG, as I add more of my Flex4, it's rapidly decreasing the secondary TG. The other resins really don't have much effect on secondary TG. What does that mean? What can I learn from that?
Well,  any  material  that  has  multiple  TGs, glass  transition  temperatures, it's  usually  a  sign  that  it's a  multi-phase  raw  material. It's  not  a  homogeneous  material. That  was  the  case  here when  we  did  some  microscopy. What  we  saw  was  our  coating  had a  continuous  phase  shown by  this  gray  material  here, but  it  had  dispersed in  that  a  secondary  phase. The  primary  glass  transition  temperature was  correlating with  that  primary  continuous  phase and  the  secondary lower  glass  transition  temperature was  correlating  to  this  secondary  phase that  we  have  here. We  had  a  hard  glassy  primary  phase and  then  a  soft  rubbery  secondary  phase. Why  that's  important  is usually  high  glass  transition  temperature does  lead  to  better  corrosion because  it  inhibits  the  diffusion of  anything  through  this  layer and  stops  material  getting to  the  substrate, the  metal  substrate, and  causing  corrosion. Usually,  if  I  want to  make  flexibility  better, I  have  to  make this  continuous  layer  softer and  that  degrades  corrosion. But  with  this  type  of  morphology, I  was  able  to  keep my  hard  continuous  phase and  gain  flexibility  through a  separate  dispersed  rubbery  phase. This  meant  that  anything  that  wanted to  diffuse  through  the  coating and  cause  corrosion  was  always  having to  diffuse  through  this  high  TG  area. It's  given  me  the  combination of  good  corrosion and  good  flexibility  together. The  auxiliary  data  that  I  gathered was  really  responsible... The  analysis  to  that  was  responsible for  the  learning of  what  was  going  on  in  this  system. In  conclusion, it's  definitely  possible to  carry  out  successful  DOEs where  we  only  measure the  critical  responses,  the   big Ys. But  I  hope  I've  shown  that  including carefully  selected  auxiliary  responses, little Ys  can  often  be  really  valuable, can  bring  clarity  to  unexpected  results, and  it  can  help  us  to  build scientific  knowledge. I  hope  I've  also  shown  that  JMP  provides some  tools  that  really  help  us  with  this. I've  shown  a  couple,  but  there  are many  more  that  are  available. I'd  finally  like  to  finish  off by  thanking  the  many  associates at  PPG's  Coatings  and  Innovation  Center who  contributed  to  this  work.
More than any statistical software, JMP and JMP Pro make tremendous use of model selection criteria, such as the AICc and BIC. These tools can be used by practitioners in all industries and at all skill levels, from users choosing a distribution for a capability analysis, to advanced users choosing input factors in a linear mixed model or a functional DOE analysis. Model selection criteria are incredibly flexible and powerful, yet make it easy to decide between very different sets of predictor variables, response distributions, and even correlation structures, all at the same time.

Unfortunately, the full story of how and when to use these criteria is not part of most standard data science courses in universities and professional training. One reason for this omission is that, unlike JMP, many software packages implement model selection criteria in an incomplete or arguably incorrect way, making it impossible to compare models with different input variables.

In this presentation, we give clear guidance on how and when to use model selection criteria. We describe their motivation and the assumptions they require. We compare model selection criteria to other better-known approaches to selecting models, such as hypothesis tests and holdout-based cross-validation procedures. We also give a brief story of how JMP Statistical R&D developers came to appreciate how useful these tools are, as we sought a general solution to the distribution dredging problem.

The most famous quote in all of statistics is George Box's "All models are wrong, but some are useful." I've heard this quote at almost every conference I've ever been to, and because of this, to my recollection, I've actually avoided using this quote in any talk before. But when I looked up the first time it was ever set in print, it was in a 1976 Journal of the American Statistical Association article. It's found in a section called Parsimony. Immediately after that first instance of the quote, he talks about the importance of finding the simplest model that describes the observed phenomena. This amounts to finding models that offer a reasonable balance of goodness-of-fit versus model complexity, and it is exactly what I'm going to be talking about today in this presentation. JMP and JMP Pro offer a lot of different modeling capabilities, each with a lot of output related to choosing a model. Today I'm going to go into some detail on some of the most important of these, highlighting their motivation and the assumptions behind them. A lot of the discussion will be about the AICc and BIC model selection criteria, which are direct and very data-efficient tools for addressing the problem Box had in mind with his quote, which is how to find a useful model from a set of flawed or wrong ones. As I was putting this presentation together, I went through the derivations of the AIC and the BIC. I wanted to get a clear understanding of what these similar-looking methods really are and what assumptions they make. Afterwards, out of curiosity, I did an Internet search of AIC versus BIC versus cross-validation. It was interesting to see in all these Internet forums that there is still so much debate, even though these methods have been around for 50 years.
Having recently reviewed the derivations of the methods, it looks like there are still a lot of misconceptions out there. I think the reason for this is that both model selection criteria have very deep and technical derivations despite the simplicity of their formulas: both of them are equal to minus two times the log likelihood of the fitted model, plus a simple penalty based on the number of model parameters. You can't guess the reasons for the penalty terms from the formulas alone, which makes them seem mystical and arbitrary. One of my goals today is to try to demystify these without going overboard on the math. To put this all in the context of an analysis workflow, we can think of an analysis project as having four major steps. We first have to acquire the data and get it organized and cleaned up. Then we fit several models to it in a way that is either manual or automated by software like JMP or JMP Pro. Once we've done that, we need to choose one of them as the model that we're going to work with moving forward. This is a critical step in the process and the one we'll be focusing on today. It's important that we get the model selection right, because the quality of the results and the conclusions we make at the end requires that we have a reasonably good model. Here are the main ways that I've seen people make decisions about models. Hypothesis testing is probably the first one people learn about. These tests are most commonly used to determine if a regression coefficient is statistically significantly different from zero, which sounds like a model selection problem. While they are often used in that way, hypothesis tests are derived under a specific set of assumptions that explicitly does not account for having changed the model or having used a model that was chosen as the best amongst several alternatives. Then we have the general empirical procedures that assess models based on data held out from the model fitting process. These techniques can be applied to both classical statistical models and machine learning models. In my opinion, holdout validation in particular is the way to go if you have a whole lot of data. Then we have what I'll call the small-data analytical procedures. These were derived for situations when you have to make a decision about which model to use, but you don't have enough data to hold out any observations. The most commonly used of these are the AIC and the BIC, but there are other well-known techniques like Generalized Cross-Validation and Mallows' Cp. It turns out that these two are actually asymptotically equivalent to the AIC, so in large samples you should get the same conclusions from GCV, Mallows' Cp, and the AIC, at least for least squares-based models. Then we also have model-specific approaches, like VIP in partial least squares models and the cubic clustering criterion in clustering models. These are pretty niche and I won't be talking about them any more today. Then we also have visual tools like actual by predicted plots and ROC curves.
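Since the AICc and BIC will come up constantly from here on, it helps to have the two formulas in front of us. With $\hat{L}$ the maximized likelihood, $K$ the number of estimated parameters (including any variance parameters), and $n$ the number of observations:

$$\mathrm{AIC} \;=\; -2\log\hat{L} + 2K, \qquad \mathrm{BIC} \;=\; -2\log\hat{L} + K\log n.$$

Smaller is better for both.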
Regardless of how you choose your model, visual plots like these are good to take a look at before moving forward, because they provide more interesting information than any individual statistic will, and they can tell us whether the best model we've considered so far is good enough for us to use. My own first encounter with model selection criteria in my professional life was back in the mid-2000s, around when JMP 5 and JMP 6 were out. JMP had added the ability to provide capability analyses for non-normal distributions. Capability analysis is a very important tool for assessing whether a manufacturing process is "capable" of delivering products that are within specification. JMP users wanted to determine the "best distribution" for the data so their process capability metrics would best reflect the reality of their situation. JMP customers understood that you could fit different distributions with JMP and knew that many of the distributions came with a goodness-of-fit test, a case of having a hammer causing you to find nails everywhere. They were trying all the distributions they could find and were choosing the one with the largest p-value as the distribution for their capability analysis. They wanted us to codify this into a new fit all distributions feature that would automate this process for them. But we were rather uncomfortable with this request for a number of reasons. For one thing, the different distributions fit in JMP came with different kinds of goodness-of-fit tests. The normal had a Shapiro-Wilk test, the Weibull had a Cramér–von Mises test, and the LogNormal had a Kolmogorov test. It's very strange to compare tests that are rather different from one another. Another problem with this approach is that distributions with more parameters are going to tend to have an edge on those with fewer. If we choose the distribution based on the largest p-value, it will always favor distributions with more parameters, as we see here with the two-parameter normal compared with the four-parameter Johnson Su distribution. Then for some of the distributions, like the Weibull's Cramér–von Mises W test, we only had table values of p-values going up to something like p = 0.25. But even if we consolidated all the goodness-of-fit tests down to just one and got accurate p-values for all of them, there's still a larger philosophical issue at stake, and that's that hypothesis tests like these can only quantify evidence against the null hypothesis. If the null hypothesis is true, then the p-value is a uniformly distributed random variable. In other words, if the null hypothesis is true, then the probability that the p-value is between 0.1 and 0.2 is exactly the same as the probability that it is between 0.8 and 0.9. Seeing a p-value of 0.9 isn't more evidence that the hypothesis is true than a p-value of 0.3. Returning to our example, all four of these distributions have goodness-of-fit p-values larger than 0.05, so through this lens all four of them fit the data reasonably well.
But even though the goodness-of-fit tests say all the distributions are acceptable, the conclusions about the process generating the data are different depending on the distribution. If you use a Ppk reference value of 1.33 to determine if the process is capable, then choosing the Weibull indicates that the process is not sufficiently capable to meet the specifications, whereas the other distributions indicate that the process is capable. We recognized that there had to be a better way to determine the distribution automatically, and came to the conclusion that this should be seen as a very basic kind of model selection problem. In our search for a sound method for choosing a distribution, we stumbled upon this very good book on model selection by Burnham and Anderson. They give careful derivations of the AIC from the perspectives of information theory and cross-validation. They also give a derivation of the BIC and show how the AIC can be derived in the same way with a different assumption about the prior distribution. Burnham and Anderson also carefully show that hypothesis testing is rather incoherent as a model selection strategy. The book had a pretty big impact on my own views of modeling and also on JMP's statistical modeling platforms. Returning to the distribution selection problem for the moment, when we went ahead and added a distribution selector, we ended up calling it Fit All, and we based it on the AICc. Here on the left, we have two distributions fit to the capability analysis data we were looking at before, the normal and the Johnson Su. The Johnson Su's goodness-of-fit p-value is larger than the normal's because it has two more parameters than the normal distribution. Now on the right, we see the results of a Fit All using the AICc. The normal comes out as the best-fitting distribution, while the Johnson Su is near the bottom. This is because the AICc is penalizing it for having those two extra parameters. This feature has now been used many, many times and I believe people are generally pretty happy with it. Now I'm going to go through a somewhat mathy but hopefully accessible explanation of what the AICc really is. All right. Now I'm going to go into some basic theory behind the AIC. I'll be as brief as possible and use the best analogies I can, but I think it is important to be exposed to the underlying concepts so you can see that the AIC has a rigorous foundation that has some sense to it. The AIC-type selection criteria are based on a distance-type metric between probability distributions called the Kullback-Leibler or KL divergence. It quantifies the amount of information lost by using probability distribution two when probability distribution one is the correct one. The KL divergence has the property of always being greater than or equal to zero, and it is only equal to zero when the two probability distributions are the same. This is to say that using the wrong distribution always leads to a theoretically quantifiable, strictly positive information loss.
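In symbols, for a true density $f$ and a model density $g$, the KL divergence just described is

$$D_{\mathrm{KL}}(f\,\|\,g) \;=\; \int f(x)\,\log\frac{f(x)}{g(x)}\,dx \;\ge\; 0,$$

with equality only when $f$ and $g$ are the same distribution.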
This is pretty heady, abstract stuff, so I'm going to translate it into the language of statistical modeling. When we use data in statistics to learn about how something works, we are explicitly or implicitly fitting probability models to the data to approximate the true model that generated it. If we knew the true probability-generating mechanism, we could use the KL divergence to quantify how far, or how wrong, the model is from the truth. We could then try several models and find the one that is the closest to the truth. Akaike recognized this, plugged the true and the model probability formulas into the KL divergence formula, and used a little algebra to see that the KL divergence had two terms. The first term only contains the true probability-generating mechanism for the data, which we can never know since we can only work with models. However, this is a constant that is the same for all models that you fit to the data, as long as we play by a couple of simple rules. The second term is what Akaike discovered is empirically estimable, and with a lot of math he found a simple formula to estimate this second term. In particular, he discovered that two times the KL divergence is equal to a constant that is the same for all models, plus two times the negative log likelihood of the data used to fit the model, plus two times the number of parameters. Everything has been multiplied by a factor of two just to follow the same convention as a likelihood ratio test. Since the constant term is the same for all models as long as we don't change the response data, we can fit several models, and the one whose AIC is the smallest is the one that is estimated to have the smallest KL divergence from the truth, and in a sense is the one that is the least wrong. Using the AIC for model selection is entirely analogous to there being a collection of islands and you want to know which of the islands you know of is closest to another island that you know you'll never be able to get to. The direct solution to this problem would be to calculate the distances from each of the islands to the one that we want to get close to. Now, what if the island we wanted to get close to was surrounded by a circular high fence that we could approach? The island is perfectly in the middle of the fence, so the distance from the center of the island to the fence is always the same. But the fence was far enough away from the island it enclosed that we couldn't see the island or measure the distance from the fence to it. We can still estimate the distance from each island to the fence, and because the main island is in the center of the fence, we know that the island closest to the fence is the closest island. This is exactly the situation with the AIC. With the AIC, we can estimate the distance from the truth to each of the models. Each AIC estimate is off by the same amount. While we can't estimate the absolute distance of the models from the truth, we can know which model is the closest in a relative sense.
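Written out, and glossing over the regularity conditions, the decomposition just described is

$$D_{\mathrm{KL}}(f\,\|\,g) \;=\; \underbrace{\mathbb{E}_f\!\left[\log f(X)\right]}_{\text{same constant for every model}} \;-\; \mathbb{E}_f\!\left[\log g(X)\right],$$

and Akaike's result is that, for a fitted model with $K$ estimated parameters, the estimable piece satisfies roughly

$$-2\,\widehat{\mathbb{E}_f\!\left[\log g(X;\hat\theta)\right]} \;\approx\; -2\log\hat{L} + 2K \;=\; \mathrm{AIC},$$

so ranking models by AIC ranks them by estimated KL distance from the truth, up to that shared constant.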
The original AIC is based on the likelihood of the training data plus a parameter penalty. The training likelihood assesses the goodness of fit of the model. We can't use this term by itself, though, because it is biased downward, as the model parameters were chosen to minimize the negative log likelihood. With a lot of math, Akaike derived a very simple expression that corrects for this bias. The original penalty is just 2K, where K is the total number of estimated parameters. For linear regression with a slope and an intercept, we also have to count the variance, so for that case you would have K equals three and not two. There are important assumptions that led to the 2K penalty. We can characterize them loosely as saying the model has to be reasonably good. The AIC is still going to be robust, however, because if a model is bad, then the likelihood component will be large and will dominate the penalty. Amongst the good models, the 2K term will favor the smaller models as long as the sample size is large. However, it didn't take long for people to find that this original AIC often chooses models that overfit in small samples, so a more accurate, higher-order approximation to the bias was derived. When this extra term is added, the criterion becomes known as the AICc, or the corrected AIC. Unfortunately, the reputation that the AIC overfits had become commonplace before the correction was discovered and widely known about. The correction becomes infinite as K approaches N, pushing the model selection criterion away from models that are nearly saturated. Notice also that the correction term goes to zero as N goes to infinity, so in large samples the AIC and AICc are equivalent. The AICc is what we report in JMP because it works well for small samples, and although it was derived for Gaussian distributions, experience suggests that it's good enough with other commonly used distributions. Now I'm going to illustrate the AICc on a real example, a five-factor central composite design with 31 runs, where the response was the amount of pDNA produced by a bioreactor. I'll illustrate the AICc using the Generalized Regression platform, giving it a full response surface model with all main effects, interactions, and second-order terms. I fit four models to the data. One is a full response surface model using least squares that was fit automatically. Then I used forward selection under the normal, logNormal, and exponential distributions. I chose the exponential distribution to illustrate poor model fit. The models had 22, 9, 9, and 1 parameters respectively, and the model with the lowest AICc was the logNormal, with an AICc of about 334.8. We can break the AIC and AICc calculations down to see how different parts of the penalty are contributing. The full least squares model has the lowest likelihood, but the highest AICc overall. When we look at the second-order corrections and the original AIC values, we see that it's the second-order correction term that is pushing the model selection criterion to be very large for this model.
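For reference, the corrected criterion behind the numbers in this breakdown, and in the comparisons that follow, is

$$\mathrm{AICc} \;=\; -2\log\hat{L} \;+\; 2K \;+\; \frac{2K(K+1)}{n-K-1},$$

where the last term blows up as $K$ approaches $n-1$ (which is what pushes the 22-parameter full model so high at $n = 31$) and vanishes as $n \to \infty$.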
The logNormal forward selection log likelihood is a little lower than the normal forward selection one. They both have nine parameters, so their penalties are the same, and the logNormal forward selection model has the lower AICc. The exponential forward selection model has the poorest model fit as measured by the log likelihood, but it also only has one parameter in the model, so overall it has the smallest penalty contribution to the AICc. But the poor fit of the model is such that the likelihood dominates, and the exponential model is the second from the worst as measured by the AICc. If you review the general derivation of the AIC in the Burnham and Anderson book, you'll see that what it actually estimates is the expected value of a hypothetical test set likelihood for a data set that has the same size and response structure, but not the same values, as the training set. The expected values also take into consideration the variability in the estimate of the MLE. I find this cross-validation interpretation of the AIC to be pretty compelling. I think it's also important to point out that this cross-validation derivation of the AIC does not assume at all that we have the correct model. To show that this cross-validation interpretation really works, I created a simulation formula using an average of the models I've shown in the previous slides as well as some other ones. This way we knew that none of the models was actually the correct one. I fit each of the four models to new training data a thousand times and set it up so that JMP would report an independent holdout likelihood using another new data set. I kept each of the four models' structures and distributions intact and did not apply variable selection. This was to perfectly mimic the exact cross-validation interpretation of the AIC. From there, I created a table of simulated holdout likelihoods and computed their average for each of the four models. This is the AIC and AICc summary table from before, with the simulation-based average holdout log likelihoods added over here to the right. You can see that the full normal model holdout likelihood is very close to its AICc value, and that the second-order correction term was essential for this match to happen. You can also see that the simulated average exponential holdout log likelihood is very close to its AICc. Both the normal and logNormal holdout likelihoods are close to the original logNormal model's AICc. The normal holdout likelihood is a little smaller. I attribute this to averaging a bunch of simulation models, making the simulated data a little bit more normally distributed than the original data was. There are a couple of simple rules that are needed to make AICc comparisons really valid between different models. The most important is that the stochastic part of the data has to stay the same: the same rows have to be used, and it is the Y's in particular that must be the same.
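Before finishing the list of rules, here is a rough sketch, outside JMP, of the kind of holdout-likelihood simulation just described. The true generator, the candidate model, and the sample size are all stand-ins rather than the actual pDNA setup; the point is only that the average AICc and the average -2 times the holdout log likelihood should land close together when the candidate model is not too far off.

```python
# Sketch only: simulating the cross-validation reading of the AICc.
# A normal model is repeatedly fit by maximum likelihood to training data drawn
# from a generator that is NOT normal (a mildly skewed gamma), then scored on an
# independent holdout set of the same size.
import numpy as np
from scipy.stats import gamma, norm

rng = np.random.default_rng(7)
n, reps, K = 31, 2000, 2          # K = 2 parameters for a plain normal (mean, sd)

aicc_vals, holdout_vals = [], []
for _ in range(reps):
    train = gamma.rvs(a=20, scale=1.0, size=n, random_state=rng)
    test = gamma.rvs(a=20, scale=1.0, size=n, random_state=rng)

    mu, sd = train.mean(), train.std(ddof=0)            # normal MLEs
    ll_train = norm.logpdf(train, mu, sd).sum()
    aicc = -2 * ll_train + 2 * K + 2 * K * (K + 1) / (n - K - 1)
    aicc_vals.append(aicc)
    holdout_vals.append(-2 * norm.logpdf(test, mu, sd).sum())

print("mean AICc:              ", np.mean(aicc_vals))
print("mean -2*holdout loglik: ", np.mean(holdout_vals))
```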
Continuing with the rules, the X's can be different, of course, even if they were originally random. Not only must the Y's be the same, they also can't be changed or transformed; a transform would have to be built into the model appropriately. The AIC is also only defined for well-behaved maximum likelihood estimators and other closely related methods. This explains why you don't see the AICc for neural networks and other machine learning models. Also, you have to keep in mind that just because you found a model that the AICc says is the best, it doesn't mean that it is a good model. Use your past experience and model diagnostic plots to ensure that the model is right enough to be useful. Returning to the pDNA data, we see two equivalent models. On the top, we have a logNormal model, and on the bottom we have a normal fit to the log-transformed response. You can see that the generalized RSquares are the same for these two models, but the AICcs are very different. This is because the logNormal fit implicitly builds the transform into the likelihood, but the log-scale normal fit does not. In this case, the right thing to use is the logNormal. Here's a quick demonstration that you have to decide the distribution and the input variables at the same time. Here is simulated data from a t-test type model: two groups of normally distributed data with the same variance but different means. If you run Fit All in the Distribution platform, it chooses the Normal 2 Mixture with an AICc of 1036. This is the correct distribution if you don't know the group identity of the rows. Once you include the grouping variable, though, you see that the normal comes out on top with an AICc of 717 or so. We also tried the Weibull, logNormal, and gamma, and the normal still came out on top, even though those distributions did better in Distribution without including the grouping variable. You have to try different model structures and distributions together to find the right combination. Now I'm going to change gears and talk a little bit about the BIC, which is the other main analytical model selection criterion in JMP. The BIC is motivated in a completely different way than the AIC. Schwarz used a large-sample argument in a Bayesian context to approximate the log probability of the data after having integrated the model out. Assuming a flat prior on the parameters, an expression similar to the AIC pops out, with a K log n type penalty term rather than two times K. There were also other terms in the integral that are always ignored. One is K log 2 pi, which was considered too small to deal with, and the other one is a normalized variance of the MLE, which would also be of order K. I didn't study the AIC or BIC in any depth in school. I just remember hearing the refrain "AIC overfits, BIC underfits" several times in different classes, which I interpreted as a strong skepticism about both of them. Comparing the AICc and BIC penalties, we see that the AICc will prevent big models from being chosen when the sample size is small, whereas the BIC will still allow large models.
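To make that comparison concrete at the pDNA sample size of $n = 31$ (this is just arithmetic on the two penalty formulas, not additional output from the analysis):

$$\text{AICc penalty} \;=\; 2K + \frac{2K(K+1)}{n-K-1}, \qquad \text{BIC penalty} \;=\; K\log n.$$

For $K = 22$, the AICc penalty is $44 + 1012/8 = 170.5$ versus a BIC penalty of $22\log 31 \approx 75.6$; for $K = 9$, the AICc penalty is $18 + 180/21 \approx 26.6$ versus $9\log 31 \approx 30.9$. In this small sample the AICc punishes the big model far harder, while for moderate $K$ the two penalties are comparable.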
I see the K log n and normalization-constant penalty in the BIC as somewhat less compelling than the cross-validation interpretation of the AIC-type penalties. Something that leads to a marginal probability of the data is more abstract to me than something that is directly interpretable as a cross-validation metric taking into account parameter uncertainty. I'm fully aware that I'm editorializing here, but this is what has worked well for me so far. Returning to the pDNA DOE one more time, here are the same models fit in the pDNA example using BIC forward selection on top and the AICc on the bottom. Notice that the BIC of the full normal model is not as far away from the other models as with the AICc. The best model overall as rated by the BIC is a logNormal, but with 13 parameters this time around rather than nine. The forward selected BIC normal model also has a couple more parameters. So in small samples, contrary to the refrain "AIC overfits, BIC underfits," the AICc can choose smaller models than the BIC. Here we see the effects chosen by the BIC and the AICc. The set of BIC-selected effects is a superset of the ones chosen by the AICc. Also notice, interestingly, that all four effects not chosen by the AICc are statistically significant under the BIC. Under the BIC, the pH squared term is highly significant, but it isn't present in the AICc model, for example. I'm not going to say that these effects shouldn't have asterisks by them, since all significant p-values get asterisks in JMP reports; instead, I'll just say that I take p-values of effects after selection with a grain of salt. Although the two models choose different effects, some of them highly statistically significant, if we look at the profiler variable importance from these two models, they tell a very similar story. Feed rate is by far the most important, and after that the ordering is the same between the two models. pH only impacts 3% of the variation in the response surface under the BIC best model and isn't included at all in the AICc best model. This is a very clear example of statistical significance and practical relevance being two different things. There are a lot of opinions out there about the AICc and the BIC. For example, Burnham and Anderson say that both methods are consistent for the quasi-true model as N goes to infinity, but then there are others that say the BIC is the only one consistent for the truth. Burnham and Anderson say that you can set up simulations to make one look good, then change the way it's set up a little bit and it'll flip the results. Burnham and Anderson, who are about the most diehard AICc fans out there, found in their simulations that the AICc chooses fewer really bad models than the BIC. I think it's not a bad idea to look at both the BIC and the AICc after applying variable selection.
If the best models under both are pretty much the same, which is often the case, you can feel pretty good about either of them. If they're different, it's good to think about the reasons why and use your subject matter expertise to help make a decision. My last topic is model selection criteria in linear mixed models. This is a pretty complicated situation, especially because there isn't consensus between software vendors on how to compute the model selection criteria. To illustrate this, I created a split plot design with four factors. There are two whole plot effects and two split plot effects. If you take the same data and fit the same model in JMP Pro and SAS, using Fit Mixed and PROC MIXED, you will see that the likelihoods and model selection criteria don't match, but the variance estimates do. You get different fixed effects parameter estimates, but the fixed effects tests agree. One of the reasons for this is that the JMP and SAS fixed effects design matrices use a different coding strategy for categorical effects. On the left I have the JMP design matrix for the split plot example, and on the right you see the SAS one. JMP creates a row of minus ones for the last level of categorical effects, which is seen in blue here, whereas SAS creates a row of zeros. Neither one of these is right or wrong. It's like changing units or changing coordinate systems. JMP categorical effects sum to zero, whereas SAS categorical effects can be interpreted as differences from the last level. Although the raw parameter estimates differ, predictions will be the same between the two codings because the models are fundamentally equivalent. Most things that matter won't be different between the two software products. However, REML, the method used to estimate mixed effects models, has an ambiguity in it. The base Gaussian likelihood at the top will be the same in either software because it's a real likelihood. But the REML or residual likelihood reported by PROC MIXED and JMP Pro's Fit Mixed isn't a real likelihood. If it were a real likelihood, then we would get the same values regardless of which coding or software we used. This is because there's an extra penalty added to the Gaussian likelihood for REML that reduces the bias of the variance estimates, but this penalty depends on the design matrix in a way that is sensitive to the coding used. JMP reports the raw Gaussian likelihood, and the AICc and BIC that it reports are based on that rather than the residual likelihood. The number of parameters Fit Mixed counts is the total, including both fixed effects and variance parameters. We did it this way so that you can use JMP to compare models with different fixed-effect structures as well as different variance models. SAS only reports the residual or REML log likelihood, and it reports model selection criteria based on it. You can see here that it also only counts variance parameters, because the difference between the SAS likelihood and its AIC is four, implying two parameters, a variance component and a residual.
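For reference, one common way of writing the REML criterion (a sketch of the usual textbook form, not necessarily the exact expression either product prints) is

$$-2\,\ell_{\mathrm{REML}}(\theta) \;=\; \log\lvert V\rvert \;+\; \log\lvert X^{\mathsf T}V^{-1}X\rvert \;+\; r^{\mathsf T}V^{-1}r \;+\; (n-p)\log 2\pi, \qquad r = y - X\hat\beta,$$

and it is the $\log\lvert X^{\mathsf T}V^{-1}X\rvert$ term that drags the fixed-effects design matrix, and therefore the coding convention, into the reported "likelihood" value.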
All this means is that you can only use PROC MIXED for comparing variance models with the AIC, because its model selection criteria include the REML penalty and it only counts variance parameters. With all due respect, I can think of some good reasons for the SAS approach, and there are probably other good reasons I don't even know of, but I personally prefer the flexibility afforded by the JMP approach. To summarize, if you compare results across software for non-mixed models, the mean parameter estimates may differ, but otherwise everything else should be the same, as long as the software computes the constants and the likelihood correctly, as JMP does. When we get to Gaussian mixed models, there are very important software differences, and the scope of the decisions you can make about the models using the software may be very different depending on the details of how its likelihood is calculated. JMP model selection criteria are comparable both within the same platform and across other modeling platforms. I'll close with this slide, which gives my basic recommendations for applying the tools discussed today. Hypothesis testing is a tool for when you need to prove something and is best used in situations when you have a good idea of the model structure in advance. When you're working on a problem in industry and the sample size is small, I would stick to classical statistical models and use the AICc as the primary tool for choosing between them. With larger data sets, when I have enough data to hold out at least a third of the observations, I use holdout cross-validation to compare classical statistical models as well as machine learning models. In my own work, I tend to avoid K-fold cross-validation and its variants. The model selection criteria are equivalent to it in larger samples, and I tend to stick with simpler models with smaller data sets. I know that not everyone is going to agree with me on this, but this is what works for me and is a pretty safe way to approach model selection. Choosing the most useful model from a set of alternatives that must all be wrong on some level is an important decision, and these are the main considerations I have when deciding upon a model selection strategy. Thank you for your attention, and I look forward to talking with you in the Meet the Expert sessions.
Quality by Design (QbD) is a systematic approach for building quality into a product. The Design Space Profiler, new within the Prediction Profiler in JMP 17, helps solve the fundamental QbD problem of determining an optimal operating region that assures quality as defined by specifications associated with critical quality attributes (CQAs) while still maintaining flexibility in production.    We explain JMP’s approach to solving this difficult problem and demonstrate how to use the Design Space Profiler to find these optimal subregions of the design space. The Prediction Profiler in JMP has long been a powerful tool for visualizing and optimizing models; having the Design Space Profiler within the Prediction Profiler makes it an indispensable tool for high-quality product and process innovation.     Hello. My  name  is  Laura  Lancaster. I'm  a  statistical  developer   in  the  JMP  group, and  today  I'm  here  to  talk  about finding  optimal  operating  regions for  critical  quality  attributes with  the   design space profiler. The  first  thing  I  wanted  to  talk  about  is the  general  quality  paradigm   called  quality by design, also  commonly  referred   to  as  QbD  for  short. Quality by design  is  a  systematic  approach for  incorporating  quality into  the  entire  product  lifecycle   beginning  at  the  design  phase. It  was  first  introduced  by  Joseph  Juran, and  it  was  made  popular  in  his  book, Juran  on  Q uality by Design   way  back  in  1992. It  was  a  very  popular  book, and  a  few  years  after  it  was  published, the  FDA  and  the  International  Conference on  Harmonization,  also  referred  to  as  ICH, adopted  these   quality by design  principles for  the  development, manufacturing,  and  regulation  of  drugs. They  published  several  guidelines for  implementing   quality by design in  the  pharmaceutical  industry called  ICH  Q8- Q12,  and  we're  going   to  focus  on  ICH  Q8  guidelines. What  exactly  do  we  mean  by  design  space? Well,  this  is  a  very  important  concept in  the  pharmaceutical  industry for  quality  by  design, and  it's  defined  in  the  ICH  Q8( R2)  guidelines  as the  multidimensional  combination and  interaction of  material  attributes and  process  parameters   that  have  been  demonstrated to  provide  assurance  of  quality. Essentially,  the  design  space  is  what identifies  your  optimal  operating  region that's  going  to  give  you  maximal  flexibility  in  your  production while  still  assuring   that  you  get  a  quality  product. JMP's  new   design space profiler,   new  in  JMP  17, helps  us  find  this  design  space, this  optimal  operating  region. Now  there's  actually  several  steps that  need  to  be  taken  before  you  can  use the   design space profiler to  determine  your  design  space. These  are  outlined  in  the  Q8  guidelines, and  so  we're  going to  run  through  those  steps. The  first  step  that  you  want  to  take is  you  want  to  define   your  quality  target  product  profile, and  this  is  defined   as  a  prospective  summary of  the  quality  characteristics of  a  drug  product that  ideally  will  be  achieved   to  ensure  the  desired  quality, taking  into  account  safety and  efficacy  of  your  drug. Next,  you  want  to  determine  what  are the  critical  quality  attributes and  what  are  their  appropriate specification  limits. 
A critical quality attribute, also referred to as a CQA, is defined as a physical, chemical, biological, or microbiological property or characteristic that should be within an appropriate limit, range, or distribution to ensure the desired product quality. Once you've determined those, then you're going to design experiments to determine what the critical manufacturing process parameters are that affect these critical quality attributes. A critical process parameter, also referred to as a CPP for short, is a process parameter whose variability has an impact on a critical quality attribute and therefore should be monitored or controlled to ensure the process produces the desired quality product. Once you've determined all of that, then you're going to find a good prediction model for your critical quality attributes in terms of your critical process parameters. Once you've done that, then you can use the Design Space Profiler in JMP to determine your design space, your optimal operating region. Let's talk a little more specifically about the Design Space Profiler in JMP. First of all, as I mentioned, it's new in JMP 17, and it resides within the Prediction Profiler. Hopefully everyone is familiar with the Prediction Profiler in JMP. It's a wonderful tool for exploring, visualizing, and optimizing any model that you produce within JMP. The goal of the Design Space Profiler is to determine a good design space by finding the largest hyper-rectangle that fits into the acceptable region defined by your critical quality attribute spec limits applied to your prediction model. Once you have that hyper-rectangle, it gives you the lower and upper limits of your critical process parameters that define a good design space for maintaining a quality product. Now, I just wanted to mention very quickly that a design space does not have to be rectangular, but having a rectangular design space makes it really convenient for checking each factor one at a time. The problem with this approach is that the acceptable region is usually nonlinear, and finding the largest hyper-rectangle in a nonlinear region is a very difficult mathematical problem. How does the Design Space Profiler work, then? Well, instead of using the mathematical approach to finding the largest hyper-rectangle, it uses a simulated approach. It generates thousands of uniformly distributed points throughout the space defined by your initial critical process parameter factor limits. Then it uses the prediction model to simulate your responses for your critical quality attributes. Note that it's always best to include some error in your prediction model, because your prediction model is not perfect; it's not without error. Then, once you have that simulated set of data, you calculate the in-spec portion by counting, out of the points that lie within the space defined by the current CPP factor limit settings, your current design space, how many are in spec for all of the responses, all your critical quality attributes.
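To give a feel for that simulated approach, here is a rough Python sketch of the same idea, not JMP's actual implementation. The prediction functions, noise levels, factor ranges, and spec limits below are all made-up placeholders.

```python
# Sketch only: Monte Carlo flavor of an in-spec portion calculation.
import numpy as np

rng = np.random.default_rng(3)
N = 100_000

# 1) Uniform points over the initial critical-process-parameter ranges (placeholders)
emulsifier = rng.uniform(700, 900, N)
lipid      = rng.uniform(10, 30, N)
lecithin   = rng.uniform(1, 5, N)

# 2) Simulate each CQA from hypothetical prediction models plus model error
vesicle = 120 - 0.05 * emulsifier + 1.5 * lipid + rng.normal(0, 2.0, N)
release = 60 + 0.02 * emulsifier - 0.8 * lecithin + rng.normal(0, 1.3, N)

in_spec = (vesicle < 110) & (release > 70)          # placeholder spec limits

def in_spec_portion(lo_e, hi_e, lo_l, hi_l):
    """In-spec and volume portions within candidate limits on two of the factors."""
    inside = ((emulsifier >= lo_e) & (emulsifier <= hi_e) &
              (lipid >= lo_l) & (lipid <= hi_l))
    return in_spec[inside].mean(), inside.mean()     # (in-spec portion, volume portion)

print(in_spec_portion(700, 900, 10, 30))   # full ranges
print(in_spec_portion(705, 900, 10, 30))   # after tightening one lower limit
```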
Now  the  easiest  way  to  see  how  this  works is  to  go  to  an  example  and  go  to  JMP. That's  what  we're  going  to  do. We're  going  to  look  at  an  example for  a  pain  cream  study. The  goal  of  this  study  is  to  repurpose a  habit-forming  oral  opioid  drug into  a  topical  pain  cream  that  provides the  same  relief  as  the  oral  drug. The  first  thing  they  did  was  figure  out their  quality  target  product  profile, and  then  they  were  able  to  determine  what their  critical  quality  attributes  were. It  turns  out  there  were  three  of  them: entrapment  efficiency,  vesicle  size, and  in vitro  release, and  these  are  the  spec  limits that  give  a  quality  product   for  these  critical  quality  attributes. Next,  they  ran  experiments  to  determine what  are  the  process  parameters that  affect   these  critical  quality  attributes. It  turns  out  there  were  three   critical  process  parameters: emulsifier,  lipid,  and  lecithin. These  are  the  initial factor  limit  settings. Once  they  did  that, they  used  the  Custom  Designer  in  JMP   to  design  response  surface  model  designs. They  ran  the  experiment, and  then  they  used  Fit  Least  Squares to  fit  response  surface  models   for  the  three  critical  quality  attributes in  terms  of  the  three   critical  process  parameters. Once  they  did  all  of  that, now  we  can  go  to  the   design space profiler in  JMP  to  determine  the  design  space. Let's  go  to  JMP. This  is  my  data  table  that  was  created from  Custom  Designer  when  I  created a  design  for  the  response  surface  models of  my  three  critical  quality  attributes as  responses  in  my  three   critical  process  parameters  as  factors. It's  what  I  have  right  here. I  wanted  to  note   that  I  went  ahead  and  I  added the  critical  quality  attribute  spec  limits as  column  properties because  the   design space profiler has  to  know  what the  spec  limits  are  to  work. If  you  don't  add  them   as  column  properties, you'll  be  prompted  to  add  them when  you  launch  the   design space profiler. But  I've  already  done  that. I've  also  already  saved  the  script for  my  models  that  I  created using  Fit  Least  Square. I'm  going  to  go  ahead  and  run  that. It  automatically  launches Fit  Least  Squares. I  have  the  models  closed  and  just the  Prediction  Profiler  open. Notice  that  it  looks  like  the  Prediction Profiler  that  you're  used  to  seeing. I've  also  added  the  spec  limits as  reference  lines  just  so  I  can  see  them   in  relation  to  my  models. To  get  to  the   design space profiler, all  you  do  is  go to  the  Prediction  Profiler  menu, turn  it  on,  and  a few  down, you'll  see  there's  a  new  option called   design space profiler. If  I  click  on  that, the   design space profiler  will  appear right  below  the  Prediction  Profiler. Now  notice  that  because  I  had  added the  spec  limits  as  column  properties, it  automatically  brought  those  in. If  you  go  to  the  bottom  right-hand  side, you'll  see  where  it's  brought   in  my  spec  limits for  my  three  responses   or  three  critical  quality  attributes. You'll  also  notice   that  it's  brought  in  some  error, which  it's  going  to  use  as  my  error   for  my  prediction  model. This  has  come  in  from  the  Least  Square model's  root  mean  squared  error. 
For example, up here, if I go to my in-vitro release model, you can see the root mean square error is 1.2972, and that's the same value that's listed down here as the error standard deviation. Now, you can change these if you think they are not accurate, too big, or whatever. You can even completely remove them and have no error, but we highly recommend that you do add some error to your models. Okay, there are several things that you might notice are different about this Design Space Profiler compared to other profilers. One of the first things you'll notice is that over here on the Y-axis, instead of having a value like you normally do, like up here in the Prediction Profiler, it just says in-spec portion. The actual value for the Y-axis is actually over here to the right. My in-spec portion for all three critical quality attributes, when I have the initial factor limits set at the full range, is 71.2%. Then you can also see over here that it says the volume portion is 100%. That's because I'm using 100% of the entire simulated data set starting out, because everything's at the full factor range. Another thing that's different is that for each factor cell, you'll see that there are two curves instead of the usual one curve. That's because we're looking for factor limits, so a lower and an upper limit for each factor. The blue curve represents the in-spec portion as the lower limit changes. There's a handy legend over here to help me. The red curve represents the in-spec portion as the upper limit changes. I can interact with this profiler to change my factor limit settings, or my design space. I can move these markers. I obviously want to move this to get a higher in-spec portion, and so you would want to find a curve that has an upward slope. I can move this one inward by dragging this marker. I could also just enter values down here below the cells, or I could enter values here in these fields next to the factors. Or there's also another way, which is to use these buttons, move inward and move outward. This move inward button, if I click on it, is going to make the move that gives the largest increase in in-spec portion. It's going to look for the curve with the steepest path. If I click on move outward, if I were trying to increase the size of my design space, it would give me the least decrease; it would look for the least steep path downward. Let me go ahead and click move inward. I notice when I click that, the lower limit for emulsifier went from 700 to 705, and the in-spec portion went up. Now, this is the in-spec portion for all three critical quality attributes, but if I want to look at them individually, they're to the far right-hand side of each response, each critical quality attribute. I can see the in-spec portion for each one individually, but up here it's finding it jointly for all of them. Let's click move inward again and notice that emulsifier went in again, and now it's up to 710 and my in-spec portion is up to 78%.
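The move inward button described here behaves like a greedy search: among the candidate one-step moves, it applies the one that raises the in-spec portion the most. The sketch below illustrates that idea only; it is not JMP's implementation, and the factor names, step sizes, and toy objective are placeholders (in practice the objective would be the simulated in-spec portion from the earlier sketch).

```python
# Sketch of a greedy "move inward" step: try raising each lower limit or
# lowering each upper limit by one step and keep the move with the best
# objective value. `objective` stands in for the simulated in-spec portion.
def move_inward(limits, objective, steps):
    best_limits, best_val = None, -float("inf")
    for factor, (lo, hi) in limits.items():
        step = steps[factor]
        for new_lo, new_hi in [(lo + step, hi), (lo, hi - step)]:
            if new_lo >= new_hi:
                continue                      # limits would cross; skip this move
            candidate = dict(limits, **{factor: (new_lo, new_hi)})
            val = objective(candidate)
            if val > best_val:
                best_limits, best_val = candidate, val
    return best_limits, best_val

# Toy usage; a real objective would recount the in-spec portion for the
# candidate limits rather than use this placeholder.
limits = {"emulsifier": (700, 900), "lipid": (2, 6)}
steps = {"emulsifier": 5, "lipid": 0.1}
toy_objective = lambda lim: -(lim["emulsifier"][1] - lim["emulsifier"][0])
print(move_inward(limits, toy_objective, steps))
```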
The volume portion now is down to 89.79%. Click it again. Now my lecithin lower limit went up. My goal is to see if I can get my in-spec portion all the way up to 100%, if I can still maintain factor limits that seem reasonable and realistic. But before I do that, there are several options in the menu of the Design Space Profiler that I really like to use while I'm doing this. The first one is that I like to turn on make and connect random table. This is going to create a new table of random data. It's going to do 10,000 points by default; I'm going to leave that. I'm going to add random noise. It's the same random noise added in the Design Space Profiler, based on these error standard deviations. Click OK, and I get a new set of 10,000 data points, uniformly distributed throughout the factor space. These data points are color coded. The green points are in-spec for all of my critical quality attributes, and the red are out. Anything that's selected is still within my design space, my current factor limit settings. What I really like about this table is the graphs that are created by the scripts that are automatically saved to the table. I really like to look at the scatterplot matrix Y. Let me turn that on. This gives me a view of my data with all combinations of my response variables, my critical quality attributes. It has the spec limits drawn on, it has the in-spec region shaded green, and all the red points are out of spec while the green points are in-spec. I also like to look at the factor space, and I can do that by looking at the scatterplot matrix X. This is going to give me my factor space, my critical process parameter space. Once again, same color coding, and what's shaded is within the current factor settings. Let me see if I can situate this so we can look at both of these while I'm adjusting my factor limits over here. Okay, you can see as I move inward how the shaded area is shrinking for my factor space, my design space, and the number of out-of-spec points is also shrinking. Now, I also really like to turn on the connect/hide mode, and what that does is hide any points that are no longer within my current factor limit settings, my current design space. Now if I keep clicking move inward, you can see how the red points are starting to disappear. One other option you can use, if you prefer, is to look only at the points that are in-spec, or only at the points that are out of spec. I'm going to turn Y out of spec on to look only at the points that are still out of spec. I'm going to see if I can hit 100%. Move inward. I'm going to keep going. Still looks good. Keep going. Okay, now I've hit 100%, and I think this design space, these factor limits, do look like they're probably reasonable. You might notice that I still have a red point over here because this is a separate set of simulated data, but I'm not worried about that one random point. Okay, let's examine this design space that I have set here.
What  I  want  to  do  is  I  want  to  send the  midpoint  of  this  design  space back  to  the  Prediction  Profiler to  see  what  that  looks  like. I  can  easily  do  that  with  this  option that  says,  "Send  midpoints  to  profiler." When  I  do  that,  it  automatically  sends the  midpoint  of  this  current  design  space back  to  the  profiler so  I  can  look  at  it. I'm  going  to  turn  on  the  desirability, and  I  think  that  looks  quite  good. I'm  going  to  go  ahead and  save  this  setting. I  want  to  do  that  just   so  I  can  compare  this  against the  optimal  setting  if  I  were  to  optimize by  maximizing  all   of  my  critical  quality  attributes, which  is  something  you  may or   not  want  to  compare  against. But  it's  very  easy  to  do   and  I  can  look  at  the  difference. You  can  see  there's  not  a  huge  difference in  the  desirability  between  the  center of  my  design  space  and  the  optimal  value if  I  maximized  all   my  critical  quality  attributes. I'm  pretty  good  with  this  center   of  this  design  space. I'm  going  to  turn  it  back  to  that  point. Another  thing  that  you  can  do  is... Well,  I  don't  really  believe... The   design space profiler uses  uniformly  distributed  points, but  I  think  that  my  critical  process parameters  actually  follow a  normal  distribution. It's  very  easy  to  send  these  limits   back  to  the  profiler and  back  to  the  simulator so  I  can  simulate  using   the  normal  distributions. If  I  click  Send  Limits  to  Simulator and  choose  normal with  limits  at  three  sigma, what  it's  going  to  do  is  it's  going to  send  the  limits  back   to  the  simulator  and  figure  out what  my  standard  deviation  would  be  if that  design  space  was  set  at  three  sigma. Of  course,  you  can  change  these  values, you  can  change  the  distribution. I'm  going  to  use  these  settings   and  see  when  I  simulate what  my  defect  rate  looks  like. Looks  like  it's  at  zero  every  time I'm  clicking  this,  which  is  great. I  also  want  to  see   what  happens  way  out  at  the  tails. I  can  easily  do  that  by  using the  normal  weighted  distribution. Let  me  turn  that  on  for  each  of  these. This  is  a  way  to  check what  happens  to  the  tails. Now  when  I  click  simulate, it's  not  quite  zero  because  I'm  testing way  at  the  tails,  but  it's  still  very  low. I'm  very  happy  with  this. I  also  like  to  run  a  sanity  check for  capability, and  I  can  easily  do  that  by  using the  Simulate  to  Table  option, which  is  going  to  simulate  using the  normal  distribution for  my  critical  process  parameters and  the  prediction  models  with  the  error, the  same  error  I  used  before. I  click  Make  Table, I  get  a  simulated  table and  it  has  a  save  distribution  script  that  if  I  run, will  automatically  open  up  capability because  I  saved  my  spec  limits   as  column  properties. When  I  check  these  capability  reports for  the  three  critical  quality  attributes, it  looks  very  good. Of  course,  this  is  simulated  data, so  you  want  to  check  it  on  real  data, but  I'm  pretty  happy  with  what  I'm  seeing about  my  design  space  right  now. 
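The "normal with limits at three sigma" option described above amounts to treating the design-space limits as the mean plus or minus three standard deviations, so the standard deviation is one sixth of the limit width. Below is a hedged one-factor sketch of that calculation and a simulated defect rate; the limits, toy model, error size, and spec limits are hypothetical, not the values from the demonstration.

```python
# Sketch of the "normal with limits at three sigma" idea: treat the design-space
# limits as mean +/- 3 sigma, simulate the factor from that normal distribution,
# and estimate a defect rate against the CQA spec limits. All values are made up.
import numpy as np

rng = np.random.default_rng(7)
lower, upper = 705.0, 860.0                        # hypothetical design-space limits for one factor
mid, sigma = (lower + upper) / 2, (upper - lower) / 6

x = rng.normal(mid, sigma, 100_000)                # simulated critical process parameter
y = 0.13 * x + 1.0 + rng.normal(0, 1.3, x.size)    # toy prediction model plus model error
defect_rate = np.mean((y < 95) | (y > 110))        # hypothetical spec limits
print(f"simulated defect rate: {defect_rate:.4%}")
```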
I  want  to  save  this  design  space, and  you  can  easily  do  that  by  going to  the   design space profiler  menu and  checking  Save  X  Spec  Limits, and  this  will  save these  factor  limit  settings  back to  the  original  data  table as  spec  limits  in  your  critical   process  parameter  columns. I  want  to  click  that  and  I  go back  to  my  original   data  table  just  back  here. Close  these. Okay,  there  it  is. When  I  get  back  to  my  original  data  table, you  can  see  that  these  spec  limits have  been  saved  to  my  critical process  parameter  columns. This  is  a  great  way  to  save  this information  if  I  save  the  data  table. It's  also  nice  because  if  I  do  that and  I  save  my  predictions  as  formulas, I  can  look  at  the  design  space in  terms  of  the  contour  profiles. I  can  do  that  by  using the  graph  contour  profiler. I've  already  gone  ahead and  saved  a  script  for  this. I'm  going  to  run  it  and  I  have  it  set  up to  show  the  contour  profilers in  terms  of  all of  my  critical  process  parameters. You  can  see  this  faint  rectangle  is my  design  space and  the  shaded  contours or  the  contours  are  my  spec  limits. You  can  see  how   my  design  space  is  well  within... It's  well  within  my  spec  limit  contours. I  have  that  nice  buffer   because  I  added  the  error, so  I  can  look  at  that  in  terms of  all  of  my  critical  process  parameters. I'm  very  happy  with  this  design  space. Let's  go  ahead  and  look at  a  different  example. Okay,   this  is  an  example that's  outside  of  the  realm of  the  pharmaceutical  industry. It's  a  polymer  manufacturing  study, and  the  goal  of  this  study  is  to  improve the  quality  of  their  white  polymer. I  wanted  to  use  an  example  to  show that  these  methods  can  be generalized  in  any  industry. It  doesn't  have  to  be  pharmaceutical. This  example  was  inspired by  an  example  that's  in  the   Visual  Six Sigma  Second  Edition  book  that  uses  JMP. It's  a  great  book. It's  in  my  references. I  highly  recommend  you  check  it  out if  you  haven't  ever  looked  at  it. They  want  to  improve  their  white  polymer and  they  figured  out  that   they  had  two  critical  quality  attributes: melt  flow  index  and  color  index. These  are  the  spec  limits  which  will ensure  quality  for  the  white  polymer. They  ran  experiments  to  figure  out which  process  parameters affected  these  critical   quality  attributes. There  were  three  of  them, Amps  for  slurry  tank  stirrer, viscosity  modifier percent ,   and  percent  of  filler. These  were  the  initial factor  limit  settings. Another  thing   that  was  different  about  the  study is  that  they  used  historical  data to  find  prediction  models for  their  critical  quality  attributes in  terms  of  their  critical process  parameters. They  used  two  different  platforms and  two  different  types  of  models for  these  critical  quality  attributes. They  used  generalized  regression  platform and  the  Lasso  method  to  find   a  model  for  the  melt flow  index, and  they  used  the  fit  neural  platform to  find  a  neural  model   for  the  color  index. Because  they've  used   two  different  platforms and  two  different  types  of  models, they  need  to  use  the  profiler  platform that's  underneath the  graph  menu  in  JMP. 
To  use  that,  you  need  to  save your  models  as  formula  columns. I  want  to  show  you  how  you  can  still  use the  Prediction  Profiler  platform to  use  the   design space profiler. Let's  go  back  to  JMP. I'm  going  to  open  up  the   polymer  data. This  is  historical  data. I've  got  my  two   critical  quality  attribute  columns, my  three  critical process  parameter  columns. I've  gone  ahead  and  saved the  spec  limits  as  column  properties, and  these  are  my  two  models. This  is  the  model  I  created with  generalized  regression. This  is  the  model  I  created  with  neural. I've  gone  ahead  and  saved both  of  these  models   to  the  table  as  formulas. I've  also  saved the  Prediction  Profiler  platform  script, which  I'm  going  to  run, and  you  can  see  this  is  my  Prediction Profiler  in  terms  of  these  two  models. The   design space profiler,  once  again, I  go  to  the  Prediction  Profiler  menu, turn  on  the   design space profiler. It's  slightly  slower  here  just  because  I'm using  formulas  and  I  have  a  neural  model. But  once  again,   it  opens  up  and  it  shows  me  that with  the  initial  factor  limit  settings, my  in-spec  portion  is  about  65 %. It  looks  like  what  I'm  going  to  want to  do  is  I'm  going  to  want  to  decrease the  upper  limit  of  my  XF  factor critical  process  parameter. Indeed,  if  I  click  move  inward, see  how  my  in-spec  portion  is  going  up by  decreasing  the  XF  upper  factor  limit. I  want  to  see  if  I  can  get  to  100 %. If  I  can  get  there   and  have  reasonable  factor  limits, see  what  things  look  like  if  I  keep  heading  for  that. Okay,   I've  gone  to  100 %. My  factor  limits  on  XF  are  a  little  tight, so  let's  see  what  this  looks  like in  the  prediction  profiler. I'm  going  to  send  a  midpoint  up  there. I  think  it  still  looks relatively  reasonable. But  one  thing  I  wanted  to  point  out  is if  you  do  end  up you're  not  happy  with  things, you  can  actually  also  use  this  design space  profiler  to  do  what  if  scenarios, if  it  were  realistic  to  be  able   to  change  your  spec  limits, let's  say  you  were  able to  make  them  wider, or  what  if  your  consumer  demands   that  they're  tighter? You  can  check  out   what  if  scenarios  as  well by  changing  these down  here  in  these  fields. But  in  this  case,  I  think  I'm  pretty happy  with  this  design  space. I  think  it  will  work. I'm  going  to  go  ahead  and  save  the  spec limits  to  the  original  data  table. See,  they  got  saved  here. Once  again,  I've  saved  a  script for  the  contour  profiler. I'm  going  to  run  that  really  quickly. Once  again,  I  can  see  my  design  space   in  terms  of  my  contours. That's  the  faint  rectangles  here, and  the  shaded  regions  are  my  spec  limits. I  could  see  how  the  design  space   is  well  within  my  specs, so  I'm  pretty  happy  with  this. Let's  go  back  to  PowerPoint  just to  give  you  some  takeaways. Okay,   some  quick  takeaways about  the   design space profiler. First  of  all,  the  in-spec  portion  that's reported  in  the   design space profiler, those  values  should  not  be  considered probability  statement  unless  you  think that  your  critical  process  parameter factors  follow  a  uniform  distribution within  the  limits  because  that's  what's being  used  to  create  that  statement  there. 
Also, the Design Space Profiler is not meant for models that have a large number of factors or very small factor ranges, and that is because of the simulated nature of the approach it takes. It's also recommended, as I mentioned a couple of times in my talk, to always use random error for your prediction models, for your responses, because your models are not without error. Finally, finding a good design space is applicable to more than just the pharmaceutical industry, even though that's where the idea came from. That second example was just to demonstrate how it can be used in any industry where you care about having a robust process and maintaining quality. These are my references, and here's my contact information. I wasn't able to show you everything about the Design Space Profiler, so I hope that you will check it out. If you have any questions or any feedback, please contact me. Thank you so much.
This presentation demonstrates the use of JMP for a practical real-world application in high-throughput analytical chemistry; it also highlights the power of even basic JMP functionality to reduce costs without affecting data quality. This talk provides a compelling introduction to the synthetic biology industry, which is at the cutting edge of scientific discovery, aiming to revolutionize how to manufacture chemicals sustainably. At Amyris, hundreds of thousands to millions of different strains are created each year. Each strain is screened using various analytical platforms to identify improvements, allowing us to link genetic coding to specific phenotypes. This scale pushes the boundaries of today's automation and analytical technologies, consequently challenging the scalability of conventional approaches to analytical chemistry. This talk highlights not only the use of JMP as a powerful tool to make a complex analysis easy but also demonstrates the use of the workflow builder to enable general users to perform these analyses. A business case is shown in which JMP helped analyze and improve analytic calibration protocols in the lab.

Today we're going to be talking about using JMP, our favorite software, and applying it to a real-world problem in our analytical sciences department at Amyris, which is a synbio company. Before we jump into that, I wanted to introduce myself as well as Scott, who helped me along this journey. I'm Stefan, an associate director of R&D data analytics at Amyris. I have twelve years of industry experience and a fairly diverse background; I've worked in various labs, from analytical chemistry to fermentation science, and in more recent years I've focused more on the quality and data science side of things. Scott has helped me with a lot of the content here and has been working with Amyris for a number of years, and he is one of the JMP pros working for JMP. I'd like to start off by just saying thank you to Scott for helping us out here. We're going to split the talk today into three parts. I'm going to give a bit of background and context both on synthetic biology, if you haven't heard of that before, and on analytical chemistry. The main part of the talk is really going to be focused on applying JMP to a specific question we had, and then finally we'll wrap it up by briefly touching on automation and then the impact of the analysis and the case study we'll look at together. Some of you may not be familiar with synthetic biology or analytical chemistry, and I really like to understand context and background; it's going to be relatively important for the case study we look at, so we'll focus on that today and start there. Synthetic biology really leverages microorganisms, as we like to call them, as living factories. We use mainly yeast in the case of Amyris, which we precision engineer, and we use the process of fermentation, which is not a new thing; it's something people have been using for thousands of years, mainly to make alcohol and bread in a lot of cases. In our case, we're using the yeast in fermentation, feeding it sugar and converting that sugar into a variety of target ingredients and chemicals.
Those  ingredients and  chemicals  we  can  then  make higher  purity  so  they  may  be higher  performing  lower  costs and  in  a  more  sustainable  fashion. To  give  an  example, this  isn't  just  a  fairy  tale. This  is  reality,  it's  not  an  idea,  we  have 18  molecules  today  that we  manufacture  at  scale  and  I'm  showing a  subset  of  those  here. There's  an  example  on the  top  left  you  have  Artemisinin. That's  an  antimalarial  drug, it  was  our  first  molecule and  that's  how  our  company  was  founded. In  the  top  middle,  we  have  Biophene which  is  actually  a  powerful building  block  that  we  then  convert into  other  chemicals  and  applications. One  example  being  Squalene  which  is  a very  popular  emollient  used in  the  cosmetics  industry and  traditionally is  sourced  from  shark  livers and  one  that  might  be  familiar in  the  bottom  middle,  we  have  patchouli. Some  people  associate  that with  the  hippie  smell,  it's  a  fragrance, but  it's  actually  really  ubiquitous in  the  fragrance  industry as  a  base  note, so  it  goes  into  thousands  of  products. Things  like  Tide  detergent  have  patchouli in  it  and  we  can  manufacture  this, which  is  traditionally extracted  from  plants with  our  synthetic  biology  platform. I  work  in  the  R&D  function, and  so  our  goal  is  really  to  identify the  best  E  strains  that  we  can then  use  at  manufacturing  scale, and  that  requires  research  at  scale. We  run  highly  automated a  lot  of  high- throughput  workflows at  Amyris  in  Emeryville,  and  so from  the  left  there  we  start  screening our  yeast  strains at  a  capacity  of  about 600,000  strains  per  month. We  take  those  top  performers and  we  promote  them to  what  we  call our  bench- scale  bioreactor  fermentations, which  you  can  see pictured  on  the  right  there. Throughout  all  of  this, we're  creating  a  lot  of  strains, which  means  we  also  need  to  understand what's  happening  in  those  strains, what  are  they  producing,  how  much,  and that's  really  where  analytics  come  in. Those  analytics  need  to  be  run at  a  scale  to  match  that  so  we  can  really get  the  data  to  understand what's  happening. With  this  scale  of research,  there's  a  lot  of  opportunities, and  a  lot  of  those  opportunities  come from  looking  at  conventional approaches  and  reconsidering how  to  do  those. I will talk  a  little  bit  about analytical  chemistry. Again,  that's  not  anything  that's unique  to  synthetic  biology. It's  pervasive  in  a  lot  of  industries, petroleum  industry, environmental  sciences,  pharma, very  common  way  just  to  measure  things. I'll  talk  here  really about  chromatography, and  as  an  example,  I'll  take  fermentation that  we  do  on  the  bioreactive  scale. From  this  fermentation,  we're  going to  sample  that  while  it's  running. We're  going  to  get  a  dirty  sample  from that  which  we  then  can  further prepare  and  dilute. We  have  this  mixture of  components  in  this  final  form. We'll  then  take this  mixture  of  components, we'll  run  it  across some  separation  techniques. That's  a  chromatograph. What  that's  going  to  do  is  based on  the  property  of  those  components, might  be  size,  it  might  be  polarity, it'll  allow  us  to  separate  those  out. We  then  feed  that  into some  detection  mechanism. 
There's a variety that you can use, and what that gives you is a separation of these components over time and then some intensity of response. The last piece, and where we're going to focus today, is that intensity isn't really a useful thing for you or me to make decisions on. We need to translate that into something useful, like a concentration. The calibration curve allows us to translate that intensity into a concentration, and of course, you can imagine that if you get that translation wrong, your data is going to be wrong and it's going to mislead you. Calibration curves are where we'll focus today, and that's the heart of the question. A calibration curve is created by running standards with varying levels of your known component. In the example I'm showing here, we have a low, mid, and high, so a three-level calibration. We know what the concentration is in those because we prepared them, and we measure the response on these instruments. From there, we can fit some calibration curve. In this example, I'm showing just a simple linear fit, and then we can run unknown samples, read the response off our instrument, and do an inverse prediction. We're taking our response from the Y and predicting what the quantity is in that sample. It's a very common way to quantify things in unknown samples. That's our background. We're going to jump into the case study, looking at this key question we had around optimizing a part of our process in our analytics. A bit more background here: when we do calibration in our labs, there's a cost associated with it. Maintenance of these calibrations, calibration curves, and calibration standards is expensive, both due to people's time but also materials. These materials can often cost thousands, even tens of thousands, of dollars per gram. With the scale that we're doing our research at, it really pushes us again to reconsider those conventional approaches. We're running millions of samples per year, and we have a really diverse set of analytical methods; we currently have, in our lab in Emeryville, over 100 different analytical methods measuring all components. One place we looked at is the conventional approach. We see this with most people we hire; this is where people start. Conventionally, calibration curves often have five to seven levels, whether they're linear or not. People think about it and say, okay, five to seven levels, linear fit. In theory, the minimum you might need is two, and there's a cost to each additional level, in materials, preparation costs, and maintenance. This is where we wanted to look and ask the question: can we actually reduce this number for an existing method without significant impact on our actual data quality, the way we quantify our unknown samples? This is where JMP comes in. We're going to use JMP here to simulate some alternative calibration schemes, in this case reducing the number of calibration levels. To reiterate what we've walked through, our problem ultimately is that calibration maintenance is costly. That's exacerbated by the scale we do it at.
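As a concrete illustration of the calibration idea just described, here is a minimal Python sketch of an unweighted linear calibration fit and the inverse prediction used to turn an instrument response into a concentration. All of the concentrations and signal values are made up for illustration.

```python
# Sketch of a calibration curve and inverse prediction, assuming a simple
# unweighted linear fit. Standard levels, responses, and the unknown signal
# are invented numbers.
import numpy as np

conc = np.array([1.0, 5.0, 10.0, 25.0, 50.0, 100.0])          # known standard concentrations
signal = np.array([12.1, 60.5, 119.8, 301.2, 598.7, 1204.3])  # measured instrument responses

slope, intercept = np.polyfit(conc, signal, 1)                 # fit signal = slope*conc + intercept

def inverse_predict(y):
    """Translate an instrument response into a concentration."""
    return (y - intercept) / slope

unknown_signal = 250.0
print(f"predicted concentration: {inverse_predict(unknown_signal):.2f}")
```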
Our general approach is really going to be to look at how we can optimize this. Let's look at reducing the levels of those calibrations, and then our specific solution is using JMP to ask the question: if we went back in time and, theoretically, had run two or three calibrators instead of six or seven, how would that have impacted our data? In our case, we're going to focus on a single method today. This is a real method we've been running for about six months. We have 22 batches of samples we've run on this method, so it's about 1,000 samples. Our existing calibration, which I show here on the right, is a linear calibration. It has six levels, and we've estimated that if we can reduce this to the minimum of two levels, we could save an estimated $15,000 a year. There's a real, measurable motivation to understand if we can pursue this. I'm showing here the general workflow that we came up with. I'm going to go through it really quickly right now, but no worries, we're going to walk through it step by step together. We're really just going to pull the historical data, recreate our historical calibration in JMP to validate that what we're doing in JMP matches what we've done historically, and then say, okay, let's eliminate some of these levels, recreate the calibration with those reduced levels, and then evaluate what impact that has on our targets. Now, I think in this case it's also really important to emphasize that we have two pass-fail forks in the road. Often when we're doing analysis on data in hand, we're looking for statistical significance; with studies like this, it's really important to determine what your practical requirements are. In this case, what does that mean? We're talking about impact on the measurement of unknown samples. Ultimately, we want to make sure that reducing the calibration is not going to bias the measurement one way or the other. We want the measurement to be the same. As many people will tell you, "the same" is not really a quantifiable thing; it depends on your sample size and the noise in your process. We need to define what "no different," "same," or "no impact" means. Here we set our acceptance criteria ahead of time: for the first step, accuracy within half a percent, and for the second step, accuracy within 1%. We'll see these come back as we walk through this. For our first step, and on every page here I'm going to show in the top right what step we are at in the process, as well as highlighting what JMP platforms we're using. For our first step, we're going to be pulling our historical data from a database. In our case, we have a LIMS system that already has the data in a structured format. You could also import this from CSV, however you can access the data. In our case, we're pulling it using raw SQL and JSL, and it arrives in a structured format. I'm showing a subset of the columns we have, but what you'll notice is that in this case we have our six calibrators as well as a number of unknown samples.
We're pulling in the historical data as the core data set we're working with. The first step is recreating and validating the same calibration curve, that same six-point calibration, in JMP. Now, you might ask why we have to do this. There are two main reasons. One is that calibration curves can have a lot of caveats. They can have weighting, they can have anchor points, they could be forced through zero, they could be nonlinear. This is a good way to validate that you're using the right parameters in JMP to recreate them. The other reason is that we don't expect these values to be exactly the same, because a lot of this analytical software uses some proprietary regression that is not exactly like, let's say, ordinary least squares regression. To do this, we're going to use the Specialized Modeling Fit Curve platform and really just recreate our calibration curve, just like I showed earlier, where we have the known quantity of our six standards on the X and our raw intensity or signal response on the Y. In our case, we have 22 batches. I'm not showing all of them here, but we're reproducing this for 22 different sequences in essentially one click, with what I call the power of the Control key. If you don't know this trick, it will save you a ton of time: if you hold down the Control key and click on the red triangle button, whatever you do is going to apply to every analysis in that window. I recently learned that's apparently called broadcasting, so you could use that term as well. We're recreating a calibration curve for each of our batches, and then, in the same Specialized Modeling platform, we're saving the inverse prediction formula, because we're predicting from Y to X, if you remember back to our calibration intro, to be able to save the predicted values back to our data table. This then looks like this: on our data table we have first our historical quantity, what we pulled from the database, and now we have our raw quantity that we generated from these newly created calibration curves in JMP. We have a multiplier we have to apply, due to the sample prep we do, which we pull from the database, so that's already there and it's going to stay constant. We simply need to apply a calculated column here to have a comparable value to our historical data. If you look at this first row, our value is very close to, but not exactly the same as, our historical data. Next up, we do a visual comparison, plotting our historical values against the JMP recreation of that calibration, and this is a good check, again, to look through your data. What you would expect, or hope for, is a line that essentially looks like Y equals X. Now, we don't want to stop at a visual analysis. We, of course, want to bring some statistics into it. This is where we introduce the Passing-Bablok regression. It's actually something that was just added into the base JMP functionality, I think with JMP 17; it used to be an add-in for a long time. I'm glad it's there now.
This is a specialized regression that's non-parametric and robust to outliers, and it's really designed specifically for comparing analytical methods. For many of you it's probably irrelevant and you're never going to have to use it, but we need to use it in the world we're working in. What this regression does is give you two hypothesis tests, to test for constant bias as well as proportional bias. Starting with constant bias, where we're seeing if there's bias that moves the line up or down by the same amount across the range, we're evaluating whether the confidence interval of our intercept does or does not include zero. For proportional bias, where the bias would change based on the response, we're evaluating whether the confidence interval of our slope does or does not include one. Now, in our case, we reject the null hypotheses in both of these cases, which tells us that we do have statistically significant bias, both constant and proportional, in our data set. From here you might say, okay, we're done, there's bias, we can't move on. But thinking back, this is why it's really important to define what the practical significance is, because any statistician will tell you that with 1,000 samples in our data set, you're going to be looking at very tight confidence intervals. You're going to be able to detect very small differences. We have a statistically significant difference, but does it matter? That brings us to our last step: we're going to calculate, again using a column formula, the relative difference between the two methods, and I'm showing a distribution of that below. That distribution then gives us access to this test of equivalence. This allows you to test a distribution of values against a constant that you define, within some confidence. Here in this window, we'll enter our target mean as zero, because we hypothesized that they're going to be the same, so no difference. Now we get to enter our acceptance criterion, which was 0.5%. This gives us this very nice output with our final two hypothesis tests where, if we reject these, we can determine essentially that the mean of this data set is equivalent to zero within plus or minus 0.5%. On this one you might say, hey Stefan, this is doing a t-test, your distribution is not exactly normal, and I think you'd be right; if I went back, I might actually use the test mean platform because that gives you access to non-parametric equivalence tests. Regardless, this is a really useful and direct way to test for practical significance. We've pulled our historical data from the database, we've recreated and evaluated the calibration curve, and we've established that it passes our acceptance criteria. If it had failed, it could be an issue with the data set, or you might not be using the right calibration parameters; there are a number of reasons. We generally would pretty much always expect this to pass; it usually just requires some investigation into what's going on in the way you recreated the calibration. Our next step is down-sampling, or reducing the number of levels of our calibration.
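The equivalence check described above can be illustrated with a generic two one-sided tests (TOST) sketch against the plus or minus 0.5% acceptance criterion. This is a stand-in for the JMP equivalence test used in the talk, not a reproduction of it, and the relative differences below are simulated rather than the real per-sample values.

```python
# Sketch of a two one-sided tests (TOST) equivalence check: is the mean relative
# difference between the recreated and historical quantities within +/- 0.5%?
# The per-sample differences are simulated stand-ins for the real data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
rel_diff_pct = rng.normal(0.02, 0.15, 1000)   # hypothetical relative differences, in percent

margin = 0.5                                   # acceptance criterion: +/- 0.5%
t_lower, p_lower = stats.ttest_1samp(rel_diff_pct, -margin, alternative="greater")
t_upper, p_upper = stats.ttest_1samp(rel_diff_pct, margin, alternative="less")

# Equivalence is claimed only if BOTH one-sided nulls are rejected.
print(f"p(mean > -{margin}%) = {p_lower:.3g}, p(mean < +{margin}%) = {p_upper:.3g}")
print("equivalent within +/-0.5%" if max(p_lower, p_upper) < 0.05 else "not shown equivalent")
```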
Now, if we tried to do this without JMP, we would have to go into every single sequence in our analytical software, manually remove calibrators, and recalculate things. It would be a really long and tedious process. In JMP, this is as easy as using the data filter. In our case, with this six-point calibration, we have a linear fit, and we know that the minimum number of points we need for a linear fit is two. We're picking the highest and the lowest calibrators and just filtering down to those. From here I'm going to go pretty quickly, but really all we're going to do is recreate this calibration with two points in JMP. Again, we're using the Specialized Modeling platform, doing a fit line; the only difference now is that we have two points instead of six. We're applying that inverse prediction formula back to the data table, which again gives us our inverse prediction, and then we apply the multiplier. Because I know I'm going to test the practical significance, I'm going to preemptively calculate a relative difference between the two-point calibration and the historical value. Again, we go through the Passing-Bablok and, not so surprisingly, considering the size of our data set, we reject the null hypotheses here and establish that we have statistically significant bias, both proportional and constant. We move on to test our acceptance criteria. Remember, now our threshold is 1% instead of 0.5%, and that comes from working with the stakeholders of the data to establish what an acceptable equivalency is. That's always important pre-work to do, and we're going to test that equivalency. Here we find that these two methods are equivalent within plus or minus 1% on the unknown samples, and that's really important. We're using those historical, real-world samples to ask the question: what if we went back in time, reran all these calibrations with two points, and reported the data for these unknown samples? How would those values change? On average we see that they change very little, and so it gives a lot of credence to considering reducing those calibration levels. We've essentially demonstrated this now, and so the calibration on the left and the calibration on the right, we're saying, are equivalent and are going to provide equivalent quantitation within 1%, and so we have essentially the evidence we need to push for this change. We passed our first check, we reran the evaluation with the two points, we passed that, and now we're at our final step of implementing those changes in our process. Now, this is arguably the most important part: if you do an analysis and just leave it sitting there, it doesn't do much good. This can sometimes be the hardest part. You have to go out, you have to convince people, especially in cases like this, and you have to take into consideration whether there are additional things that this analysis didn't consider.
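The down-sampling step itself is straightforward to sketch: refit the calibration using only the lowest and highest standards and compare the inverse predictions for unknown samples against the full six-point fit. The standards, signals, and unknown responses below are invented numbers, intended only to show the shape of the comparison feeding the 1% equivalence check.

```python
# Sketch of the down-sampling comparison: refit the calibration with only the
# lowest and highest standards, then look at the relative difference in
# predicted concentrations for unknown samples. All numbers are illustrative.
import numpy as np

conc = np.array([1.0, 5.0, 10.0, 25.0, 50.0, 100.0])           # six calibration levels
signal = np.array([12.1, 60.5, 119.8, 301.2, 598.7, 1204.3])

def fit_and_predict(x, y, unknown_signals):
    slope, intercept = np.polyfit(x, y, 1)
    return (unknown_signals - intercept) / slope                 # inverse prediction

unknowns = np.array([80.0, 250.0, 640.0, 990.0])                 # responses from unknown samples

full = fit_and_predict(conc, signal, unknowns)                   # six-point calibration
two_point = fit_and_predict(conc[[0, -1]], signal[[0, -1]], unknowns)  # lowest + highest only

rel_diff_pct = 100 * (two_point - full) / full
print(np.round(rel_diff_pct, 3))                                 # compare against the 1% criterion
```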
I'm happy to talk to anyone about that, but we're not going to go in depth on the other considerations we had to think about before putting this into action. With this example, we did actually end up reducing the calibration levels from six to two, and that reduced the annual cost of running that method by about $15,000. From there we might say, okay, what now? Are we done? Of course not. We did it for one method; we have a suite of another 100-plus methods that may also have these many-level calibrations that might be overkill for what we need. We want to look at repeating the analysis for other methods, and that's where I think automation comes in. It is a really great way to scale these one-off analyses for ourselves, but also for others. My rule of thumb is: if I find myself doing an analysis more than two or three times, let's build that out in automation and save future me a lot of time by spending a little time now. I'll touch on this very briefly, and I want to shout out Scott here for helping me with a lot of the workflow builder work and the scripting, but these native automation tools in JMP are really powerful and very user-friendly. There are a lot of code-free options, and there are really different ways you can do this. You can do it, as on the left side, in the classic way, doing all the scripting; it even allows you to save global variables, so it gives you a place for users to enter their acceptance criteria, which might change. Or you can leverage the workflow builder, which is a bit of a newer feature but really lets you build out this automation. Even if you just want to script it raw, you can build the framework that you can then flesh out. The two things I will say about this are, first, that how much you can automate, or how much effort you have to put into it, is going to be limited to some extent by how rigid that workflow is. If users need it to be really flexible and need to interact with it, it can become very challenging to automate, and of course data consistency is key as well. Second, this is really a great tool to help others reproduce the analysis, but you really do have to train them and document the work, and make sure they know what it's actually doing. As we all know, every analysis has its caveats. You need people not just to click and get a report, but also to understand a little bit what potential things could come up, especially if you're trying to future-proof the work. I'd like to bring it back together and wrap it up there, and I hope today I've shown you that with JMP you don't have to do crazy complex or sophisticated things; you can piece together a lot of simple functionality to create really impactful workflows. Whether you're working in a lab or elsewhere at your organization, look to identify these improvements in existing workflows. If you are in the situation most of us are in, there's more data than we know what to do with.
Look at the data that no one is looking at, and then challenge the conventional thinking. The way we're working is always changing; ask why we do it this way. In our case, for a long time, this was simply the way we did it: a five or six-point calibration. Ask why. What if we didn't? What would the impact be? Of course, I don't have to tell anyone listening here: use JMP for the scalable analysis, and then use automation to make it easy. It really doesn't have to be fancy; it just has to work for what you need it to do. Finally, you can use that to implement impactful change and use data to drive those decisions. It's probably one of the most convincing tools that we have today. If you're talking to management, do it in units of dollars, because they love that. I'll wrap it up there. The last thing I'd like to say is just a thank you to the JMP Discovery Summit committee and all the people organizing, a special thank you to Scott for all the help he has given me in the past with Amyris, but also with this talk and this analysis, and then a number of people at Amyris who were involved with this. And with that, I will wrap it up. Thank you. Bye.
Several authors have addressed validating statistical software. More work is needed to assess the usability of such software since modern statistical software is increasingly in the hands of users with limited statistical training to address complex real-world problems. This poster presents a usability case study for a design of experiments tool in JMP.   The study focuses on a dialog used to specify a designed experiment. Such specifications require users to identify factors, responses, the linear regression model, and constraints. The cost of conducting experiments is usually a primary concern, so users typically iteratively refine specifications before experimenting. As a result, the ability to easily specify and change specifications is essential to users.   We begin with the challenges of the current dialog related to discoverability, clarity, and tedium when performing everyday tasks. We present details of the usability study, including dialog prototypes that address various challenges. We discuss how these prototypes were dynamically generated with the JMP Scripting Language and how the usability of each prototype was assessed by using simple and complex tasks. We discuss the variables and regression model used to assess the potential effect of each variable or combinations of variables. We also explain how we used JMP Custom Design to determine the best configurations for each subject, with the order of the configurations accounting for the subject’s learning over the study, as well as how qualitative data were collected by using an integrated questionnaire that was presented after all tasks were completed.     Hello.  I'm  Mark  Bailey.   I'm  here  with  my  colleague  Jacob  Rhyne, to  talk  about  some  recent  work  on  the  new user  interface  in  JMP  Custom  Design. The  primary  focus  of  our  work was  a  usability  study to  evaluate  the  new  interface. First,  I'm  going  to  talk   a  little  bit  about  the  current  interface and  our  motivation  for  the  change. We  think  of  experimentation  as  a  workflow. In  the  very  first  step,  we  define our  factors,  responses,  and  goals. This  critical  step  determines so  much  in  what  follows. Adding  factors  is  a  very  important  step when  you're  designing  an  experiment. We  learned  that  customers have  some  difficulty with  common  tasks  around  defining  factors. There's  confusion  about  adding, deleting  or  renaming  factors. They  find  it  difficult  to  change the  factor  type,  the  factor  levels, even  reordering  the  factors  in  the  list. Figure  2  shows  the  original User  Interface  for  comparison. The  goals  for  this  prototype were  to  decrease  confusion, to  reduce  the  number  of  operations  in  the time  that  it  took  to  achieve  the  result. Our  usability  study  wanted to  compare  different  controls that  might  be  used to  achieve  these  goals. Figure  3  shows  the  prototype for  the  new  user  interface. This  prototype is dynamically  configurable. That  is,  we  can,  that  will  turn different  user  controls  on  and  off. This  led  to  a  prototype with  384  possible  configurations. It  was  impossible  to  include  all of  them  in  a  usability  study. A practical  study  required a  smaller,  optimal  design. Looking  at  figure  3, especially  along  the  top, you  see  the  User  Controls that  we're  going to  primarily  focus  on in  our  usability  study. 
You see the Trash icon, the Delete icon, the Plus and Minus buttons, the Undo and Redo buttons, the Add N Factors control, and what we call our Buffet menu button. In figure 4, you see just two of the 384 possible configurations. In order to make sure that the usability study was efficient and effective, we used Custom Design to make most of the runs for our study. In this case, some of the User Interface controls were treated as binary factors; that is, they were turned on or off in the prototype. Other User Interface controls could have more than just two states. Each time, the subject was asked to perform several different tasks repeatedly with different configurations for comparison. In the table in the lower left, you see all of the factors that were included in our study. The first seven have to do with controls in the User Interface that were turned on or off or set to a different state. The last factor is the one that determined what we wanted them to do with a particular configuration. That is, we asked them to delete a factor, change the type of a factor, or change its position in the list. On the right, in figure 5, you see the completed Custom Design platform with the response defined. We're going to measure the time it takes for them to complete each task, and we're trying to minimize that. It also shows all of the factor definitions and, finally, the model that we want to use to quantitate the time to complete each task. I'm now going to turn it over to Jacob, who's going to talk about the resulting Custom Design, how it fit into our study, and our findings. Thank you, Mark. Mark has laid out how we used the Custom Design platform in JMP to set up our usability study. In figure 6, you can see all the runs in the study for the first participant. I'll note that we did include Participant and Run Order in our design as covariates because we wanted to include these in our analysis as well. You'll also note from figure 6 that we included a few of what we call special tasks in our study. These were tasks we knew would take longer to complete, and we didn't want to ask participants to do them more than once. If you look at the final tasks of the study, these were special exercises where we asked the participants to complete a series of tasks using a version of the prototype and the same series of tasks using the Custom Design platform. This allows us to get a direct comparison between the new configurable prototype and Custom Design. To apply this Custom Design, we developed a custom JSL application that allowed us to conduct an unmoderated usability test. Now I'll give an example of how this works in figure 7. From the top of figure 7, you can see that, for Run Order 2, the task was to delete. As you can see in the second part of the screenshot, the instructions the participant saw in the study instructed them to click the Launch Prototype button and then delete the factor charge.
Now I'll give an example of how this works in figure 7. At the top of figure 7, you can see that for Run Order 2 the task was to delete. As you can see in the second part of the screenshot, the instructions the participant saw in the study told them to click the Launch Prototype button and then delete the factor Charge. When the participant clicked the Launch Prototype button, the version of the prototype that appeared, shown in the third part of the screenshot, was entirely determined by the other factors in our study. For example, you can see the Trash icon is set to off, which means you don't see the Trash can icon down here in the prototype. The Delete icon is on, so you do see the Delete icon. For every exercise in our study, the version of the prototype that appears and the instructions that the participant sees are entirely determined by our Custom Design. Let's look at some of the qualitative results from our study. In our usability study script, we included an integrated questionnaire at the end, where we asked participants to give us some feedback on what they liked, what they didn't like, et cetera. I'll highlight a few of the options in figure 8. We asked participants whether they preferred using the Plus button or the Add N Factors control to add factors, and you can see from the top of figure 8 that participants overwhelmingly preferred the Plus button to the Add N Factors control. We also asked participants how they liked deleting factors, either using the Delete icon or the Trash icon. In this case, the Delete icon was overwhelmingly preferred. We also asked participants what their favorite row icon was, the icon you see beside the rows in the factor list. The Pin icon was the most popular, getting five of the 10 votes, compared to the Oval icon, which only got two of the 10 votes. Finally, if you look toward the bottom of figure 8, you can see that the participants overwhelmingly agreed that the user interface of the new prototype was easy to learn. The new interface was also greatly preferred to the original factor list in Custom Design. Now let's look at some of the quantitative results from our study. We got these results by using the Generalized Regression platform in JMP to fit models for the time it takes to complete each task, where the effects in the model are the different elements of the prototype. If you look at figure 9, the output of our generalized regression, you can see that the factors Row States, Trash icon, and Pin Column Type were found to have a significant impact on time. You can also see from the profiler that Run Order did not have a significant impact on time in this case. Let me draw your attention to the Pin Column Type factor. You can see that the time to complete the task of changing a factor's order was minimized when the Oval icon was used to complete the task. I'm pointing out the Pin Column Type factor specifically because, in this study, the way participants would change the order of a factor is that they would click the row icon beside the factor in the factor list and drag that icon to the appropriate spot. In this case, the Pin Column Type is what I wanted to focus on.
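As a rough sketch of this kind of fit, the JSL below models time to complete against a few of the interface factors. The authors used the Generalized Regression personality; this sketch substitutes ordinary standard least squares to keep the example simple, and all column names are hypothetical.

Fit Model(
	Y( :TimeSeconds ),                                       // response: task completion time in seconds
	Effects( :RowStates, :TrashIcon, :PinColumnType, :RunOrder ),
	Personality( Standard Least Squares ),
	Emphasis( Minimal Report ),
	Run
);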
Looking at figure 10, you can see that when participants were asked to tell us their favorite, only two of the 10 participants reported liking the Oval icon. Yet when it came to completing the task in which participants interacted with the icons the most, the Oval icon was actually the best-performing icon. Now let's look at the results for the task of deleting a factor. In this case, participants were asked, given a version of the prototype, to remove one factor from the factor list, and they could do that by using the Delete icon, the Trash icon, or, in some configurations, either one. Again, we fit models using generalized regression with time as the response, and you can see that the icon that was used and Run Order both had a significant impact on time. The time for this task was minimized when the Delete icon was used, as opposed to using the Trash icon or having the option to use both. In contrast to the previous slide, our quantitative results here match the qualitative results, because as you can see in figure 12, the participants overwhelmingly preferred the Delete icon to the Trash can icon. I'll end by commenting on the results of the last two exercises in the study, where we had participants complete a series of exercises using the prototype and then complete the same series of exercises using Custom Design. The instructions for the exercises were the same; the only difference is that in one case you use the prototype, and in the other case you use Custom Design. When participants were given the prototype, it took them an average of 68 seconds to complete the exercise. When participants used the Custom Design platform, it took them an average of 316 seconds. I'll also note that this 316 seconds is right-censored, because a couple of the early participants we gave the study to reported that they gave up. After this, we started popping up a notification after five minutes saying, "We thank you for completing this exercise. You can move on to the next one." That's all we have to share today. We want to thank you for your interest in our poster, and please let us know if you have any questions.
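A minimal sketch, assuming a 300-second cap, of how the elapsed task time and the right-censoring flag mentioned above could be recorded in JSL; the variable names and the message text are illustrative.

startTime = Tick Seconds();
// Pop a reminder after five minutes so a participant who is stuck can move on.
Schedule( 300, New Window( "Time check",
	Text Box( "We thank you for completing this exercise. You can move on to the next one." )
) );
// When the participant finishes (or gives up), record the result:
elapsed = Tick Seconds() - startTime;   // seconds spent on the exercise
censored = elapsed >= 300;              // 1 means the five-minute cap was reached (right-censored)
Show( elapsed, censored );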
This project studied JMP Black Belt exam patterns among students to understand exam difficulty and discrimination across platforms in JMP 16. Through the preliminary scoring distribution, the instructor can identify each exam question's difficulty and discrimination. These statistical parameters can then be used to determine which questions should be included in which level of examination, such as Green Belt or Black Belt. By comparing three different Black Belt STIPS exams with a dual plot, questions can be designed for different types of exams. The characteristic curves relate to the difficulty and discrimination properties of each question and are then compared with each student's ability and probability of answering correctly. Dual plots use the same scale to show the relative difficulty of each question against a group of students' ability. Parameter estimates are used to measure the difficulty and discrimination of each question. The sample questions in the three STIPS exams use four kinds of questions and four levels of difficulty and discrimination. By using the JMP Item Analysis platform, instructors can adapt any learning program and their instruction to accommodate different levels of students according to the difficulty and discrimination of test questions. Hello, everyone. My name is Kemp Wu. I come from Taiwan, and I work at Applied Materials in the Continuous Improvement Group. Today, I will present BB JMP Exam Item Analysis. First, a little background. In 2023, the statistical software used in the GB/BB training course changed to JMP, so modifying the GB/BB exam questions to be based on JMP is necessary. In this project, we use Item Analysis to study the 2022 BB JMP exam questions to support this change to the GB/BB exam questions. Item Analysis helps us identify the difficulty and the discrimination of each question. For GB, we need easier questions, because it is an entry-level requirement for beginners; we don't want too many hard questions in the GB exam. When we move to the BB exam, the questions should be harder and more discriminating. Last year, the BB JMP training program had three chapters: step one is Intermediate Statistics, step two is Graphical Analysis, and step three is MSA, PCA, and SPC. Based on that, we would like to identify each question's difficulty and discrimination, look for any opportunity to modify the training material and exam questions based on the results, and assess the capability of the questions to judge the ability of the trainees. These are last year's BB JMP exam results. Step two, Graphical Analysis, is very basic: reading and drawing graphs. Step three, MSA, PCA, and SPC, is more advanced. For the basic exam, the average score is higher; for the advanced one, the score is lower. If we rely only on the raw scores, it is very difficult to separate GB questions from BB questions. That's why we need JMP Item Analysis: it delivers more information for the analysis, such as each question's difficulty and discrimination.
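Before launching Item Analysis, each answer has to be scored as correct or incorrect. Here is a minimal JSL sketch, with a hypothetical table and column names, of turning raw multiple-choice answer columns into 1/0 indicator columns.

dt = Current Data Table();   // assumes the active table has one row per student
// Score one hypothetical question, Q1, whose correct answer is choice "D".
dt << New Column( "Q1Correct", Numeric, "Nominal",
	Formula( If( :Q1 == "D", 1, 0 ) )
);
// Repeat for the remaining questions, e.g. Q2 with correct answer "B".
dt << New Column( "Q2Correct", Numeric, "Nominal",
	Formula( If( :Q2 == "B", 1, 0 ) )
);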
How do we use the JMP Item Analysis application on the GB and BB exams? First, let's look at the characteristic curve. The characteristic curve shows the relationship between the probability of correctly answering an item and the student's ability, for different probability levels on the ability scale; it also shows the connection between the item's difficulty and discrimination attributes. There are two parameters in the characteristic curve. The first is the a parameter, which describes the discriminating power. The other is the b parameter, which describes the item difficulty. In the chart at the bottom left, the y-axis is the probability of answering correctly, from 0 to 1, where 1 means 100%. The x-axis is the ability, from minus 4 to plus 4; a higher number means more ability, and an item centered farther to the right is more difficult. For the a parameter, the steeper the curve, the more powerful the discrimination. The b parameter is about the difficulty: in this chart, the right-hand curve is more difficult than the left-hand curve. For example, for a student with ability zero, the right-hand curve gives only about a 10% chance of answering correctly, while the left-hand curve gives about a 90% chance. Looking at the example of question 24, its nearly vertical curve shows strong discrimination, while question 12 on the right-hand side shows difficulty, with its [inaudible 00:06:28] curve rising gradually from the bottom left to the upper right.
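In symbols, the characteristic curve just described is the two-parameter logistic (2PL) item response model,

$$P(\text{correct} \mid \theta) = \frac{1}{1 + e^{-a(\theta - b)}},$$

where \(\theta\) is the student's ability, \(b\) is the item difficulty (the ability at which the probability of a correct answer is 50%), and \(a\) is the discrimination (the steepness of the curve at \(\theta = b\)). The numbers that follow are only illustrative, not the exam's actual estimates: with \(a = 1\) and \(b = 2\), a student at \(\theta = 0\) has \(P = 1/(1 + e^{2}) \approx 0.12\), close to the roughly 10% chance quoted for the harder curve, while with \(b = -2\) the same student has \(P = 1/(1 + e^{-2}) \approx 0.88\), close to the roughly 90% chance for the easier curve.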
Next, we look at the parameter estimates. The parameter estimates report shows, for each question, the difficulty and the discrimination, both as numbers and as histograms. For the difficulty, an easy question is less than zero and a hard question is greater than zero. For the discrimination, a flat curve means the question has essentially no discrimination. In this example of our results for step one, we have nine questions shown in the parameter estimates, and the same numbers also appear in the characteristic curves. We separated the questions into groups based on difficulty and discrimination. Group 1 is the green zone: easy with no discrimination. You can see that the red vertical reference line is less than zero, so the question is easy, and the flat blue curve means there is no discrimination. Group 2 is easy with discrimination: the gray reference line is less than zero, and the curve is nearly vertical, so a student above that line is almost certain to answer correctly, while a student below it is very unlikely to. Group 4 is the hard questions with discrimination: the ability needed is above zero, again shown by the vertical reference line on the blue curve. Group 3, hard with no discrimination, had no sample in this step, so there is nothing to show for that category. Next we go to the Dual Plot. The Dual Plot report contains a plot that shows question difficulty and student ability on the same scale, on a y-axis running from minus 4 to 4. There are three zones. The first zone is the question difficulty: the higher a question sits, the more difficult it is, and the lower it sits, the easier it is. The middle zone is the student ability: the higher a student sits, the more able they are, and the lower they sit, the less able. How do we compare question difficulty with student ability? For example, compare student 2 with question 28: because student 2 sits higher than question 28, student 2 has a high probability of answering that question correctly. The opposite holds for student 1: comparing student 1 with question 15, it is very hard for student 1 to answer it correctly. The right zone shows the students as a histogram, here split into two groups, one from minus 2 to 0 and another from 0 to 2. In this example the distribution is bimodal: the part above zero is for the GB and the part below zero is for the BB exam. We also show the results of the three STIPS exams with dual plots. First we look at the cell plot to see the reference for each question: the x-axis is the questions and the y-axis is the students; red means the answer was right and blue means the answer was wrong. Then we come to the dual plots. For step one, the pattern shows that student 5 performed better than the others. For step two, there are two very good students. For step three, the students in this group are all better than zero; student 6 is not that good, but compared with student 1 there is a very large difference, which means student 1 is an outlier. From this we see four ways to improve. First, we can use ability zero to separate the questions: the upper side is for the GB and below zero is for the BB. Second, we can use discrimination to reveal different kinds of patterns; for example, in step one we can find the best-performing student, and in step two we can separate the people who studied from those who did not. Third is question difficulty: for one question, none of the students answered correctly, so maybe the training material needs to be improved, or it covers an area we did not teach the students. Last, we can also find which student needs extra coaching, such as student 1; maybe we need to pay more attention to him. All of these decisions are up to you: you can look at the dual plot to see whether the exam achieves your expectations.
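The dual plot, the characteristic curves, and the parameter estimates all come from the Item Analysis platform. A hedged JSL sketch of the launch is shown below, using the hypothetical scored-column names from the earlier sketch; the exact launch options may differ by JMP version, and the model choice is left at its default here.

// Launch Item Analysis on the 1/0 scored question columns.
Item Analysis(
	Y( :Q1Correct, :Q2Correct, :Q3Correct )
);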
Now I will show you real cases for each group of questions. Group 1 is easy with no discrimination. Here is the question and its answer choices; for this question, the answer is D. It is a very easy and simple JMP operation, and it can be found in the training material. You can see from the cell plot that most people answered it correctly. Group 2 is easy with discrimination. This question is about a parallel plot, and the answer is B, negative correlation. Even though the parallel plot is easy, if students don't really understand how to read a parallel plot, they cannot choose the correct answer. Next is group 3, hard with no discrimination. This question is hard, and the training material even guides you toward answer D. We can see that most people answered this question wrong, so maybe we need to reconsider whether the training material needs to be corrected. The last one is group 4, hard with discrimination. The answer choices all look very similar and are hard to decide between, but if you pay attention to the details of the case, you will understand that the answer is B. With Item Analysis, we can analyze each question's difficulty and discrimination and dig into the questions for continuous-improvement opportunities. Based on these question groups, we can deliver a GB and BB proposal. For the GB exam, we would use about 30% easy, non-discriminating questions as basic questions for the students, about 60% easy, discriminating questions to separate the students who have studied from those who have not, and about 10% hard questions for the high-performing students. For the BB exam, we would not use the basic questions, because it needs to be more difficult; we would use about 30% easy questions with discrimination, since not every question needs to be hard, and about 70% hard questions with discrimination. The hard questions with no discrimination are kept only for reference. Of course, the percentages can be adjusted depending on the exam requirements. Last, the takeaways: JMP Item Analysis can help organize GB and BB exam questions based on JMP; it helps detect the difficulty and discrimination of the exam questions so the instructor can identify any gap in teaching efficiency; the GB/BB training material can be modified based on the JMP Item Analysis models; and the approach can be proliferated to other internal training programs, such as the 8D program or Lean training. Thank you very much.
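As a small worked example of the proposed mix, the JSL below computes the number of questions of each type for a hypothetical 20-question GB exam and a 20-question BB exam; the exam lengths are assumptions, not part of the proposal.

// Proposed GB mix: 30% easy/no discrimination, 60% easy/discriminating, 10% hard.
nGB = 20;
gbEasyNoDisc = Round( 0.30 * nGB );   // 6 basic questions
gbEasyDisc   = Round( 0.60 * nGB );   // 12 questions that separate studied from not studied
gbHard       = Round( 0.10 * nGB );   // 2 questions for high performers
// Proposed BB mix: 30% easy/discriminating, 70% hard/discriminating.
nBB = 20;
bbEasyDisc = Round( 0.30 * nBB );     // 6 questions
bbHardDisc = Round( 0.70 * nBB );     // 14 questions
Show( gbEasyNoDisc, gbEasyDisc, gbHard, bbEasyDisc, bbHardDisc );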