Explainable AI: Unboxing the Blackbox (2022-US-45MP-1147)

Peter Hersh, JMP Senior Systems Engineer, SAS
Florian Vogt, Systems Engineer, JMP
Russ Wolfinger, Director of Scientific Discovery and Genomics, JMP
Laura Lancaster, Principal Research Statistician Developer, JMP

 

Artificial intelligence algorithms are useful for gaining insight into complex problems. One of the drawbacks of these algorithms is they are often difficult to interpret. The lack of interpretability can make models generated using these AI algorithms less trustworthy and less useful. This talk will show how utilizing a few features in JMP can make AI more understandable. The presentation will feature performing “what if” hypothesis testing using the prediction profiler, testing for model robustness utilizing Monte Carlo Simulations, and analyzing Shapley values, a new feature in JMP Pro 17, to explore contrastive explanations.

 

 

Welcome to the talk Explainable AI: Unboxing the Blackbox. Let's introduce ourselves, and let's start with Laura.

Hello, I'm Laura Lancaster, and I'm a statistical developer at JMP, located in the Cary office. Thanks.

What about you, Russ?

Hey, everyone. Russ Wolfinger. I'm the Director of Life Sciences R&D in the JMP group and a Research Fellow as well. Looking forward to the talk today.

And Pete?

My name's Peter Hersh. I'm part of the Global Technical Enablement team, and I'm located in Denver, Colorado.

Great, and my name is Florian Vogt. I'm a systems engineer for the chemical team in Europe, and I'm located in beautiful Heidelberg, Germany.

Welcome to the talk. AI is a hot topic at the moment, and a lot of people want to do it. But what does that mean for the industries? Does it mean that scientists and engineers need to become coders, or will processes in the future be run by data scientists?

A recent publication called Industrial Data Science - a Review of Machine Learning Applications for Chemical and Process Industries explains industrial data science fundamentals, reviews industrial applications using state-of-the-art machine learning techniques, and points out some important aspects of industrial AI. These are the accessibility of AI, the understandability of AI, and the consumability of AI, in particular of its output. We'll show you some of the features in JMP that we think contribute to this topic very well.

Before we start into the program of today, let's briefly review what AI encompasses and where our focus today lies.

I've picked a source that separates it into four groups. Those groups are: first, supporting AI, also called reactive machines, which aims at decision support. The second group is called augmenting AI, or limited memory, and that focuses on process optimization. The third group is automating AI, or theory of mind, which, as the name suggests, aims at automation. And the fourth is called autonomous AI, or self-aware AI, which encompasses autonomous optimization. Today's focus is really on the first and the second group.

We had a brief discussion before. Russ, what are your thoughts on these, also with respect to what JMP can cover?

Well, certainly the term AI gets thrown around a lot. It's used with many different nuanced meanings. I tend to prefer meanings that are more tangible, usable, and focused, like the ones we're going to zoom in on today with some specific examples. The terminology can get a little confusing, though. I guess I just tend to keep a fairly broad, open mind whenever anyone uses the term AI and try to infer its meaning from the context.

Right. That's it in terms of introduction. Now we'll get a little bit more into the details, and specifically into why it is important to actually understand your AI models. Over to you, Pete.

Perfect. Thanks, Florian.

I think what Russ was hitting on there, and Florian's introduction, is that we often don't know what an AI model is telling us and what's under the hood. When we think about how well a model performs, we think about how well it fits the data. If we look here, we're looking at a neural network diagram, and as you can see, these can get pretty complex.

These AI models are becoming more and more prevalent and relied upon for decision making. Really, understanding why an AI model is making a certain decision, and what criteria it's basing that decision on, is imperative to taking full advantage of these models. When a model changes or updates, especially with that autonomous AI or the automating AI, we need to understand why. We need to confirm that the model is not, say, extrapolating or basing its decision on a few points outside of our normal operating range.

Hold on. Let me steal the screen here from Florian, and I'm going to go ahead and walk through a case study.

All right, so this case study is based on directional drilling from wells near Fort Worth, Texas. The idea with this type of drilling is that unlike conventional wells, where you would just go vertically, you go down a certain depth and then you start going horizontal. The idea is that these are much more efficient than traditional wells: you have these areas of trapped oil and gas that you can get at with some special completion parameters.

We're looking at the data from these wells, and we're trying to figure out which are the most important factors, including the geology, the location, and the completion factors, and whether we can adjust these factors to increase or optimize our well production.

To give you an idea, here's a map of that basin; like I mentioned, this is Fort Worth, Texas. You can see we have wells all around it. We have certain areas where our yearly production is higher, others where it's lower.

We wanted to ask a few questions looking at this data. What factors have the biggest influence on production? If we know certain levels for a new well, can we predict what our production will be? Is there a way to alter our factors, maybe some of the completion parameters, and impact our production?

We're going to go ahead and answer some of those questions with a model. But before we get into that, I wanted to ask Russ, since he's got lots of experience with this. When you're starting to dig into data, Russ, what's the best place to start and why?

Well, I guess maybe I'm biased, but I love JMP for this type of application, Pete, just because it's so well suited for quick exploratory data analysis. You want to get a feel for the target you're trying to predict and the predictors you're about to use, looking at their distributions and checking for outliers or any unusual patterns in the data. You may even want to do some quick pattern discovery, clustering, or PCA-type analysis, just to get a feel for any structure that's in the data.

Then also be thinking carefully about what performance metric would make the most sense for the application at hand. The common one for a continuous response would be root mean square error, but there could be cases where that's not quite appropriate; especially if there are direct costs involved, sometimes absolute error is more relevant for a true profit-loss type decision.

These are all things you want to start thinking about, as well as how you're going to validate your model. I'm a big fan of k-fold cross validation, where you split your data into distinct subsets and hold one out, being very careful about not allowing leakage and also careful about overfitting. These are all concerns that tend to come top of mind for me when I start out with a new problem.
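For readers who want to try Russ's checklist outside of JMP, here is a minimal scikit-learn sketch of that validation setup: one k-fold plan scored with both root mean square error and mean absolute error, since the right metric depends on the application. The data here are synthetic stand-ins, not the well dataset.

```python
# Minimal sketch of the validation setup Russ describes (synthetic data,
# not the well dataset): k-fold cross validation scored with two metrics.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=1)

model = GradientBoostingRegressor(random_state=1)
cv = KFold(n_splits=5, shuffle=True, random_state=1)  # distinct subsets, hold one out

# Compare two performance metrics; which is "right" depends on the application.
rmse = -cross_val_score(model, X, y, cv=cv, scoring="neg_root_mean_squared_error")
mae = -cross_val_score(model, X, y, cv=cv, scoring="neg_mean_absolute_error")
print("RMSE per fold:", rmse.round(1))
print("MAE per fold: ", mae.round(1))
```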

Perfect. Thanks, Russ.

I'm going to walk through some of the tools in JMP we can use to start looking at our problem. Then we're going to cover some of the things that help us determine which of these factors have the biggest impact on well production. I'm going to show variable importance and then Shapley values, and we'll have Laura talk to those and how we compute them.

But first, let's go ahead and look at this data inside of JMP. Like we mentioned, I have my production from these wells. I have some location parameters, so where the well is, latitude and longitude. I have some geologic parameters; these are about the rock formation we're drilling through. Then I have some completion parameters, and these are factors that we can change while we're drilling, so some of these we have influence on.

This dataset only has 5,000 rows. When talking to Russ while starting to prep this talk, he said to just go ahead and run some model screening and see what type of model fits this data best. To do that, we're going to go under the Analyze menu, go to Predictive Modeling, and hit Model Screening. I'm going to put my response, which is that production, take all of our factors, location, geology, and completion parameters, put those into the X role, and grab my validation column and put it into Validation.

Down here we have all sorts of options for the types of models we can run. We can pick and choose which ones make sense or don't make sense for this type of data, and we can pick out some different modeling options for our linear models. Even, like Russ mentioned, if we don't have enough data to hold back for validation, we can utilize k-fold cross validation in here.
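Conceptually, Model Screening automates something like the sketch below: fit several candidate model families under one cross-validation plan and compare their fit statistics side by side. This is a scikit-learn illustration of the idea on synthetic data, not JMP's implementation.

```python
# Conceptual sketch of what Model Screening automates: several model families,
# one cross-validation plan, one comparison table. (Illustration only.)
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=5000, n_features=10, noise=25.0, random_state=1)
cv = KFold(n_splits=5, shuffle=True, random_state=1)

candidates = {
    "Linear Regression": LinearRegression(),
    "Bootstrap Forest (random forest)": RandomForestRegressor(random_state=1),
    "Boosted Tree": GradientBoostingRegressor(random_state=1),
}
for name, model in candidates.items():
    r2 = cross_val_score(model, X, y, cv=cv, scoring="r2")
    print(f"{name:32s} mean CV RSquare = {r2.mean():.3f}")
```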

Now, to save some time, I've gone ahead and run this already, so you don't have to watch JMP create all these models. Here are the results. For this data, you can see that the tree-based methods, Boosted Tree, Bootstrap Forest, and XGBoost, all did very well at fitting the data compared to some of the other techniques. We could go through and run several of these, but for this one I'm going to just pick the Boosted Tree, since it had the best RSquare and root average squared error for this dataset. We'll go ahead and run that.

After we've run the screening, we pick a model, or a couple of models, that fit well and just run them. All right, so here's the overall fit in this case. Depending on what type of data you're looking at, maybe an RSquare of .5 is great, maybe an RSquare of .5 is not so great. Depending on what type of data you have, you can judge whether this is a good enough model or not.

Now that I have this, I want to answer that first question: knowing a few parameters going in, what can I expect my production level to be? An easy way to do that inside of JMP, with any type of model, is with the profiler.

Okay, so we have the profiler here, we have all of the factors that were included in the model, and we have what we expect our 12-month production to be. Here I can adjust for a certain location if I know the latitude and longitude going in, and maybe I know some of these geologic parameters. I can then adjust several of the completion parameters and figure out a way to optimize this.
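The profiler's "what if" questioning boils down to scoring the model at chosen factor settings. Here is a rough sketch of that idea, with hypothetical factor names and made-up data standing in for the well dataset:

```python
# "What if" analog of the profiler: fix most factors, sweep one, and watch
# the prediction respond. Factor names and data are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
wells = pd.DataFrame({
    "latitude": rng.uniform(32.0, 33.5, 1000),
    "longitude": rng.uniform(-98.5, -97.0, 1000),
    "depth": rng.uniform(5000, 9000, 1000),
})
wells["production"] = (0.002 * wells["depth"]
                       - 3 * (wells["latitude"] - 32.8).abs()
                       + rng.normal(0, 1, 1000))

features = ["latitude", "longitude", "depth"]
model = GradientBoostingRegressor(random_state=1).fit(wells[features], wells["production"])

# One scenario: a well at a known location, sweeping depth.
scenario = pd.DataFrame({"latitude": 32.8, "longitude": -97.6,
                         "depth": np.linspace(5000, 9000, 5)})
scenario["predicted production"] = model.predict(scenario[features])
print(scenario)
```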

But with a lot of factors, this can be complex. Let's talk about the second question, where we were wondering which of these factors has the biggest influence. You can see, based on which of these lines are flatter or have more shape to them, what has the biggest influence. But let's let JMP do that for us. Under Assess Variable Importance, I'm going to just let JMP go through and pick the factors that are most important. Here you can see it has ordered them from the most important down to the ones that are less important.

I like this feature: Colorize Profiler. Now it has highlighted the most important factors, shading down to the least important ones. Again, I can adjust these and see that, for example, adjusting the depth of the well and adding some more [inaudible 00:15:51] might improve my production.
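Assess Variable Importance uses JMP's own resampling scheme; a related and widely used idea is permutation importance, sketched below in scikit-learn on synthetic data. Shuffle one input at a time and measure how much the model's accuracy degrades.

```python
# Permutation importance: a cousin of Assess Variable Importance. Shuffle one
# input at a time; the bigger the drop in accuracy, the more the model relies
# on that input. (Synthetic data; illustration only.)
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=6, n_informative=3, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = GradientBoostingRegressor(random_state=1).fit(X_train, y_train)
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=1)
for j in result.importances_mean.argsort()[::-1]:
    print(f"factor {j}: importance {result.importances_mean[j]:.3f}")
```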

That is one way we could do this, but we have a new way of looking at the impact of each one of these factors on a certain well. We can launch that under the red triangle in JMP Pro 17: Shapley values. I can set my options and save out the Shapley values. Once I do that, it will create new columns in my data table that hold the contributions from each one of those factors. This is where I'm going to let Laura talk to Shapley values.

I'm going to talk briefly about what Shapley values are and how we use them. Shapley values are a model-agnostic method for explaining model predictions, and they are really helpful for black-box models that are hard to interpret or explain. The method comes from cooperative game theory. I don't have time to talk about the background or the math behind the computations, but we have a reference at the bottom of the slide, and if you Google it, you should be able to find a lot of references to learn more if you're interested.

What these Shapley values do is tell you how much each input variable contributes to an individual prediction from a model, that is, how far it moves that prediction away from the average predicted value across the input dataset, and your input dataset is going to come from your training data. Shapley values are additive, which makes them really nice and easy to interpret and understand. Every prediction can be written as the sum of your Shapley values plus that average predicted value, which we refer to as the Shapley intercept in JMP.
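In symbols, that additivity property reads as follows, writing f-hat for the model, phi_j(x) for the Shapley value of input j at prediction point x, and phi_0 for the Shapley intercept, the average prediction over the n training rows:

```latex
\hat{f}(x) = \phi_0 + \sum_{j=1}^{p} \phi_j(x),
\qquad
\phi_0 = \frac{1}{n} \sum_{i=1}^{n} \hat{f}(x_i)
```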

They can be computationally intensive to compute if you have a lot of input variables in your model, or if you're trying to create Shapley values for a lot of predictions, so we try to give some options in JMP for helping to reduce that time.

These Shapley values, as Peter mentioned, were added to the prediction profiler for quite a few of the models in JMP Pro 17, and they're also available in the profiler under the Graph menu. They're available for Fit Least Squares, Nominal Logistic, Ordinal Logistic, Neural, Gen Reg, Partition, Bootstrap Forest, and Boosted Tree. They're also available if you have the XGBoost add-in, except in that add-in they're available from the model menu and not from the prediction profiler.
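As a rough outside-of-JMP illustration of the same idea, the open-source shap package computes these quantities for tree-based models. The sketch below fits a boosted tree on synthetic data and verifies the additivity Laura described; it illustrates the concept, not JMP Pro's implementation.

```python
# Shapley values with the open-source shap package (concept illustration,
# not JMP Pro's implementation). Checks additivity: intercept + values = prediction.
import numpy as np
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=500, n_features=5, random_state=1)
model = GradientBoostingRegressor(random_state=1).fit(X, y)

explainer = shap.TreeExplainer(model)
phi = explainer.shap_values(X)                       # one row of contributions per prediction
base = float(np.ravel(explainer.expected_value)[0])  # average prediction: the "Shapley intercept"

pred = model.predict(X)
assert np.allclose(base + phi.sum(axis=1), pred)
print(f"prediction {pred[0]:.2f} = intercept {base:.2f} + contributions {phi[0].sum():.2f}")
```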

Okay, next slide. In this slide I want to look at some of the predictions from Peter's model. This is from a model using five input variables. These are stacked bar charts of the first three predictions, coming from the first three rows of his data.

On the left you see a stacked bar chart of the Shapley values for the first row. That first prediction is 11.184 production barrels, in hundreds of thousands. Each color in that bar graph is divided out by the different input variables, and inside the bars are the Shapley values. If you add up all of those values, plus the Shapley intercept that I have in the middle of the graph, you get that prediction value. This shows you, first of all, that all of these variables are making positive contributions to the production, and it shows you how much. From the sizes, I can see that longitude and proppant are contributing the most for this particular prediction.

Then if I look at the right side, at the third prediction, which is 2.916 production barrels, in hundreds of thousands, I can see that two of my input variables are contributing positively to my production and three of them, the bottom three here, are having negative contributions. You can use graphs like this to help visualize your Shapley values, and that helps you really understand these individual predictions.
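Once the Shapley values are saved out as columns, a chart like this can be rebuilt in any plotting tool. Here is a matplotlib sketch that stacks positive contributions above zero and negative ones below, reusing `phi` from the snippet above, with hypothetical factor names:

```python
# Stacked bar chart of Shapley values for the first three predictions,
# positives stacked above zero, negatives below (matplotlib sketch; `phi`
# comes from the previous snippet).
import matplotlib.pyplot as plt

rows = [0, 1, 2]
names = [f"factor {j}" for j in range(phi.shape[1])]

fig, ax = plt.subplots()
for k, i in enumerate(rows):
    top, bottom = 0.0, 0.0
    for j, name in enumerate(names):
        v = phi[i, j]
        ax.bar(k, v, bottom=top if v >= 0 else bottom, label=name if k == 0 else None)
        if v >= 0:
            top += v
        else:
            bottom += v
ax.axhline(0, color="black", linewidth=0.5)
ax.set_xticks(range(len(rows)), [f"row {i + 1}" for i in rows])
ax.set_ylabel("Shapley value (contribution to prediction)")
ax.legend()
plt.show()
```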

Next slide. This is just one of many types of graphs you can create. The Shapley values get saved into your data table, so you can manipulate them and create all kinds of graphs in Graph Builder in JMP. This graph shows all the Shapley values from over 5,000 rows of the data, split out by each input variable. It gives you an idea of the contributions of those variables, both positive and negative, to the predictions.

Now I'm going to hand it back over to Peter.

Great. Thanks, Laura. I think now we'll go ahead and transition to our second case study, which Florian is going to do. Should I pass the screen share back to you?

Yeah, that would be great. Make the transition. Thanks for this first case study and thanks for the contributions. Really interesting. I hope we can bring some light onto a different kind of application with our second case study. I have given it the subtitle Was it Torque?, because that's a question we'll hopefully have answered by the end of this second case study presentation.

This second case study is about predictive maintenance and the particular reasons why it is important to understand your models in this scenario. Most likely everybody can see that it's very important to have a sense for when machines require maintenance, because if machines fail, that's a lot of trouble and a lot of cost when plants have to shut down and so on. It's really necessary to do preventative maintenance to keep systems running. A major task in this is to determine when the maintenance should be performed, not too early, not too late, certainly. Therefore, the task is to find a balance which limits failures and also saves costs on maintenance.

There's a case study that we're using to highlight some functions and features, and it's actually a synthetic dataset which comes from a published study. The source is down there at the bottom; you can find it. It was published at the AI for Industry event in 2020. The basic content of this dataset is that it has six different features of process settings: product type, which denotes different quality variants, then air temperature, process temperature, rotational speed, torque, and tool wear.

We have one main response, and that is whether the machine fails or not.

When we think of questions that we could answer with a model, or models, or generally with data, there are several that come to mind. The most obvious question is probably how we can explain and interpret settings which likely lead to machine failure. This is something that [inaudible 00:24:38] to create and compare multiple models and then choose the one that's most suitable.

Now, in this particular setting, where we want to predict whether a machine fails or not, we also have to account for misclassifications, that is, either a false positive or a false negative prediction. With JMP's decision threshold graphs and the profit matrix, we can actually specify an emphasis, or importance, on which outcome is less desirable. For example, it is typically less desirable to actually have a failure when the model didn't predict one, compared to the opposite misclassification.

Then besides the binary classification, of course, you'd also be interested in understanding what typically drives failure. There are certainly several ways to deal with this question. I think visualization is always a part of it. But when we're using models, we can consider, for example, self-explaining models like decision trees, or we can use built-in functionality like the prediction profiler and the variable importance feature.

The last point here: when we investigate and rate which factors are most important for the predicted outcome, we assume that there is an underlying behavior, that the most important factor is XYZ, but we do not know which factor actually contributed to what extent to an individual prediction. Again, Shapley values are a very helpful addition that can allow us to understand the contribution of each of the factors to an individual prediction.

That's the general level; now let's take a look into three specific questions and how we can answer those with the software.

The first one is: how do we adjust a predictive model with respect to the high importance of avoiding false negative predictions? This assumes a little bit that we've already done a first step, because we've already seen model screening and how we can get there, so I'm starting one step ahead. Let's move into JMP to actually take a look at this.

We see the dataset; it's fairly small, not too many columns. It looks very simple. We only have these few predictors and some more columns. There's also a validation column that I've added, but it's not shown here.

As for the first question, let's assume we have already done the model screening. Again, this is accessible under Analyze > Predictive Modeling > Model Screening, where we specify what we want to predict and the factors that we want to investigate. I have already prepared this, and we have an outcome that looks like this. It looks a little different than in the first use case, because now we have a binary outcome, and so we have some different measures that we can use to compare. But again, what's important is that we have an overview of which of the methods are performing better than the others.

As we said, in order to now improve the model and put emphasis on avoiding these false negative predictions, let's just pick one and see what we can do here. Let's maybe even pick the first three, which we can do by holding the Control key. Another feature that will help us here is called Decision Threshold, and it's located in the red triangle menu.

The decision threshold report gives us several pieces of output. We have these graphs here, which show the actual data points; we have this confusion matrix; and we have some additional graphs and metrics, but we will focus on the upper part here. Let's actually take a look at the test portion of the set. When we look at this, we can see that we have different types of outcomes. The default probability threshold is in the middle, which would be here at .5. We now have several options to see and optimize how effective this model is with respect to the confusion matrix.

In the confusion matrix, we can see the predicted value and whether that actually was true or not. If we look at where no failure is predicted, we can see that here, with this setting, we actually have quite a high number of failures, even though none were predicted. Now we can interactively explore how adjusting this threshold affects the accuracy of the model or the misclassification rates.

Or, in some cases, we can put an emphasis on which misclassification is really worse than the other. We can do this with the so-called profit matrix. If we go here, we can set a value for which of the misclassifications is actually worse than the other one. In this case, we really do not want to have a prediction of no failure when there actually is a failure happening, so we would put something like 20 times more weight on not getting this misclassification. We set it and hit OK, and then it will automatically update the graph. We can see that the values for that misclassification have dropped now in each of the models, and we can use this as an additional tool to select the model that's most appropriate.

That's the first question, how we can adjust a predictive model with respect to the high importance of avoiding false negative predictions. Now, another question, when we think of maintenance and where we put our maintenance efforts, is: how can we identify and communicate the overall importance of predictors? What factors are driving the system, the failures?

Let's go back to the data table. To start, I personally like visual and simple approaches. One that I like to use is the parallel plot, because it gives a really nice overview summarizing where the failures group, at which parameter settings, and so on. On the modeling and machine learning side, there are a few other options we can use. One that I like, because it's very crisp and clear, is predictor screening. Predictor screening gives us a very compact output about what is important, and it's very easy to do. It's under Analyze > Screening > Predictor Screening. All we need to do is say what we want to understand and then specify the parameters that we want to use for this. Click OK, and then it calculates, and we have this output. For me, it's a really nice thing because, as I said, it's crisp, clear, and consumable.
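As Russ notes a bit later, Predictor Screening essentially runs a bootstrap forest behind the scenes and ranks the inputs by their contribution to the model. Here is a rough scikit-learn equivalent, with the case study's feature names attached to synthetic data for illustration:

```python
# Predictor Screening analog: fit a random ("bootstrap") forest and rank the
# inputs by importance. Feature names from the case study; data are synthetic.
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

names = ["product type", "air temperature", "process temperature",
         "rotational speed", "torque", "tool wear"]
X, y = make_classification(n_samples=10000, n_features=6, n_informative=3,
                           random_state=1)

forest = RandomForestClassifier(n_estimators=500, random_state=1).fit(X, y)
ranking = pd.Series(forest.feature_importances_, index=names).sort_values(ascending=False)
print(ranking)
```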

But we've talked about this before, and Russ, when we're working with models particularly, do you have any other suggestions, or anything to add to my approach to understanding which predictors are important?

Yes, it is a good thing to try. As I mentioned earlier, you've got to be really careful about overfitting. I tend to work with a lot of these wide problems, say from genomics and other applications, where you might even have many more predictors than you have observations. In such a case, if you were to run predictor screening, say maybe pick the top 10, and then turn right around and fit a new model with those 10 only, you've actually just set yourself up for overfitting if you did the predictor screening on the entire dataset. That's the scenario I'm concerned about. It's an easy trap to fall into, because you think you're just filtering things down, but you've used the same data twice. The danger is that if you then apply that model to some new data, it likely won't do nearly as well.

If you're in the game where you want to reduce predictors, I prefer to do it within each fold of a k-fold. The drawback of that is you'll get a different set every time, but you can aggregate those results: if you've got a certain predictor that's showing up consistently across folds, that's very good evidence that it's an important one. I expect that's what would happen in this case with, say, torque. Even if you were to do this exercise, say, 10 times with 10 different folds, you'd likely get a pretty similar ranking. It's more of a subtlety, but again, a danger that you have to watch out for. JMP can make it a little easier to fall into that trap if you're not careful, just because things are so quick and clean, like you mentioned.

Yeah, that's a very valuable addition to this approach. To accompany this, there's also the other option that we have, particularly when we have already gone through the process of creating a model: we can then again use the prediction profiler and the variable importance feature. It's another way to assess which of the variables have the higher importance. Russ, do you want to say a word on that as well, maybe in contrast to the predictor screening?

Yeah. Honestly, Florian, I like the variable importance route a little better: just dive right into the modeling, again, I would prefer with k-fold, and then you can use the variable importance measures directly, which are often really informative. They're very similar; in fact, predictor screening, I believe, is just calling Bootstrap Forest in the background and collecting the most important variables, so it's basically the same thing. Then follow up with the profiler, which can be excellent for seeing exactly how certain variables are marginally affecting the response, and then drill down even further with Shapley values, to be able to break individual predictions down into their components. To me, it's a very compelling and interesting way to dive into a predictive model and understand what's really going on with it, kind of unpacking the black box and letting you see what's really happening.

Yeah, thanks. I think that's the whole point, making it understandable and making it consumable, besides, of course, actually getting to the results, which is understanding which factors are influencing the outcome. Thanks.

Now, I have one more question, and you've already mentioned it. When we score new data in particular, what can we do to identify which predictors have actually influenced the model outcome? With what we have done so far, we have gained a good understanding of the system, we know which of the factors are the most dominant, and we can even derive operating ranges. But if the system changes, what if a different factor actually drives a failure?

Then, as would be expected in this case, and we talked to Laura beforehand, Shapley values again are a great addition that will help us interpret. We've seen how we can generate them, and you've learned which platforms they're available in. The output that you get when you save out Shapley values includes, for example, a graph that shows the contributions per row. In this case, we have 10,000 rows in the data table, so we have 10,000 stacked bar charts, and we can already see that, besides the common pattern, there are also times when other influencing factors drive the decision of the model. It's really a helpful tool to not only rate an individual prediction, but also, adding on to what Russ just said, to build understanding of the system and which factors contribute.

Moving a little further along this explainability path, we can use these Shapley values in different ways. What I personally liked was the suggestion to plot the Shapley values against their actual parameter settings, because that allows us to identify areas of settings. For example, if we take rotational speed here, we can see that there are areas of this parameter that tend to contribute a lot, both in terms of the model outcome and in terms of the actual failures. That also helps us gain more understanding with respect to the actual problem of machine failure and what's causing it, and also with respect to why the model predicts something.
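That kind of plot is simple to reproduce once the Shapley values are saved out: scatter each row's Shapley value for a factor against the row's actual setting of that factor. Here's a matplotlib sketch, reusing `X` and `phi` from the earlier shap snippet, with one column standing in for rotational speed:

```python
# Shapley value vs. actual setting for one factor: regions of the setting
# that push predictions up or down stand out immediately. (`X` and `phi`
# come from the earlier shap snippet; column 3 stands in for rotational speed.)
import matplotlib.pyplot as plt

j = 3
plt.scatter(X[:, j], phi[:, j], s=8, alpha=0.5)
plt.axhline(0, color="black", linewidth=0.5)
plt.xlabel("rotational speed (actual setting)")
plt.ylabel("Shapley value for rotational speed")
plt.show()
```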

Now, finally, I'd like to answer the question. When we take these graphs of Shapley values, and we have seen it before on several occasions, torque is certainly a dominant factor. But from all of these, I've just picked a few predictions, and we can see that sometimes it's torque, sometimes it's not. With the Shapley values, we really have a great way of interpreting a specific prediction by the model.

All right, those were the things we wanted to show. I hope this gives some good insight into how we can make AI models more explainable, more understandable, and easier to digest and work with, because that's the intention here.

Yeah, I'd like to summarize a little bit. Pete, maybe you want to come in and help me here.

I think what we're hoping to show is that as these AI models become more and more prevalent and are relied upon for decision making, understanding, interpreting, and being able to communicate those models is very important. We hope that with these Shapley values, with the variable importance, and with the profiler, we've shown you a couple of ways that you can share those results and have them be easily understandable. That was the take-home there, between that and being able to utilize model screening and things like that: hopefully you found a few techniques that will make this more understandable and less of a black box.

Yeah, I absolutely agree. Just to summarize, I'd really like to thank Russ and Laura for contributing here with your expertise. Thanks, Pete.

It was a pleasure.

Thanks, everybody, for listening. We're looking forward to having discussions and questions to answer.