Euthanasia in Animal Shelters: Why are felines and canines being put down? (2022-US-EPO-1136)

Shalika Siddique, Graduate Student, Oklahoma State University
Anand Manivannan, Graduate Student, Oklahoma State University

 

We lose approximately 920,000 shelter animals to euthanasia every year. Instead, these animals could have made 920,000 families happier. We would like to explore the current data from Austin's animal center to understand what conditions lead to a euthanasia request and whether measures can be adopted to prevent them. The data is sourced from Austin's Open Data Portal and consists of two tables, intakes and outcomes, dating from Oct 1, 2013, to the present. Intakes represent the status of animals as they arrive at the animal center, while outcomes represent the status of animals as they leave. Each animal is identified by a unique Animal ID. Each table consists of 136K data points and 12 features. We first explore the distribution of the data by categories such as breed, gender, age, and intake condition. Finally, classification models such as logistic regression and random forest classifiers are used to predict whether an animal will be euthanized. Understanding key factors such as intake condition, sub-type of euthanasia, breed, and age could reveal why these animals are put down and consequently advise on where to target funding for research and facilities.

 

 

Hi, my name is Shalika Siddique.

My name is Anand Manivannan,

and we're both students from Oklahoma State University,

and we are currently pursuing a business analytics and data science degree.

Today we are presenting a poster where we explore euthanasia

in animal shelters,

and we hope to understand why cats and dogs are being put down.

Every year, we lose about 920,000 shelter animals to euthanasia.

Using JMP Pro, we would like to identify the key factors

that lead to euthanization of cats and dogs.

Once we identify these key factors,

funds can be channeled to relevant sectors

to prevent euthanization of animals that could have been saved.

In addition to this, we aim to make predictions to identify

which animals are most likely to be euthanized.

Here is a little information about our data set.

We sourced the data from Austin's data portal,

and the animal shelter that we used for analysis is located

in Austin, Texas.

Overall, we had about 130,000 records.

After cleaning and filtering, we focused on about 67,000 records

that were specific to cats and dogs.

Prior to our analysis, we explored the data set

and attempted to derive insights.

We used JMP's Graph Builder to create visualizations

such as bar graphs.

Of the 67,000 records, about 3,171 were animals

that were euthanized,

which is about 4.7% of the animals in the shelter.
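As a quick check of the figures quoted above, the euthanasia rate can be recomputed from the two counts. This is a minimal sketch; both counts are the rounded "about" values from the talk, and the variable names are illustrative only.

```python
# Counts as quoted in the talk (both are approximate, rounded figures).
total_records = 67_000   # cat and dog records after cleaning and filtering
euthanized = 3_171       # records with a euthanasia outcome

rate = euthanized / total_records
not_euthanized = total_records - euthanized

print(f"Euthanasia rate: {rate:.1%}")          # share of records that were euthanized
print(f"Other outcomes:  {not_euthanized:,}")  # remaining records
```

The remaining count of roughly 64,000 records is the majority class that the modeling section later describes as a class imbalance.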

In comparison to animals surrendered to the shelter by the owner,

stray animals were most prone to euthanasia.

When we compared the age of the animals,

we noticed that kittens under the age of 15 months

contributed to 25% of euthanasia,

while pups contributed to 13% of euthanizations.

This bar graph here

is an example of one of the visualizations

created using JMP's Graph Builder.

The lavender bar represents cats,

while the purple bar represents dogs.

We can see that intact males, followed by intact females,

are more prone to euthanization

compared to neutered animals.

Next, Anand will go over the modeling in detail.

Thank you, Shalika.

Yes, I'd like to talk a bit more about our approach towards

modeling using JMP.

Before we could start modeling,

we performed a few data preprocessing steps to prepare our data.

We did things like standardizing units

for certain variables,

such as age, which was in weeks, months and years.

We converted that to just months.

We binned the age variable so we could convert it into

a categorical variable.

It looked like age ranges such as 10-15 and 15-25.

We grouped rare breeds and colors to reduce

the number of categories.

Additionally, we also filtered just cats and dogs

from all the other animals that went through the shelter.
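The preprocessing steps described above were done in JMP. As a rough illustration only, the same four steps (filtering to cats and dogs, converting ages to months, binning ages, grouping rare categories) could be sketched in plain Python. The field names, the sample records, the bin edge at 0, the weeks-to-months factor, and the `min_count` cutoff are all assumptions made for this example; only the "10-15" and "15-25" bin labels come from the talk.

```python
from collections import Counter

# Assumed conversion factors; the talk only says that weeks, months and
# years were unified into months.
MONTHS_PER_UNIT = {"week": 12 / 52, "month": 1.0, "year": 12.0}

def age_to_months(age_text):
    """Convert a string such as '2 years' or '8 weeks' to months."""
    value, unit = age_text.split()
    unit = unit.lower().rstrip("s")  # 'years' -> 'year'
    return float(value) * MONTHS_PER_UNIT[unit]

def bin_age(months, edges=(0, 10, 15, 25)):
    """Return a label such as '10-15' for the half-open bin containing `months`."""
    for low, high in zip(edges, edges[1:]):
        if low <= months < high:
            return f"{low}-{high}"
    return f"{edges[-1]}+"

def group_rare(values, min_count=2, other="Other"):
    """Replace categories seen fewer than `min_count` times with `other`."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]

# Hypothetical records, not taken from the Austin data.
records = [
    {"animal_type": "Dog", "age": "2 years", "breed": "Pit Bull Mix"},
    {"animal_type": "Cat", "age": "3 months", "breed": "Domestic Shorthair Mix"},
    {"animal_type": "Bird", "age": "1 year", "breed": "Parakeet"},
    {"animal_type": "Dog", "age": "12 months", "breed": "Pit Bull Mix"},
    {"animal_type": "Cat", "age": "8 weeks", "breed": "Siamese"},
]

# Keep only cats and dogs, then derive the cleaned fields.
cats_and_dogs = [r for r in records if r["animal_type"] in {"Cat", "Dog"}]
grouped_breeds = group_rare([r["breed"] for r in cats_and_dogs])
for record, breed in zip(cats_and_dogs, grouped_breeds):
    record["age_months"] = age_to_months(record["age"])
    record["age_bin"] = bin_age(record["age_months"])
    record["breed_grouped"] = breed
```

Grouping rare levels before modeling keeps the number of categories manageable, which is the motivation the speakers give for this step.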

During the modeling phase, we noticed something very peculiar.

We noticed a class imbalance in our target variable,

which recorded whether an animal was adopted

or whether an animal was euthanized.

About 64,000 records out of 67,000 records were adopted

animals, and only about 3,000 were euthanized animals.

Since our model was to focus on predicting euthanasia,

we had to resolve this issue, and hence we used JMP's Bootstrap Forest model

and Boosted Forest.

These use the concepts of bagging and boosting to do this.

Since bagging and boosting models don't really

give a lot of room for interpretation

in terms of what the variables do,

we used logistic regression to interpret these variables as well.

After modeling,

we tuned the parameters to get the best results.

We picked a few metrics

to choose the best model based on its performance on validation data.

We used a 70:30 validation split,

and prior to modeling, we also tested the assumptions

for logistic regression.

Over to the top right, you can see that we tested

for multicollinearity and independence among variables

using JMP's contingency analysis,

which produced a mosaic plot and gave us a p-value and a correlation value

that basically told us which variables were correlated with each other.
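To illustrate the kind of test behind a contingency analysis of two categorical variables, here is a minimal sketch of a Pearson chi-square test of independence. It is not JMP's implementation. The counts are made up, and Cramér's V is used here only as one common stand-in for the "correlation value" mentioned above, which the talk does not define.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic, degrees of freedom and total count
    for a contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof, n

def cramers_v(stat, n, n_rows, n_cols):
    """Association strength between 0 (independent) and 1 (fully associated)."""
    return math.sqrt(stat / (n * (min(n_rows, n_cols) - 1)))

# Made-up counts. Rows: intake type (stray, owner surrender);
# columns: intake condition (normal, injured or sick).
table = [[30, 10],
         [20, 40]]

stat, dof, n = chi_square(table)
v = cramers_v(stat, n, len(table), len(table[0]))

# For a 2x2 table (one degree of freedom) the p-value has a closed form.
p_value = math.erfc(math.sqrt(stat / 2))

print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p_value:.5f}, Cramér's V = {v:.2f}")
```

A small p-value together with a large V would flag a pair of predictors as strongly associated, which is the concern when checking for multicollinearity among categorical inputs.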

Now I'd like to dig a bit deeper into each model

and how we selected our models.

Over to the top left,

you can see that we chose metrics like specificity,

misclassification rate, area under the curve, and R-square

to choose which model performed the best.

These metrics were chosen for a particular reason

that aligned with our goal.

Our goal was to predict which animals would be euthanized.

The cost of our model

incorrectly predicting a euthanized animal

as a non-euthanized animal would mean that the animal

would probably die and not be saved.

Hence we wanted to focus on increasing the accuracy

for euthanized animals and reducing the misclassifications.

Hence these particular metrics were chosen.
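The metrics named above can all be derived from the four cells of a two-class confusion matrix. The sketch below uses made-up labels and assumes "adopted" is treated as the positive class, so that specificity measures how often euthanized animals are correctly identified; the talk does not state which class JMP treated as positive.

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one choice of positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1   # predicted positive, actually negative
        elif a == positive:
            fn += 1   # predicted negative, actually positive
        else:
            tn += 1
    return tp, fp, tn, fn

def summarize(actual, predicted, positive):
    """Sensitivity, specificity and misclassification rate."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "misclassification_rate": (fp + fn) / (tp + fp + tn + fn),
    }

# Made-up example: 8 adopted and 2 euthanized animals.
actual = ["adopted"] * 8 + ["euthanized"] * 2
predicted = ["adopted"] * 7 + ["euthanized"] + ["euthanized", "adopted"]

# Assumption: "adopted" is the positive class.
metrics = summarize(actual, predicted, positive="adopted")
print(metrics)
```

With an imbalanced target, the overall misclassification rate can look good even when half of the euthanized animals are missed, which is why a class-specific metric is tracked alongside it.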

First we ran the nominal logistic regression model,

which you can see over to the bottom left.

The LogWorth immediately told us which variables

were the most important in predicting euthanasia.

It turns out they were sex of the animal, intake condition,

intake type, and outcome age.

A lot of these are not surprising,

and they matched what research shows.

The whole model turned out to be significant as well,

with a p-value of less than 0.001.

Following that, we ran the Bootstrap Forest model,

which was tuned to have a hundred trees and a feature selection

criterion value of three.

We used the receiver operating characteristic (ROC) curve

to determine which classification threshold

gave us the best classification results.

We ended up using 0.1, or 10%, as our classification threshold.
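To show what lowering the classification threshold does, here is a minimal sketch that labels an animal as "euthanized" whenever its predicted probability reaches the threshold. The probabilities are made up, and treating a probability exactly equal to the threshold as "euthanized" is an assumption.

```python
def classify(probabilities, threshold):
    """Label each predicted euthanasia probability using the given threshold."""
    return ["euthanized" if p >= threshold else "adopted" for p in probabilities]

# Made-up predicted probabilities of euthanasia for five animals.
probs = [0.02, 0.08, 0.12, 0.45, 0.60]

at_default = classify(probs, threshold=0.5)  # a conventional 50% cutoff
at_lowered = classify(probs, threshold=0.1)  # the 10% cutoff quoted in the talk

print(at_default.count("euthanized"), "flagged at 0.5")
print(at_lowered.count("euthanized"), "flagged at 0.1")
```

A lower threshold flags more animals as at risk, trading extra false alarms for fewer missed euthanasia cases, which fits the cost argument made earlier in the talk.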

Over to the right,

you can see that we ran the Boosted Forest model

with parameters of 87 layers and a learning rate of 0.179.

Over at the bottom,

we used the decision matrix for all three models to calculate

the specificity of each particular model,

which tells us how accurately the euthanized animals

were being predicted.

We also used the misclassification rate and R-square

from the overall statistics tab of JMP.

In every metric, we found that our Bootstrap Forest model

outperformed the other models,

and hence we chose that as the winning model to make

predictions on euthanasia.

Next, I would like to go over

some important results that logistic regression gave us.

With regards to sex,

we found that intact cats and intact dogs

were way more likely to be euthanized than

neutered or spayed animals.

With regards to breed, we found that mixed cat breeds

and Pit Bull mixed dog breeds were more likely to be euthanized

than all other breeds.

With regards to age, we found that cats that are 4.5

to 6 years old are more likely to be euthanized than younger cats.

Dogs under 1.2 years are the least likely to be euthanized.

This was wildly surprising because it's contradictory

to what we found during our data exploration phase.

Similarly, with regard to intake type,

we found that owner-surrendered animals

are twice as likely to be euthanized as stray animals.

This is, again, completely contradictory to what we found

in the data exploration phase.

That goes to show the power of statistical

analysis in unveiling true facts.
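Logistic regression results are usually reported as odds ratios, so a statement like "twice as likely" may correspond to an odds ratio of about 2. That reading is an assumption, as the talk does not give the coefficients. The sketch below, with a made-up baseline probability, shows how an odds ratio of 2 translates into a change in probability.

```python
import math

def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability after multiplying the baseline odds by `odds_ratio`."""
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)

# Assumption: an odds ratio of 2, i.e. a logistic coefficient of ln(2).
coefficient = math.log(2)
odds_ratio = math.exp(coefficient)

# Made-up baseline: a 5% euthanasia probability for the reference group.
baseline = 0.05
shifted = apply_odds_ratio(baseline, odds_ratio)

print(f"odds ratio = {odds_ratio:.2f}")
print(f"probability moves from {baseline:.1%} to {shifted:.1%}")
```

For rare outcomes, doubling the odds nearly doubles the probability, but the two are not identical, and the gap widens as the baseline probability grows.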

Next, I will be handing it off to Shalika again

to go over what recommendations we can make to these animal shelters.

Thank you, Anand.

Based on our analysis, we have a few recommendations

that animal shelters could use to lower euthanizations.

We believe that animals taken into the shelter

should be neutered or spayed.

This is in accordance with medical research,

which shows that intact animals are more prone to diseases.

Animal shelters could also use our Bootstrap Forest model

to prioritize which animals need to be saved

in case a difficult decision needs to be made.

In support of that,

here are some recommendations for Austin's animal shelter.

This particular shelter would need to prioritize cats over dogs,

as they are more prone to euthanization.

With regards to age, cats aged between 4.5 and 6 years

and dogs over 1.2 years would require more attention.

Owner-surrendered dogs need to be prioritized over stray animals.

Finally, when it comes to breeds,

Pit Bull mix dog breeds and mixed cat breeds

are more prone to euthanization and would likely require more attention.

That brings us to the end of our presentation.

We hope that animal shelters can use this analysis

to reduce the need for an animal to be euthanized.

Thank you.