Species and Sex Classification in the Fisher Using the Footprint Identification Technique (FIT) - (2023-US-PO-1497)

Caleb King, Research Statistician Tester, JMP
Jody Tucker, Deputy Program Manager and Biological Scientist, U. S. Forest Service
Ryan Lekivetz, Manager, Advanced Analytics R&D, JMP
Zoë Jewell, WildTrack Co-Founder, WildTrack
Sky Alibhai, WildTrack Co-Founder, WildTrack

 

The federally endangered southern Sierra Nevada fisher (Pekania pennanti) is spread at low density across a large and rugged landscape, with approximately 300 individuals across a 12,000 km² area. Its vulnerability has been further amplified by periods of severe drought and extensive wildfires in the region.

 

Identifying and preserving female reproductive habitat has been outlined as the most important demographic feature for sustaining and increasing the population. In this presentation we describe a customized, cost-effective and non-invasive Footprint Identification Technique (FIT) in JMP software developed by WildTrack to identify both species and sex using footprints collected at track stations. 

 

We created a data set of known fisher and Pacific marten (Martes caurina) footprint images and known-sex fisher images. To automate the feature extraction in JMP, we developed a customized script to generate distances, angles, and areas using landmark points on the footprint images. Using a single variable, we had a species classification accuracy of over 99%. For fisher sex classification, using a more parsimonious model with just two variables selected in LDA, we achieved accuracies of 94.0% for the training set and 89.4% for the test set. We discuss the merits of this technique to help with the conservation efforts for this species.

 

 

Hello. My name is Caleb King.

I'm a senior developer in the Design of Experiments and Reliability group

here at JMP Statistical Discovery.

Today I have the privilege of telling you

about a very interesting project that I was able to be a part of

concerning classification of species and sex

within a small mammal group called fishers

using the Footprint Identification Technique.

Fishers, I'll give you a quick image here,

so here's an example of a fisher.

To me it looks like a bit of a weasel or ferret-type animal.

I know that's definitely not the same species,

but they're a small mammal,

and we're particularly interested in fishers located in the Sierra Nevada,

as those are a federally endangered species.

Specifically, we'd like to be able to identify

the presence of females,

as a larger number of females indicates a very healthy population.

They're also vital to helping develop effective conservation strategies.

Now, the way we intend to do that

is to use what's called the Footprint Identification Technique, or FIT.

This has been made popular through WildTrack,

and is a non-invasive method for identifying individuals

based on images of their tracks.

This is especially helpful

since you may not be able

to actually see a fisher in the wild or capture them,

but their tracks are everywhere, so that should be helpful to identify them.

Using JMP, we were able to create a technique

to distinguish fishers from a nearby species

known as Pacific martens,

as well as distinguish sexes within species.

The way this works is we started with a data set

of around 160-something marten tracks

and well over 300 fisher tracks, the fishers consisting of about 34 males and 27 females.

What they would then do is, as you can see here on the track image,

identify seven landmark points, as we call them,

and then from those,

we could then compute well over 120-something features

consisting of lengths, distances, angles, and areas.
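
As a rough illustration of how landmark points turn into lengths, angles, and areas, here is a minimal Python sketch. It is not the JMP script used in the project: the landmark coordinates are invented, the feature names (`dist_1_2`, `angle_1_2_3`, `area_all`) are hypothetical and unrelated to the V15/V16 naming in the talk, and it produces far fewer than the 120-plus features mentioned above.

```python
import math
from itertools import combinations


def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def angle_deg(a, vertex, b):
    """Unsigned angle at `vertex` between the rays to `a` and `b`, in degrees."""
    v1 = (a[0] - vertex[0], a[1] - vertex[1])
    v2 = (b[0] - vertex[0], b[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))


def polygon_area(points):
    """Area of a simple polygon (shoelace formula); points listed in outline order."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1] - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def extract_features(landmarks):
    """Build a small feature dictionary from an ordered list of landmark points."""
    features = {}
    # Every pairwise distance between landmarks (21 distances for 7 points).
    for (i, p), (j, q) in combinations(enumerate(landmarks, start=1), 2):
        features[f"dist_{i}_{j}"] = distance(p, q)
    # One example angle and one example area; a real script would add many more.
    features["angle_1_2_3"] = angle_deg(landmarks[0], landmarks[1], landmarks[2])
    features["area_all"] = polygon_area(landmarks)
    return features


if __name__ == "__main__":
    # Invented coordinates standing in for seven landmark points on one track image.
    landmarks = [(0, 0), (2, 0), (3, 1), (3, 3), (1, 4), (-1, 3), (-1, 1)]
    for name, value in extract_features(landmarks).items():
        print(f"{name}: {value:.3f}")
```

Each track image then becomes one row of numeric features, which is the form a classifier needs.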

What we would then do is, using those features,

feed them into a linear discriminant analysis,

which we could then use to discriminate among species

and then sex ID within species.

To help assess that fit, we split the data into 50% training,

and for the remaining 50%,

we split roughly evenly between validation and testing.
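
The 50/25/25 split described here can be sketched as follows. This is an assumption-laden illustration rather than the procedure used in JMP: the talk does not say whether rows were assigned by track or by individual, so the hypothetical `split_indices` helper simply shuffles whatever units it is given, and the seed is arbitrary.

```python
import random


def split_indices(n, seed=0):
    """Shuffle indices 0..n-1 and split them roughly 50% / 25% / 25%."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = n // 2
    n_valid = (n - n_train) // 2
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]
    return train, valid, test


if __name__ == "__main__":
    train, valid, test = split_indices(500)
    print(len(train), len(valid), len(test))
```

The training rows fit the model, the validation rows guide choices such as which variables to keep, and the test rows are held back for the final accuracy estimate.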

Prior to the modeling,

we also tried to look at the effect of track orientation,

so we would flip the left tracks horizontally to match the right,

and then also any potential bias from the observers.

These are people identifying landmark points,

so we wanted to check and make sure

that any variation there did not affect our outcomes.

Thankfully, neither the orientation nor the observer bias

had a significant effect on our outcomes.
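
Flipping a left track to match the right amounts to mirroring its landmark coordinates across the vertical center line of the image. A minimal sketch, assuming landmarks are stored as (x, y) pairs and the image spans x = 0 to `image_width` (both the function name and that coordinate convention are assumptions, not details from the talk):

```python
def mirror_track(landmarks, image_width):
    """Reflect landmark points horizontally: x becomes image_width - x, y is unchanged."""
    return [(image_width - x, y) for x, y in landmarks]


if __name__ == "__main__":
    # Invented landmark coordinates for a left-foot track in a 100-unit-wide image.
    left_track = [(10, 5), (30, 40), (55, 20)]
    print(mirror_track(left_track, image_width=100))
```

Distances and areas are unchanged by the mirror; what changes is which side of the track each landmark sits on, so left and right tracks can be compared landmark for landmark.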

What brought myself and my colleague Ryan into the project was that they had

noticed that some of the tracks, as they were classified,

seemed to have a little bit too much spread in them,

to the point that maybe there were actually multiple individuals.

The way they would collect this data

is there would be a little cage area out in the woods.

Fishers could easily go in and out,

and there was a track plate in the bottom that would capture their footprints,

and there were also little spurs that would capture a bit of their hair.

It didn't hurt the animal.

They had no idea what was going on.

What they would then do is take some samples of those hairs

and send them out for genetic testing,

which was a bit of a long and expensive process.

Now, because of the way things were sampled,

you might have a sampled hair that would identify the animal

as potentially, say, male, but what could have happened

was a male and a female might have gone in,

and you only cut hair from one of them,

so the tracks might indicate potentially multiple individuals,

whereas the genetics said there was only one.

What they wanted was a method,

a more data-driven method, if you will,

to identify potentially misclassified multiple individuals

that we could then exclude from our analysis

so that they wouldn't bias the results.

Before we actually got into that procedure,

one of the things that we would do is use JMP's Predictor Screening tool

to identify, for each response of interest,

what were some of the top predictors.

Notice, for species and sex ID here,

there's actually a lot of common features

that are able to distinguish between the two

or at least have a strong ability to help distinguish between the two,

much more so with the species than the sex.

We've shown you what these variables look like over here,

so area one is the complete shaded region.

We've got some distances, V16, V15.

You'll notice a lot of them have to do essentially with the size of the track.

We've got some big distances in there.

I'll get back to these in a second, but using those top features,

let me get back to a full screen of that.

Using some of those top features,

we would then make a plot that looks like this.

This is just plotting it by the individuals.

All the red ones here are females. All of these are males,

so already visually, you can tell why these are some of the top predictors.

Just visually, you can see those groupings,

clear groupings between the sex ID.

What we've identified with these arrows is, you'll notice, a big spread.

You've got a cluster here and here.

Got a little bit here and there, especially here and there.

This is what they were interested in, especially with the males,

because what this could be is we could have...

It could be the same male, just a lot of spread,

but that's a bit unlikely.

We could have a male and a young male,

or we could have a male and one that's actually more of a female,

but we don't really know.

They wanted a more data-driven method to say,

is this something we should be concerned about?

Is that spread too much?

What we did is, we used a control chart, which is from industrial statistics.

We thought that was actually ideally suited because control charts

are built for identifying parts that are out of spec,

and so what we did is created a control chart for, here's females and males,

and notice they each have their own limits.

This is because there are potentially multiple tracks for each individual,

so we could get a sense of their spread on an individual-by-individual basis.

You'll see, we flagged some individuals that might have too much spread.

This is an S-chart; the S stands for sigma.

We're looking at the spread, if you will.

We've got a couple of individuals where

maybe there's a bit too much spread in there,

so that could potentially mean that there might actually be multiple individuals.
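
To make the S-chart idea concrete, here is a minimal Python sketch using the textbook three-sigma limits for subgroups of unequal size, treating each individual's tracks as one subgroup. It is not necessarily the computation JMP performs, the talk does not say which feature was charted, and the measurements below are invented. The talk charts females and males separately; the hypothetical `s_chart` function handles one such group at a time.

```python
import math
from statistics import stdev


def c4(n):
    """Bias-correction constant for normal data: E[s] = c4(n) * sigma."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def s_chart(groups):
    """Compute S-chart limits per subgroup.

    groups: dict mapping an individual's ID to a list of feature values
            (one value per track, at least two tracks per individual).
    """
    stats = {name: (len(values), stdev(values)) for name, values in groups.items()}
    # Estimate the common sigma as the average of the bias-corrected subgroup SDs.
    sigma_hat = sum(s / c4(n) for n, s in stats.values()) / len(stats)
    chart = {}
    for name, (n, s) in stats.items():
        center = c4(n) * sigma_hat
        half_width = 3.0 * sigma_hat * math.sqrt(1.0 - c4(n) ** 2)
        ucl = center + half_width
        lcl = max(0.0, center - half_width)
        # Limits depend on n, so individuals with different track counts get different limits.
        chart[name] = {"s": s, "lcl": lcl, "ucl": ucl, "flag": s > ucl}
    return chart


if __name__ == "__main__":
    # Invented measurements of one feature for four hypothetical individuals.
    tracks_by_individual = {
        "A": [10.0, 10.2, 9.9, 10.1],
        "B": [11.0, 11.1, 10.9, 11.0],
        "C": [9.5, 9.6, 9.4, 9.5],
        "D": [10.0, 12.0, 8.0, 10.0],  # unusually wide spread
    }
    for name, row in s_chart(tracks_by_individual).items():
        print(name, {k: round(v, 3) if isinstance(v, float) else v for k, v in row.items()})
```

An individual whose standard deviation falls above the upper limit is a candidate for "more than one animal on this track plate" and can be reviewed or excluded.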

On that basis, we then excluded those individuals from the final analysis,

and speaking of the final analysis,

once we ran everything through the linear discriminant analysis,

what we found was, for distinguishing between species,

we only needed one feature, that is this V16 right here.

I call it the distance between the thumb and maybe the middle finger or something.

Those are not formal biological terms.

Please don't quote me on that.

But just visually, that's what I see, so that's a big distance measure.

Using just that,

we were able to successfully distinguish between species

with a 99% successful classification rate.

We missed only four out of 500 tracks, so that is an incredible result.
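
With a single feature and two classes, linear discriminant analysis reduces to comparing one linear score per class, built from the class means, a pooled variance, and the class priors. The sketch below illustrates that reduction in plain Python. It is not the JMP Discriminant platform, the function names are hypothetical, and the measurements are invented stand-ins for a distance such as V16.

```python
import math
from statistics import mean


def fit_lda_1d(values_by_class):
    """Fit one-feature LDA: class means, pooled within-class variance, and priors."""
    n_total = sum(len(values) for values in values_by_class.values())
    means = {label: mean(values) for label, values in values_by_class.items()}
    within_ss = sum(
        (x - means[label]) ** 2
        for label, values in values_by_class.items()
        for x in values
    )
    pooled_var = within_ss / (n_total - len(values_by_class))
    priors = {label: len(values) / n_total for label, values in values_by_class.items()}
    return means, pooled_var, priors


def predict_lda_1d(model, x):
    """Return the class with the largest linear discriminant score for value x."""
    means, pooled_var, priors = model

    def score(label):
        mu = means[label]
        return x * mu / pooled_var - mu ** 2 / (2.0 * pooled_var) + math.log(priors[label])

    return max(means, key=score)


if __name__ == "__main__":
    # Invented training values of a single distance feature for each species.
    training = {
        "fisher": [5.0, 5.2, 4.8, 5.1],
        "marten": [3.0, 3.1, 2.9, 3.2],
    }
    model = fit_lda_1d(training)
    for value in (4.9, 3.05):
        print(value, "->", predict_lda_1d(model, value))
```

For reference, four misses out of 500 tracks corresponds to 496/500 = 99.2% correct.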

For the sex ID within fishers,

using just these two features, V15 and V6,

which is a distance between what I call the thumb and the upper palm.

Again, not formal biological terms.

By using those two, we got a successful classification rate of around 90%,

and most of the individuals that we misclassified

were actually males misclassified as females.

In our interpretation, what that might mean is

they could have been actual females,

or maybe they could have also been young males.

In either case, both are strong indicators of family units

and thus potentially healthy growing populations.

That was our contribution to this project.

We hope it goes on to provide a significant impact

in conservation of the species.

If you have any other questions, I'll be around at Meet the Experts

and also the poster presentation session.

I'd be happy to answer them there.

Enjoy the rest of the summit.