Exploratory Data Analysis and Visualization of a Large, Online Vaccine Database using JMP Pro 16 (2022-US-45MP-1153)

Stanley Siranovich, Principal Analyst, Crucial Connection LLC


Recent events have brought about much discussion in both the popular press and the scientific literature about the safety and efficacy of some recent vaccination programs. One frequently referenced data source is the Vaccine Adverse Event Reporting System (VAERS), which now covers more than 20 years. In this session, we will demonstrate the Exploratory Data Analysis and Data Visualization capabilities of JMP Statistical Discovery software. We will begin by using the Subset, Join, and Concatenate platforms from the Tables menu, followed by Graph Builder from the Graph menu. Finally, we will make use of some Screening platforms from the Analyze menu. In general, we will show how to use JMP’s “front end” data selection and management tools, drag and drop interactive graphs, and linked analyses to speed the time to discovery for a large, complex dataset.



Good  morning. Good  afternoon

and  good  evening,  everyone,

wherever  you  are.

My  name  is  Stan  Siranovich,

and I am the principal analyst at Crucial Connection LLC,

and  I'm  doing  this  presentation

from  Jeffersonville,  Indiana,

right  across  the  river from  Louisville,  Kentucky.

Today  I'm  going  to  talk  about

how  to  do  an  exploratory data  analysis  and  visualization

of an online vaccine database

using JMP Pro 16.

That online database

is the VAERS system.

So  what  is  VAERS?

That  is  the  Vaccine  Adverse Event  Reporting  System.

It's  a  national  early  warning  system to  detect  possible  safety  problems

in US-licensed vaccines.

That  is  the  same  database that  has  been  in  the  news

for  the  last  year  or  two.

Let's  talk  about  why  it  was  developed.

First of all, it was to detect new, unusual,

or rare vaccine adverse events, monitor increases in known events,

identify potential risks,

and assess the safety of newly licensed vaccines.

Nowhere in those goals do you see anything about

making analysis of the vaccine data set easy,

which  is  why we're  doing  this  presentation.

Now, it may be structured a bit differently

than  some  of  us  are  used  to.

I  came  out  of  the  lab and  production  facilities,

and  I'm  used  to, for  lack  of  a  better  word,

chemical  scientific, rather  small  data  sets.

On  rare  occasions,

when  I  did  work  on  something  larger,

somebody  in  the  corporate  hierarchy

cleaned  all  the  data  for  me.

That  is  not  the  case  here.

It's  organized  by  year, and  there  are  three  tables  per  year.

There's VAX, there's DATA,

and there's SYMPTOMS.

What  I  did  was, the  first week  in  June,

I  downloaded  all  the  data  as  of  May  31st,

and  this  is  what  it  looks  like.

You  can  download  to  your  heart's  content.

You  do  have  to  sign  in.

Over  on  the  right  side  of  the  screen,

let's  see  what  I  got.

Now,  notice  in  years  here, 2018,  2019,  2020,

the  zip  files  are  roughly  the  same  size.

By  the  way,  this  zip  file  is  simply these  three  files  here  zipped  into  one.

That  is  normally the  best  way  to  work  on  it.

Download  zip  and  unzip  it.

But  notice  what  happens between  2020  and  2021,

which is roughly a 1,400 percent increase.

We  go  from  11.2  Meg  up  to  almost  169  Meg.

So  something's  going  on, and  we  would  like  to  take  a  look  at  it.
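A quick arithmetic check on the file sizes quoted here, as a minimal Python sketch. The two sizes are the approximate figures mentioned in the talk (the 2021 file was "almost 169 Meg"), so the result is approximate as well.

```python
# Approximate ZIP sizes quoted in the talk, in megabytes.
size_2020_mb = 11.2
size_2021_mb = 169.0  # "almost 169 Meg", so treat this as an upper estimate

ratio = size_2021_mb / size_2020_mb       # how many times larger 2021 is
percent_increase = (ratio - 1) * 100      # growth relative to the 2020 size

print(f"{ratio:.1f} times larger, roughly a {percent_increase:.0f}% increase")
```

In other words, the 2021 download is about fifteen times the size of the 2020 one.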

This  is  what  it  looked like  on  my  desktop.

Now  let's  talk  about  tables  there.

I  mentioned  there are  three  tables  per  year.

There's VAX,

which contains all the vaccine information.

It's got information such as

an almost 100% unique VAERS ID at the top,

manufacturer, lot, and type.

And the DATA table is where a lot of the data

that we're going to be interested in resides.

Notice it's got the VAERS ID again, and it's got some different columns

that we're interested in,

such as state, age in years, sex, symptom text,

which is a free-form text field that sometimes seems to go on forever,

whether or not the patient died, et cetera.

The SYMPTOMS table contains just the VAERS ID and symptoms,

and they go from 1-5,

and sometimes they continue on from 6-10,

and  we  will  address  that  issue towards  the  end  of  this  presentation.

Let  me  get  out  of  that,

drag  that  over.

Right now,  you  should  see the  JMP  window  open.

I  am  going  to  present  from  a  JMP  project.

Let  me  go  over  that  very  quickly.

By  the  way,  I  assume everybody  watching  this

has  seen  a  JMP  window  before.

This  is  my  workspace.

I dragged the three files in

and  opened  them  up in  JMP  down  here  as  contents.

I  opened  up  a  new  instance  of  this.

When  I'm  working  on  a  project,

I  drag  my  links  and  maybe  some PDFs  or  whatnot  into  that  space.

Then at the  bottom,  we  have  recent  files.

But mainly what we want to do is look at this window,

and  the  main  window  here,

notice  we  have  tabs  here,

and you  click  on  the  tab  just  like  a  spreadsheet

and  we  can  see  our  different  sheets.

Let's  start  cleaning.

First thing I'm going to do is make...

Where is it? There it is, VAX.

Make that the home table, and you'll see why in just a minute.

So I'm going to start there.

Now notice in VAX_TYPE, everything's mixed in together.

We've got COVID-19, which we are going to focus on.

Scroll down a little bit, we have unknown.

We have FLUX, et cetera.

So what we want to do is separate out the COVID-19.

The way we do that is I am going to go up to Rows,

Row  Selection,

Select  Where.

By  the  way,  in  JMP,

there's  almost  always several  ways  of  doing  things,

but  to  keep  things  uniform,

I  will  always  go  up  to  the  menu  bar

to  do  this  type  of  thing.

What we want to do is separate COVID-19.

So I select VAX_TYPE.

From the dropdown,

I'll just leave the default, equals,

and I'll put in COVID-19,

and I'll come down here and uncheck Match case.

Let's check over the window.

It  tells  us  what  we're  going  to  do;

select  rows  in  the  data  table that  match  specified  criteria.

It  looks  like  it.

Let's  see.

Click  OK. And  there  it  is.

Notice  it  selected  all  the  rows.

It skipped the HPV9 and the unknown here, et cetera.

What  we  want  to  do, now  that  we  selected  them,

is  to  separate  them  out.

So  I  will  go  to  Tables,  Subset,

and  it  tells  us  what  we're  going  to  do.

We're  going  to  create  a  new  data  table

from  selected  rows  and  columns.

I want Selected Rows, which is checked for us already.

Notice it says here, we can save the script to the table.

Normally,  that's  a  good  practice.

Makes  it  much  more convenient  to  repeat  things.

But  I'll  leave  that  unchecked  for  now,

since  this  is  a  demo.

Is  there  anything  else  I  need  to  do?

Yes,  it  says,  Output  table  name.

For simple analysis, JMP will take care of that for you.

But for anything that starts to get a little bit complicated,

I recommend deciding on some sort of a naming scheme.

So rather than Subset,

I am going to name that COVID-19 only.

Click  the  OK  button.

Here it is, COVID-19 only.

We can scroll down and verify that.
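For readers who want to reproduce the select-then-subset step outside JMP, here is a minimal sketch in plain Python. It is not how JMP implements Select Where or Subset; the helper `select_where`, the sample rows, the manufacturer labels, and the exact spelling of the vaccine type value are all assumptions made for illustration.

```python
def select_where(rows, column, value, match_case=False):
    """Return the rows whose `column` equals `value` (hypothetical helper)."""
    if match_case:
        return [row for row in rows if row[column] == value]
    wanted = value.casefold()
    # With Match case unchecked, compare case-insensitively.
    return [row for row in rows if row[column].casefold() == wanted]


# Hypothetical stand-in for a few rows of the VAX table.
vax_rows = [
    {"VAERS_ID": "101", "VAX_TYPE": "COVID19", "VAX_MANU": "MANU_A"},
    {"VAERS_ID": "102", "VAX_TYPE": "HPV9", "VAX_MANU": "MANU_B"},
    {"VAERS_ID": "103", "VAX_TYPE": "covid19", "VAX_MANU": "MANU_A"},
    {"VAERS_ID": "104", "VAX_TYPE": "UNK", "VAX_MANU": "MANU_C"},
]

# The new list plays the role of the "COVID-19 only" subset table.
covid_only = select_where(vax_rows, "VAX_TYPE", "COVID19")
print(len(vax_rows), "rows in,", len(covid_only), "rows in the subset")
```

Comparing the two row counts, as done in the talk, is a cheap sanity check that the filter kept only part of the table.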

Notice  down  here  in  All  rows,

we've selected 129,975.

So  keep  that  one  in  mind,

because  over  here,  we  started  off...

Oh, no, it's  data. Where  was  it?

Back. Right  here.

We started off with 146,500.

We  just  got  the  COVID  right  there.

We'd  like  to  see  what's  going  on.

Notice  here,  we  don't  have  any  symptoms,

we  don't  have  any  adverse  events.

What we're going to have to do is get them out of the DATA table,

and  that's  right  here.

Now  let's  take  a  quick  look  at  that.

We've  got  the  columns  displayed  over  here.

We've  got  all  sorts  of  things.

DIED, length of stay, onset date, et cetera,

and here's the VAERS ID.

Now what we're going to do is join the two tables on the VAERS ID.

Let's go back over here to our COVID-19 data only.

There's  VAX.

We've  got  that.

Now  what  I'm  going  to  do is  go  up  here  to  Tables.

We  want  to  come  down  here  to  Join  tables.

I  won't  get  into  that  database  stuff, but  let's  just  say  we  want  to  join  tables.

Now I went to VAERSVAX,

and did it because it says, up here,

"Join this with VAERSVAX,"

and down here,

what I have to do is a couple things.

First of all, let me save that for later.

Go down here and select VAERSDATA.

Notice  we  have  some  windows

that  pop  up  here.

That shows us all the columns in the second data table.

Now we have to match the rows.

We're going to come here in the COVID-19 data table,

pick the VAERS_ID, and then do the same in the other one.

So we're going to click here

and they are two separate windows, so you don't have to hold Control.

What we want to do is match them so that things don't get mixed up.

Let's look down here, and we get to some data table stuff.

It tells us that it is an inner join.

An inner join selects rows common to both.

But  I'm  not  sure  about  this  data.

So  what  I  want  to  do  is  come  over  here to  Main  Table  and  select  Left  Outer  Join.

That's  going  to  keep  all the  ones  in  the  original  table,

which  was  the  VAERSVAX  table,

and  all  of  the  matching  entries

in  the   VAERSDATA  table.
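To make the difference between an inner join and a left outer join concrete, here is a small sketch in plain Python. It only illustrates the idea and is not JMP's Join implementation; `left_outer_join` is a hypothetical helper and the sample rows are made up.

```python
from collections import defaultdict


def left_outer_join(main_rows, other_rows, key):
    """Keep every main row; attach matching other rows, or blanks if none match."""
    other_by_key = defaultdict(list)
    for row in other_rows:
        other_by_key[row[key]].append(row)
    other_columns = [c for c in (other_rows[0] if other_rows else {}) if c != key]

    joined = []
    for main in main_rows:
        matches = other_by_key.get(main[key])
        if not matches:
            # No match in the second table: keep the row, fill its columns with None.
            joined.append({**main, **{c: None for c in other_columns}})
        else:
            for match in matches:
                joined.append({**main, **{c: match[c] for c in other_columns}})
    return joined


# Hypothetical sample rows for the two tables, keyed on VAERS_ID.
covid_only = [
    {"VAERS_ID": "101", "VAX_MANU": "MANU_A"},
    {"VAERS_ID": "103", "VAX_MANU": "MANU_A"},
    {"VAERS_ID": "105", "VAX_MANU": "MANU_B"},  # has no match in data_rows
]
data_rows = [
    {"VAERS_ID": "101", "STATE": "KY", "AGE_YRS": "54"},
    {"VAERS_ID": "103", "STATE": "IN", "AGE_YRS": "37"},
    {"VAERS_ID": "999", "STATE": "CA", "AGE_YRS": "61"},  # not in the main table
]

joined = left_outer_join(covid_only, data_rows, "VAERS_ID")
for row in joined:
    print(row)
```

An inner join would drop report 105 because it has no match; the left outer join keeps it with empty STATE and AGE_YRS, which makes unmatched reports visible instead of silently removing them.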

Let's  look  a  little  bit.

Again,  we  can  save  script.

Let's  give  it  a  name.

Let's call that one VAX join DATA,

and  see  if  there's  anything  else  we  need.

Yeah,  you  know  what,

we  could  do  this  in  a  two- step  process.

But  why  don't  we  do  it  all  in  one?

Let's go back up here to the VAERSVAX,

and what we'll do is we'll keep the VAERS_ID,

because we do want to know what that is.

We don't need TYPE, because they're all COVID-19.

Well, we may want to look at different manufacturers and the lot,

whether it's Series One, Two, Three, or Four, whatever,

and VAX_ROUTE and VAX_SITE.

We  won't  worry  about  that  for  now.

Come  down  here

to  the  other  data  table  and  see...

What  do  I  want  to  check?

We don't need the VAERS_ID again.

Let's check STATE, age in years.

Notice, there's a couple other age columns,

but we'll let those go for now.

We  don't  really  need  them.

We  want  to  know the  response  by  sex,  probably.

Don't  know,  SYMPTOM_TEXT?

Yeah,  we'll  keep  that  in.

We  certainly  want  to  know whether  or  not  the  patient  died.

DIED.

HOSPDAYS,
which is how many days they spent in the hospital

if they went to the hospital.

Let's do NUMDAYS.

Now we know, from checking the VAERS website,

NUMDAYS is simply the difference between the vaccination date

and the symptom onset date.

So that tells us how long after the vaccination the symptoms appeared.
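The NUMDAYS definition just described amounts to a date subtraction. A minimal sketch in Python, assuming both dates are available as `date` objects; the function name and the example dates are hypothetical.

```python
from datetime import date


def num_days(vax_date, onset_date):
    """Days from vaccination to symptom onset, or None if either date is missing."""
    if vax_date is None or onset_date is None:
        return None
    return (onset_date - vax_date).days


# Hypothetical example: vaccinated on March 1, symptoms began on March 4.
example = num_days(date(2021, 3, 1), date(2021, 3, 4))
print(example)
```

Returning `None` for a missing date keeps incomplete reports from being counted as zero-day onsets.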

And  we've  got a  whole  bunch  of  other  fields  here,

and  I  don't  think  we  need  any  of  those.

Let me scroll up, make  sure  everything's  still  checked.

I  hope  I  have  everything  that  I  need.

And I'm going to click this one here.

It  says,  "Select  columns  for  join  table,"

in  case  you  can't  see  it.

I'm  going  to  hit  Select and  put  them  all  in,

I  hope,  and  we'll scroll  down  to  check.

Again, we can save the script to table,

but  let's  hit  OK.

We  named  it  VAX  join  DATA.

There  it  is, VAX join DATA  over  here.

Now  notice  we  have the  manufacturers,  the  dose,  the  state.

Let's look at state.

We have a number of missing values,

and there are some other considerations

there too in that column.

But  we'll  get  to  those  later.


We're  all  set.

Let's  look  at  this.

Let's  expand  this.

This is  just  a  free- form  column.

Let's  make  use  of  one of  my  favorite  features;

Show  Header  Graphs.

Let's  hit  that.

And  wow!

Let's  see  what  we  have  here,

let  me  pull  that  over.

We  have,  let's  look  at  manufacturers.

We want to see if maybe one manufacturer or another has more adverse events.

It  looks  like  BIONTECH is  by  far  the  largest.

Then  we  have  some  unknowns.

Look  at  this.

We  have  almost  8,000   VAX_LOTs.

So  if  you  want  to  do an  examination  by  lot,

that's  going  to  be  rather  difficult.

We  see   VAX_DOSE_SERIES  here,

and  it  looks  like  Series  1 has  more  than  series...

Though,  that's  a  little  strange.

Then  comes  2,  then  comes  3.

How  did  that  happen?

Now  we  just  note  that  and  move  on.

Notice  for  STATE,

we have 1, 2, 3, 4, 5 plus 54.

Well, last I heard we didn't have 59 states,

so we're going to have to check that.

AGE_YRS  looks  okay.

It  looks  like  we  have a  whole  lot  more  females  than  males.

SYMPTOM_TEXT, lot of stuff there.

We noted all that.

We could close that

and let's break up the monotony of the cleaning here.

Let's do an analysis.

Let's go to everybody's favorite platform; Graph Builder.

We go up to the menus, select Graph

and  select  Graph  Builder  from  that.

Let's  see  what  do  we  have  here.

How about... I did bring in hospital days, I hope.

Let's do hospital days.

We'll select it, put it on the y,

and let's see if there's any differences

between, or amongst, I should say, the states.

Didn't want to do that. Just wanted to select STATE.

Put that in the x, and I'll select Bar Graph.

There is the bar graph.

Notice we do have some anomalies here.

Let's look down at the x-axis.

We have State AS,

State MH, State PW, State QW.

That  isn't  right.

So  we  know  we  have  to  do a  little  bit  more  cleaning  on  that.
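One way to start that extra cleaning is to classify every value in the STATE column. The sketch below is a plain Python illustration, not part of the talk's JMP workflow; `classify_state` is a hypothetical helper. It relies on the standard two-letter USPS abbreviations: the 50 states, plus DC, the territories, and the freely associated states. Under that list, AS, MH, and PW are real postal codes (American Samoa, Marshall Islands, Palau) even though they are not states, while QW matches nothing.

```python
# Two-letter USPS abbreviations for the 50 states.
STATES = set(
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD "
    "MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC "
    "SD TN TX UT VT VA WA WV WI WY".split()
)
# DC, US territories, and freely associated states.
OTHER_USPS = {"DC", "AS", "GU", "MP", "PR", "VI", "FM", "MH", "PW"}


def classify_state(code):
    """Label a STATE value so the odd ones can be reviewed separately."""
    code = (code or "").strip().upper()
    if not code:
        return "missing"
    if code in STATES:
        return "state"
    if code in OTHER_USPS:
        return "non-state postal code"
    return "unrecognized"


for code in ["KY", "AS", "MH", "PW", "QW", ""]:
    print(repr(code), "->", classify_state(code))
```

Separating "non-state postal code" from "unrecognized" matters for the cleanup decision: the first group may be legitimate reports from outside the 50 states, while the second is more likely a data-entry problem.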

But  this  is  a  demo,

so  we'll  leave  those  in  there  for  now.

That's a pretty easy thing to figure out.

What  we  want  to  do  is...

Come  on,  keeps  popping  up.

You  can  be  a  little more  careful  about  that.

Come  up  here,

and I right-click on the x-axis

and come up here to Order By and pick hospital days, descending.

There  it  is.

We  have  some  unusual  results  here.

It  looks  like  Wyoming.

Oh,  I  should  point  this  out  too.

JMP  automatically selected  the mean  for  us.

Which  for  our  purposes  is  probably  okay.

It  looks  like  the  mean  hospital  stay

for the people who suffered an adverse event in Wyoming

is, I don't know, 20 and a fraction days.

Which,  man,  that's  high.

Come down  here  to  the  next  one,  Vermont,

and  it  gives  us  the  number  of  rows.

Put  that  in  automatically for  the  hover  label.

After that, it's Mississippi,

and  see  Oklahoma  and  Utah.

Let  me  find...

Let's  see,  New  York,  Pennsylvania.

They're  all  down  in  here.

For  some  reason,

I don't know if it's just chance or what,

but  we  make  note  of  the  fact

that  a  couple  of  the  sparsely  populated states  seem  to  have  longer  hospital  stays.
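The bar chart above is a mean of hospital days per state, sorted in descending order. A rough equivalent in plain Python follows; the helper name and the numbers are hypothetical and chosen only to show the mechanics. The sketch also reports how many rows went into each mean, since a mean built on a handful of reports can look extreme purely by chance.

```python
from collections import defaultdict
from statistics import mean


def mean_by_group(rows, group_col, value_col):
    """Return (group, mean, n) tuples sorted by mean, largest first."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(value_col)
        if value is not None:  # skip reports with no hospital stay recorded
            groups[row[group_col]].append(value)
    summary = [(group, mean(values), len(values)) for group, values in groups.items()]
    return sorted(summary, key=lambda item: item[1], reverse=True)


# Hypothetical rows: a small state with few reports, a large one with more.
rows = [
    {"STATE": "WY", "HOSPDAYS": 30},
    {"STATE": "WY", "HOSPDAYS": 12},
    {"STATE": "CA", "HOSPDAYS": 4},
    {"STATE": "CA", "HOSPDAYS": 6},
    {"STATE": "CA", "HOSPDAYS": None},
    {"STATE": "CA", "HOSPDAYS": 8},
]

result = mean_by_group(rows, "STATE", "HOSPDAYS")
for state, avg, n in result:
    print(state, avg, f"(n={n})")
```

Looking at the count next to each mean is one simple way to judge whether a long average stay in a sparsely populated state reflects more than a few unusual reports.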

Let's  go  back  to  here.

And  this  is  another  reason why  I  like  to  use  the  JMP  projects.

You  don't  have  to  go hovering  over  closed  windows.

Let's  go  to  Graph.

We  want  to  do that  graph   [inaudible 00:20:22].

Yeah,  let's  do  one  more  example.

Go  to  Graph  Builder,

and  we'll  take  STATE,

put  that  in  the  x-axis.

And we look at what JMP did for us.

And  it  looks  like  we  have some  high  numbers  here.

And  we  hover  over  California,  for  example,

and  it  says  it's  got   10,629  rows,

and  it  lists  them, and  it  gives  us  the  state.

What  JMP  did  here  was  automatically put  the  number  of  rows  in  there  for  us,

which  is  a  reflection of  the  number  of  patients

who  reported  an  adverse  event.

Let's  see,  Ordered  by  state,  descending.

So  let's  do  that  order  again

and  take  a  look  at  it.

This  makes  sense.

California is far and away

the leader, unfortunately for them, in adverse events,

but it is also a highly populated state.

The next one is Michigan, and the same deal with Florida.

So it seems to somewhat mirror the population.

There's  New  York.

We'll  just  leave  that  open for  now  and  move  on.
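Counting rows per state, which is what Graph Builder did automatically here, can be sketched with Python's `collections.Counter`. The list of states below is a made-up sample, not VAERS data.

```python
from collections import Counter

# Hypothetical STATE values, one per report row.
states = ["CA", "MI", "CA", "FL", "CA", "MI"]

counts = Counter(states)

# most_common() returns (value, count) pairs, largest count first.
ranked = counts.most_common()
for state, n in ranked:
    print(state, n)
```

Because raw counts tend to follow population, comparing states fairly would need a rate such as reports per resident, which requires population figures that are not part of these tables.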

Now let's take a look at the VAERS symptoms.

Click that tab.

First thing we noticed, and I did not plan this,

but notice VAERS_ID, row one and two.

It's the same number.

How can that be?

That is supposed to be a unique identifier,

but that is somewhat common in the VAERS data set.

Let's  take  a  look  why.

By  the  way,

I  think  this  is  an  excellent  reason to  always  spend  some  time,

even  if  you're  in  a  hurry,

to  look  at  the  data  and  see if  anything  looks  a  little  bit  weird.

And sure enough, two rows with the same ID number.

Now notice symptoms.

This is the SYMPTOMVERSION,

or rather, this is the MedDRA database.

It's how the medical coders code the symptoms.

So there's some degree of uniformity nationally

and also internationally.

This is the MedDRA version for this entry right here.

And there are only two versions that I've run across so far

in my work with VAERS, and those are 24.1 and 25.

So  we  have  symptom  one, we  have  symptom  two,  chest  pain.

We  have  symptom  three,  the  heart  rate,

and  it  goes  on  and on.

Then  we  come  back  and  we  have  some...

Looks a little  weird  here, SARS-CoV-2 and  whatever  in  this  duplicate  row

with  the  duplicate  number  ending  in  266,

which  is  really  not  a  duplicate,

because  there  is  only one  entry  out  of  the  five  in  that  row.

So  that's  a  bit  disconcerting.

But  we're  going  to  take  care  of  that.

What we're going to use here is a feature in JMP,

and it's stacking the variables.

If we wanted to do an analysis on symptoms from this table,

what  we  would  have  to  do  is  go  and  run  it

on  each  one  of  the  columns, et cetera,  et  cetera.

But  if  we  stack  the  columns, we  won't  have  to  do  that.

So  let's  stack  the  columns.

So we come up here to, again, Tables,

and we come down to Stack,

and we select Stack.

So let's pick SYMPTOM1.

Hold down Control, or Command if you've got a Mac.

Up here to five, take a look at our check boxes.

We want to keep everything. Again, save script to source table,

if we want to, and in case something goes wrong,

we may want to keep the dialog box open,

but I will not do that.

Now, it asks us where to move the columns.

Well, to last.

But it would be much more convenient if we moved them after the VAERS_ID.

So  I'll  click  that  radio  button,

click  the   VAERS_ID,

I'll  put  table  name, and  we'll  just  label  that,


We'll  come  up  here,  put  them  in,

five,  see  if  there's  anything  else.

Oh,  yeah,  new  column  names.

If  you're  stacking  several  columns,

which  is  often  a  case,

especially  if  you're  trying  to  pull in  some  data  from  a  PDF  file

that  was  made  for  human  consumption,

that  could  be  an  issue  here.

But  for  right  now,

we'll just leave the stacked column labeled Data

and the source column labeled Label,

because  it  makes  sense.

And  one  final  check,

I  hit  the  OK  button,  and  there  it  is.

Well,  let's  take  a  look  at  that  one.

Here  we  go,  the  VAERS_ID,

and  sure  enough, we  have  the  label,

and  here's  the  data.

Here  we  have  our  favorite  row, the  one  that  ends  in   266.

Let's  take  a  look  at  that.

What  happened?

It goes from 1-10.

So we have ten instances of the same ID,

hence SYMPTOM1, 2, 3, 4, 5,

which  are  filled  in.

And  then  it  starts  again with  SYMPTOM1,

and  the  rest  of  them  are  empty.

So  that  adds  up  to  the  10.
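Stacking is a wide-to-long reshape: each symptom column becomes its own row, with one column naming the source and one holding the value. Here is a minimal plain-Python sketch of that idea. The `stack` helper, the ID value, and the symptom terms are hypothetical; the output column names Label and Data follow the defaults kept in the talk.

```python
SYMPTOM_COLUMNS = ["SYMPTOM1", "SYMPTOM2", "SYMPTOM3", "SYMPTOM4", "SYMPTOM5"]


def stack(rows, columns, label_name="Label", data_name="Data"):
    """Turn each listed column of each row into its own (Label, Data) row."""
    stacked = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in columns}
        for column in columns:
            stacked.append({**kept, label_name: column, data_name: row[column]})
    return stacked


# Two wide rows sharing one ID: the first is full, the second has one entry.
wide = [
    {"VAERS_ID": "266", "SYMPTOM1": "CHEST PAIN", "SYMPTOM2": "HEADACHE",
     "SYMPTOM3": "FATIGUE", "SYMPTOM4": "NAUSEA", "SYMPTOM5": "PYREXIA"},
    {"VAERS_ID": "266", "SYMPTOM1": "DIZZINESS", "SYMPTOM2": "",
     "SYMPTOM3": "", "SYMPTOM4": "", "SYMPTOM5": ""},
]

long_rows = stack(wide, SYMPTOM_COLUMNS)
filled = [row for row in long_rows if row["Data"]]
print(len(long_rows), "stacked rows,", len(filled), "with a symptom")
```

Two wide rows times five symptom columns gives the ten stacked rows seen in the talk, and the empty ones can then be filtered out in a single pass instead of column by column.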

Here's our MedDRA data.

That  makes  sense.

I  know,  because  I  did  it  before.

This  is  where  we  want  to  be.

So  let's  go  up  to  Rows  again.

Row  Selection,

Select Where.

If the nomenclature here sounds an awful lot like SQL,

that is what JMP is doing under the hood.

We  want  to  select  some  rows.

So  let's  pick   CHEST PAIN,

We'll  leave  that all in  caps

and  we  come  down  here  again.

There's  this  little  check  box, hidden  way  down  here,

this  Match  case.

I  don't  want  to  match  case,

because  I  don't  know how  people  enter  the  data

and how the good people at the CDC

clean the data before they posted it on the website.

But by the way, the VAERS data set

is not loaded up instantaneously.

It  doesn't  go  there.

The  CDC  usually  updates it  about  once  a  week,

and  they  clean  the  data,  then  update  it.

So  Match  case,  CHEST PAIN.

It  should  say  something about  the  dropdown.

We  want  equals,

but we could use does not equal, or whatever it is that suits our purpose.

So  let's  click,  OK.

Wait  a  minute.

Data  equals.

That  should  do  it.

There  it  is.

Here I hover over it, it says SYMPTOM STACK.

And  we  look  down  here  at  rows  again,

I  find  myself  referring to  that  quite  frequently,

just  to  get  an  idea  what's  going  on.

We started off with 890,000 plus rows

and  we  have  3,667  that  were  selected.

Let's  subset.

Go  up  here  to  Tables  again,

not Stack, there it is, Subset,

we  will  select  subset.

Of  course,  we  get  our  pop  up  window

that  tells  us  what  subset is  going  to  do  for  us.

And  we  click  on  that.

Let's see, we want Selected Rows.

We could link that to the original data table,

save script to the table, Subset of SYMPTOMS STACKED.

Let's  go  with  our  convention,

and  we'll  call  that  CHEST PAINS


I want to make  sure I  did  everything  right.


I  want  to  call  that   CHEST PAIN.

I  hope  that's  right.

Click  on  the  OK  button,  and  there  it  is.

We'll  take  a  look  here.

Notice  we  only  have  one  row  with   266

and  we  have  all our  other  rows  with  CHEST PAIN  in  it,

and  we've  got  3,666  selected.
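Once the symptoms are stacked, selecting one symptom is a single comparison on the Data column. The sketch below shows that in plain Python with made-up rows; the comparison ignores case, mirroring the unchecked Match case box, and it also counts distinct report IDs, since one report can appear on several rows.

```python
# Hypothetical stacked rows (VAERS_ID, Label, Data).
stacked = [
    {"VAERS_ID": "266", "Label": "SYMPTOM2", "Data": "Chest pain"},
    {"VAERS_ID": "266", "Label": "SYMPTOM1", "Data": "Headache"},
    {"VAERS_ID": "301", "Label": "SYMPTOM1", "Data": "CHEST PAIN"},
    {"VAERS_ID": "302", "Label": "SYMPTOM3", "Data": ""},
]

target = "CHEST PAIN"

# Case-insensitive equality test on the stacked Data column.
chest_pain = [row for row in stacked if row["Data"].casefold() == target.casefold()]

# A report ID can repeat across rows, so count distinct IDs separately.
report_ids = {row["VAERS_ID"] for row in chest_pain}
print(len(chest_pain), "matching rows from", len(report_ids), "reports")
```

Keeping the row count and the distinct-ID count side by side makes it easier to notice when the same report contributes more than one matching row.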

Now  let's  go  back  to  the  data,

do  a  bit  more  analysis after  we  did  all  that  cleaning.

VAX join DATA.

That  looks  right.

Let  me  drag  that  over  there.

Let's  look  at  a  couple of  different  variables,

or  rather  the  graph.

This  time,  let's  do  the  summary  tables.

So  let's  go  up  to  Tables and  Summary's  at  the  top.

And  this  is  what  it  looks  like.

Well,  let's  look  at... What  do we  want?

Age  and   HOSPDAYS and   NUMDAYS.

I  am  going  to  put  that  in  here,

but  before  I  can  do  that,

JMP  wants  us  to  tell  it what  statistic  to  use.

Let's  use  the  mean.

I  selected  mean  from  dropdown,

and  it's  going  to  give  us the  mean  of  those  three  columns.

Now  there  are  some  other  columns here  that  we  have  an  interest  in.

I'd say, well, DIED.

Yeah,  probably  interested in  whether  or  not  somebody  died.

See  what's  going  on  there.

So  we'll  select  that.

Now  we  can't  take  a  mean.

It's a binary categorical variable.

So let's select N and see.

No,  wait,  let's  look  at  STATE.

Where  was  it?

Where  was  STATE?

There  it  is.

Now  STATE,  we  have  50  plus

and  we  want  a  summary  table .

So  50  plus  summary  table.

Let's  put  STATE  into  group.

And  again,  we  can  save  the  script,

but  we  won't  do  it  here.

One  final  check, that  looks  right.

We  hit  the  OK  button.

And here is our summary table.

Notice that STATE again, we have some serious cleaning to do.

There's no state here,

State GU.

There's state MH, again.

Some may have missing values, et cetera, et cetera.

And it gives us a mean age in years for all those states.

And it looks like the mean age there is somewhere in the 40s through to 50s.

Come  over  here.

Let's  take  a  look  at  the  mean hospital  days  by  state.

And  that  makes  sense.

5, 6, 7  days.

Looks like we're all hovering around a week.

Just  taking  a  quick  look  at  it.

I  don't  see  any  outliers.

And  come  over  here.

Do  the  same  with  NUMDAYS .

And  that's  the  number of  days  between  vaccination

and  when  they  say  the  effect  appeared,

and  let's  see,  number  died.

One  thing  we  want  to  notice  is  right  here.

Fortunately,  there  are  not a  whole  lot  of  people,  thousands  dying.

There's  a  couple  with  a  hundred  here,

but notice here, N(DIED) with the blank state.

Apparently there are a lot of deaths with the state missing.

So  if  we  want  to  analyze that  in  a  bit  more  detail,

we're  going  to  have  to  be  careful  there,

because  there's  a  lot  that  can't be  assigned  to  a  particular  state.
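The summary table built here (means of three numeric columns plus a count of DIED, grouped by STATE) can be approximated in plain Python as below. This is a sketch under assumptions: the `summarize` helper and the rows are hypothetical, DIED is assumed to hold "Y" or be empty, and reports with no state are collected under an empty-string group so they stay visible.

```python
from collections import defaultdict
from statistics import mean


def summarize(rows, group_col, mean_cols, count_col):
    """Group rows, then compute means and a non-missing count per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col] or ""].append(row)  # missing state -> "" group

    table = []
    for key in sorted(groups):
        members = groups[key]
        out = {group_col: key, "N Rows": len(members)}
        for col in mean_cols:
            values = [m[col] for m in members if m[col] is not None]
            out[f"Mean({col})"] = mean(values) if values else None
        # Count the rows where the flag column is filled in.
        out[f"N({count_col})"] = sum(1 for m in members if m[count_col])
        table.append(out)
    return table


rows = [
    {"STATE": "KY", "AGE_YRS": 50, "HOSPDAYS": 5, "NUMDAYS": 1, "DIED": ""},
    {"STATE": "KY", "AGE_YRS": 40, "HOSPDAYS": None, "NUMDAYS": 3, "DIED": ""},
    {"STATE": "", "AGE_YRS": 70, "HOSPDAYS": 7, "NUMDAYS": 2, "DIED": "Y"},
    {"STATE": None, "AGE_YRS": None, "HOSPDAYS": None, "NUMDAYS": 0, "DIED": "Y"},
]

summary = summarize(rows, "STATE", ["AGE_YRS", "HOSPDAYS", "NUMDAYS"], "DIED")
for line in summary:
    print(line)
```

Keeping the blank-state group as its own row, rather than dropping it, is what exposes the issue noted above: deaths that cannot be assigned to any particular state.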

And  what  else  can  we  do?

Let's  just  take  a  look  at  all  that.

And  we  see  our  states  again,

these five plus 54.

We  have  the  number  of  rows  up  here.

Let's  see,  what  else  can  we  look  at?

Mean(AGE),

I haven't mentioned it yet, but JMP is interactive.

So I click on that bar, and let's see,

here's some people up here in the old age range,

older age, I should say, excuse me, older people.

I can come up here.

It's a little hard to select that one, maybe.

I  don't  see  anything that  sticks  out  at  me.

Fortunately, number died, it's a pretty big bar at zero.

So that's good.

And the mean for NUMDAYS

between the vaccine

and the first event is rather low, right here.

We'll  just  leave  it  at  that.

I  see  that  I  am  just  out  of  time.

So  what  did  we  do?

We  looked  at  a  large  online  database.

We were able to download the ZIP files onto our desktop

and open them up in JMP.

We  were  able  to  do some  rudimentary  analysis

after  spending  a  lot of  our  time  data  cleaning.

Notice, even though we spent a lot of our time cleaning the data,

we were able to do it in JMP, which of course, is very convenient.

Because, number one, we didn't have to

switch here, switch there, run some SQL code and bring it back in.

Number two, we could do our analysis

in line, as it were, while we're cleaning the data.

We say, "Oh, that looks interesting."

We went to our Graph Builder,

took a look at the data, to see if anything piqued our curiosity

or  if  anything  was  out  of  place.

When  we  finished  our  examination, we  could  continue  with  our  data  cleaning.

In  this  case,  we  went  on  to  the  SYMPTOMS.

So  that  concludes  my  presentation.

I hope everyone enjoyed it and hopefully learned something.

Maybe I should put a disclaimer here at the end.

This is the way I would do it,

after using JMP for a few years.

It's not necessarily the way you should do it,

and  not  necessarily the  best  way  to  do  it.

But  using  JMP,  we  were  able  to  bring down  some  data  and  analyze  it,  clean  it.

It's  some  data that's  quite  a  bit  in  the  news  right  now.

I  thank  you  all for watching.