Analyzing COVID-19 Vaccines Tweets using JMP Pro 16 (2021-US-EPO-888)
The COVID-19 pandemic is the most severe worldwide public health crisis of our generation, with more than 2 million associated deaths as of January 2021 (Dong E, Du H, Gardner). Posing a great challenge to modern medicine, the pandemic was met with the rapid deployment of several types of vaccines all over the world. Because the vaccines are new and were deployed quickly in a fragmented information environment, we are interested in how they are perceived by populations around the world. Using a sample of more than 60,000 tweets and JMP Pro 16, this paper analyzes and extracts insights about people’s perceptions of COVID-19 vaccines. We found that while Moderna and Pfizer are the two most talked about vaccines, the Russian Sputnik vaccine has become very popular outside of the US. A large portion of the tweets carry hashtags that convey support for the vaccines. According to the tweets, some people have expressed concerns about side effects, but in general the sentiment toward the vaccines is positive.
Speaker |
Transcript |
Tuan Le | Hello everybody, my name is Tuan Le and today I am going to give a presentation on using JMP Pro to analyze |
COVID-19 tweets sentiment. | |
As we all know, the COVID-19 pandemic has been the most disruptive event of the past year for most of us. Fortunately, our scientists came up with the vaccines relatively quickly. And now the challenge in solving the crisis is the adoption of the vaccines. | |
Besides availability and accessibility, the adoption of the vaccine depends a lot on people's perception. | |
Everyone probably has their own opinion about the vaccine, and it's pretty easy to get the opinions of other people around us. | |
However, our social circle's perception of the vaccine may not be representative of the opinion of the general population. That is where this analysis comes in. Using Twitter | |
data from December 2020 to April 2021, we analyzed and extracted insights from the sample of 60,000 tweets all over the world. | |
Using JMP, we identified the most popular vaccines and some topics and concerns in the online chatter. We also used the new sentiment analysis tool found in JMP 16 for this analysis. | |
So, before we begin our main analysis we took an overview look to get an initial understanding of the sample. | |
The data have several main variables, including the username, the date and time of the tweet, the hashtags of the tweet, the number of followers, | |
the retweet ??? and the tweet content itself. We see that the number of tweets | |
jumps in March. The accounts with the highest numbers of followers belong to news outlets, such as NDTV in India, followed by celebrities' and influencers' accounts. The tweets in this sample are unique; retweets are excluded from the sample. | |
So the primary tool we are using for this analysis is Text Explorer, found under the Analyze menu. So this is the Text Explorer dialog. | |
Before diving into tweet content itself, we took a look at the hashtag first. So hashtag will tell us the topic of the tweet that the user assigns themselves. Tweets can contain multiple hashtags representing multiple topics. | |
In the Text Explorer options,
when analyzing the hashtags, we don't need stemming, so we choose No Stemming. The tokenizing method is just Basic Words. | |
We proceed, and because we already know that these tweets are COVID-19 related, words like COVID-19 and its variations are excluded from the analysis, because they do not add any informational value for us. | |
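The hashtag step described above can be approximated outside JMP. The following is a minimal Python sketch, not the Text Explorer implementation: the sample tweets, the `hashtag_counts` helper, and the `COVID_VARIANT` pattern are all hypothetical and only illustrate counting hashtags with no stemming while excluding COVID-19 spelling variants.

```python
import re
from collections import Counter

# Hypothetical sample tweets; the real data set is not reproduced here.
tweets = [
    "Got my first dose today! #Moderna #GetVaccinated #COVID19",
    "#Pfizer second shot done #VaccinesWork #covid19",
    "#SputnikV arrives this week #Covid_19 #GetVaccinated",
]

# Assumed pattern for COVID-19 spelling variants (covid, covid19, covid_19).
COVID_VARIANT = re.compile(r"^covid[\W_]*(19)?$")


def hashtag_counts(texts):
    """Count lowercased hashtags, skipping COVID-19 variants (no stemming)."""
    counts = Counter()
    for text in texts:
        for tag in re.findall(r"#(\w+)", text):
            term = tag.lower()
            if not COVID_VARIANT.match(term):
                counts[term] += 1
    return counts


counts = hashtag_counts(tweets)
print(counts.most_common(3))
```

Lowercasing is the only normalization applied here, which mirrors the choice of not stemming hashtags; a tweet with several hashtags contributes to several counts.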
So, very quickly, we can see that the top terms here contain the names of the most popular vaccines. They are Moderna, Pfizer, | |
Sputnik, and AstraZeneca. As you can see here, each type of vaccine can be referred to by more than one term or hashtag. | |
So, now that we know the most popular vaccines, we want to take a deeper look at the hashtags. | |
Because keeping the vaccine types and the country names does not really add value for us beyond this point, we are going to add these words to our stop word dictionary. These words are so prominent that | |
they would overwhelm other terms and topics. So | |
after adding these terms to the stop word dictionary, we use a visualization, the word cloud. Looking at this, we can see that the most prominent terms are get vaccinated, vaccines work, and vaccines save lives. From this we get an initial impression that | |
this topic seems to have positive sentiment. | |
Now we proceed further and explore the tweets themselves. We perform a similar workflow. This time we use the Stem for Combining option for the tweets' content, because we want to combine terms that share the same base. This helps reduce the dimensionality of the data. | |
For the tokenizing method, we are going to use Regex, so that the documents are parsed using the built-in regular expressions and symbols, including punctuation, spaces, and tabs. | |
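To illustrate why combining terms by stem reduces dimensionality, here is a small Python sketch under stated assumptions. The `TOKEN` pattern and the `crude_stem` suffix stripper are hypothetical stand-ins; they are not JMP's built-in regular expressions or its stemming algorithm, and a real analysis would use a proper stemmer.

```python
import re
from collections import Counter

# Assumed token pattern: runs of letters, optionally followed by an
# apostrophe part (e.g. "don't"). Punctuation, spaces, and tabs are dropped.
TOKEN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Split a document into lowercase word tokens using a regex."""
    return TOKEN.findall(text.lower())


def crude_stem(word):
    """Toy suffix stripper; keeps at least four characters of the word."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def combined_term_counts(docs, stop_words=frozenset()):
    """Count terms after removing stop words and merging shared stems."""
    counts = Counter()
    for doc in docs:
        for tok in tokenize(doc):
            if tok not in stop_words:
                counts[crude_stem(tok)] += 1
    return counts


docs = ["Vaccines are working.", "The vaccine worked, and it works!"]
counts = combined_term_counts(docs, stop_words={"are", "the", "and", "it"})
print(counts)
```

In this toy example, five distinct surface forms collapse into two terms, which is the effect the stemming option is meant to achieve on the full set of tweets.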
So, after this step, and after adding the stop words related to COVID-19 and the names of the vaccines from the previous hashtag analysis, | |
we come up with the most frequent terms in those tweets. And, as we can see here, the number one term is side effects. This reflects a very big concern that people have with the vaccines, and | |
this is not surprising, because the vaccines are relatively new and they came out quite fast. I have personal experience | |
that I can relate well with this from talking with my friends and family in Vietnam over the summer. | |
None of them are anti-vaccine. They all want to get the vaccine and get COVID-19 over with, but they are all very worried about and reluctant | |
to take certain types of vaccines early. They all want to wait for Moderna or Pfizer, the more proven types of vaccine. And the news...the reports related to the blood clotting issue worry them a lot. | |
After this, we performed latent class analysis to cluster the tweets, and | |
thanks to this clustering, we find out that the contents of the tweets can be categorized into several topics. It could be | |
a discussion on the deployment of vaccine. It could be about the personal story of people getting the vaccine and how they feel. It could be about the concern that people have with the vaccine, and the tweet could also be about the situation of the pandemic in different countries. | |
Finally, with JMP Pro 16 we have a new feature accessible under the triangle menu of Text Explorer: sentiment analysis. This tool assigns a sentiment score to each word, taking into account | |
the modifiers, including negation terms and intensifier terms. Negation terms include no and not; intensifier terms can be very or really. For example, | |
let's say that the word happy is assigned a positive sentiment score of 60. | |
If the program sees the word no or not in front of it, it will assign a score of negative 60 instead. The user can also manually select terms from this grid, add them to the dictionary, and assign sentiment scores to them. | |
This tool is very useful because the default dictionary is pretty comprehensive, but we may still want to customize the sentiment score of a word depending on the context of the analysis. | |
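The scoring idea described above can be sketched in a few lines of Python. This is only an illustration of dictionary-based scoring with negation and intensifier terms, not JMP's actual algorithm: the score of 60 for happy comes from the talk, while the entry for worried, the intensifier multiplier of 1.5, and the choice to sum term scores per document are assumptions made for the example.

```python
import re

# "happy": 60 is the example from the talk; "worried" is a hypothetical entry.
SENTIMENT = {"happy": 60, "worried": -60}
NEGATORS = {"no", "not"}
# Assumed multipliers; the talk does not say how intensifiers are weighted.
INTENSIFIERS = {"very": 1.5, "really": 1.5}


def score_text(text):
    """Sum sentiment scores, applying modifiers directly before each term."""
    words = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, word in enumerate(words):
        if word not in SENTIMENT:
            continue
        value = float(SENTIMENT[word])
        # Walk backwards over any run of negation/intensifier terms.
        j = i - 1
        while j >= 0 and (words[j] in NEGATORS or words[j] in INTENSIFIERS):
            if words[j] in NEGATORS:
                value = -value          # negation flips the sign
            else:
                value *= INTENSIFIERS[words[j]]  # intensifier scales the score
            j -= 1
        total += value
    return total


print(score_text("I am happy"))      # happy with no modifier
print(score_text("I am not happy"))  # negated
```

Customizing the dictionary for a given context, as mentioned in the talk, corresponds here to adding or changing entries in `SENTIMENT`.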
So performing this analysis on this sample we have found that | |
JMP was able to score 7,300 | |
tweets. Among them, | |
1. | |
Instead of looking at all tweets holistically, we can also zero in on something more specific, such as | |
only the tweets that are related to a certain type of vaccine. For example, we can look at the tweets that are only related to Moderna. So when I did that, the ratio of | |
1. Pfizer is about the same thing. And other types of vaccines have...all the vaccine I looked at have | |
more positive tweets than negative tweets, but the ratio is highest for Moderna and Pfizer. Other vaccines have a lower ratio, but overall | |
we can conclude that, regardless of what type of vaccine we are talking about, people have mostly positive sentiment about the vaccines. Now | |
in conclusion, | |
we find that | |
the Text Explorer tool in JMP Pro is very powerful yet very easy to use, so we can use it to perform text analysis and extract insights like these from any text sample. | |
The results we got here regarding the tweets on the COVID-19 vaccines largely confirmed our general expectations. | |
I want to highlight that there is a big concern about the side effects of the vaccines, which can make some people reluctant to take them. | |
And this concern needs to be addressed more aggressively with better communication strategy and transparency to gain people's trust and their willingness to take the vaccine. | |
And that is my presentation on using JMP Pro to analyze COVID 19 tweet sentiments. Thank you very much. |