Do some political candidates use first-person (I, we) or second-person pronouns (you, you all) more often in their campaign tweets?

In this course exercise, students learn how to test distributions (Analyze > Distribution > Test Probabilities) using a grouping variable (By). The data set consists of tweets (N = 1,107) from the early stages of the 2016 U.S. presidential primary season.

First-person pronouns focus on the speaker or, possibly, the group to which the speaker belongs. Second-person pronouns address the audience directly, suggesting that the candidate might be seeking to establish a personal connection.

A chi-squared test of the relationship between political party (Democrat vs. Republican) and the use of the first person (Present vs. Absent) is significant. The distributions are tested against a 50-50 split to see whether Democrats or Republicans are more likely to use first-person pronouns.

The test of the use of second-person pronouns occurs at the candidate level. In this data set, political candidates use the second person in 20% of their tweets. But who uses the second person more (or less) than the other candidates? In this part of the exercise, students compare each candidate's use of the second person against the group's 80-20 distribution.

Tests are conducted on tweets from Hillary Clinton, Bernie Sanders, Marco Rubio, and Donald Trump. The results indicated that only some of the candidates used second-person pronouns more often than the group average.

Hi, I'm Robert McGee.

I'm an Associate Professor of Integrated Marketing and Communication

at the University of Mississippi, also known as Ole Miss.

What I have today is a demonstration of a teaching exercise I use with students.

The title of the presentation is

Communication Style and Political Campaigns:

Promoting a Personal Connection with an Audience.

The question is,

do some presidential candidates use the first person or the second person

more than others in their tweets on Twitter?

This is an important question because we want to form a personal connection

between a candidate and an audience.

One way they can do that is by the use of language in their social media.

The students manually coded tweets during one week

of the presidential primary season in 2016.

They recorded every tweet

that was issued by all 19 or 17 presidential candidates at this time.

What we're going to demonstrate today is how we can test the probability

of a distribution by using the grouping variable By.

After we recorded 1,107 tweets,

the first thing we're going to test

is whether the use of first person varies by party.

This is a typical chi-square test.

There are two variables: political party, Democrat versus Republican,

and first person is either present or absent in the tweet.

You can see the test of the relationship there.

The likelihood ratio is significant.

You can look at the graph

which shows us that Democrats typically used first person

a little more often than Republicans, and it was a significant difference.

Now let's get on to the second person.

You can do the same thing. Look at the candidate

and the presence or absence of the second person in the tweet.

You'll see also that it's a significant relationship.

The likelihood ratio value is 83.7, and it is significant.

Then you look at the graph and you see it well.

Some people obviously used the second person more than others,

but which ones were really different from the others?

You can look at the contingency table,

and in the contingency table, you look across the rows, you'll see

how often each candidate used the second person.

For example, Ben Carson used it in 4% of his tweets.

Chris Christie used it about 25 or 26% of the time in his tweets, and so on.

We see Hillary Clinton used the second person about 16 or 17% of the time

in her tweets during that week.

But what we want to be able to do

is test that specific probability or the probability of that distribution.

Is Hillary Clinton's distribution of 17% and 83% really different

from the overall average of all the political candidates?

If you look at the bottom of the contingency table,

you'll see that the distribution really was 80% and 20%.

But you can also find this information with Distribution.

Look at Analyze, then Distribution,

and we put the variable in the Y box and click OK.

You'll see the frequencies, or

the probabilities: the distribution is 80 and 20.

So 80% and 19.9%; I round that to 80% and 20%.

What we want to know is if Hillary Clinton and other candidates

use the second person more or less than this average.

We're not looking at a 50/50 test, we're looking at an 80 versus 20 test.

To do this,

we are going to use the By box or the By field.

To subdivide this distribution by each candidate,

we're going to put the variable candidate in the By box.

We still have our dependent variable in the Y box,

the use of the second person,

but we're going to subdivide it by the variable candidate,

which will produce an individual test for each one of the candidates.
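Outside of JMP, the idea behind the By box can be sketched in a few lines of Python. This is only an illustration: the two candidates and their coded tweets below are hypothetical placeholders, not the class data, and the helper name `share_present` is made up for this example.

```python
from collections import defaultdict

# Hypothetical coded tweets: (candidate, second_person_present).
# These rows are placeholders, not the 1,107 tweets from the exercise.
tweets = [
    ("Candidate A", True), ("Candidate A", False),
    ("Candidate A", False), ("Candidate A", False),
    ("Candidate B", True), ("Candidate B", True),
    ("Candidate B", False), ("Candidate B", False),
]

def share_present(flags):
    """Proportion of tweets in which the second person is present."""
    flags = list(flags)
    return sum(flags) / len(flags)

# Overall distribution (what Analyze > Distribution shows with no By variable).
overall = share_present(flag for _, flag in tweets)

# One distribution per candidate (what the By variable produces).
by_candidate = defaultdict(list)
for candidate, flag in tweets:
    by_candidate[candidate].append(flag)

shares = {name: share_present(flags) for name, flags in by_candidate.items()}

print(f"Overall: {overall:.1%} present")
for name, share in shares.items():
    print(f"{name}: {share:.1%} present")
```

Each per-candidate share can then be compared with the overall share, which is the benchmark used in the rest of the demonstration.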

When you look at this, you'll get a result for each candidate.

For example, Ben Carson first,

and then Chris Christie second, and so on for each one of the candidates.

It'll tell us the same information that we

have in the contingency table with the little graph.

But what we want to know is if this distribution is different

from the 80-20 distribution

that we have for all of the candidates overall.

To do this, we look at the person that we're

interested in, in this case, Hillary Clinton,

and we see that the probability of the distribution is 83% and 17%.

We go up to where it says second person, the name of the variable,

and click on the drop-down menu, the red triangle,

and we find the command Test Probabilities.

We're going to click on Test Probabilities and a new dialog box opens up.

This dialog box lets us set our own benchmark that we want to use.

Rather than testing it against 50/50,

we're going to test it against 80 and 20.

I type in 0.8 and 0.2 because that's what we're testing.

I leave the setting at a two-tailed test.

I don't know if it's going to be higher

or lower than 80/20 when I test these distributions.

I'm going to leave it as a two-tailed test.

But I put in my benchmark of 80% and 20%, which I got from the contingency table

or from the overall distribution of the use of second person.

Then we click Done.
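For reference, the likelihood ratio value in this kind of test can be written out as follows. This assumes the reported statistic is the standard likelihood-ratio (G) goodness-of-fit statistic; the symbols O_i, n, and p_i are notation introduced here and do not come from the talk.

```latex
G = 2 \sum_{i=1}^{2} O_i \,\ln\!\left(\frac{O_i}{n\,p_i}\right),
\qquad p_1 = 0.8,\quad p_2 = 0.2
```

Here O_1 and O_2 are one candidate's counts of tweets without and with the second person, n = O_1 + O_2 is that candidate's total number of tweets, and p_1 and p_2 are the benchmark probabilities typed into the dialog box. With two categories, G is compared with a chi-square distribution with 1 degree of freedom.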

Here's what we have. This is part of the results.

You'll see that she had 96 tweets.

Of those, 83% did not have the second person, 17% did have second person,

and we're testing it against the distribution of 80/20.

The likelihood ratio, or the chi-square value, is 0.69

and the p-value is not significant.

Her use of the second person did not vary significantly

from the overall group average of 80/20.
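The same kind of test can be sketched in Python with only the standard library. This is a minimal sketch, not JMP's implementation. The counts of 80 tweets without and 16 tweets with the second person are an assumption, chosen to be consistent with the 96 tweets and roughly 17% mentioned above; the exact counts are not stated in the talk. The function name `g_test` is also made up for this example.

```python
import math

def g_test(observed, expected_probs):
    """Likelihood-ratio (G) goodness-of-fit test for two categories (df = 1)."""
    n = sum(observed)
    g = 2.0 * sum(
        o * math.log(o / (n * p))
        for o, p in zip(observed, expected_probs)
        if o > 0  # a zero count contributes nothing to the sum
    )
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # With 1 degree of freedom, the chi-square upper tail probability
    # has the closed form erfc(sqrt(g / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value

# Assumed counts for Hillary Clinton: 80 without, 16 with the second person.
g, p = g_test([80, 16], [0.8, 0.2])
print(f"G = {g:.3f}, p = {p:.3f}")
```

Under these assumed counts the statistic comes out just under 0.70, in line with the value of about 0.69 reported in the talk, and the p-value is well above 0.05.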

Let's try somebody else. We do the same thing.

This time we'll do it for Bernie Sanders.

He had 150 tweets that week.

You'll see that he used the second person only about 5% of the time.

We test that against the 80/20 distribution

of the overall group of politicians,

and we see that the chi-square is significant: it's 29.7 or 29.8,

and the p-value is less than 0.0001.

So yes, his distribution or his use of the second person significantly varied,

but in this case it was significantly less,

only 5% compared to the overall average of 20%.

It's significantly less for him.

Let's try someone else.

Marco Rubio was a presidential candidate in 2016,

and he used the second person about 24% of the time.

We test that again against the 80/20 distribution,

and we see that his chi-square value for this test is 0.88,

and it is not significantly different from the overall

distribution of 80% and 20%.

His use of the second person did not differ

from the overall average of all the candidates.

We'll look at another one. Here's Donald Trump.

He had 105 tweets during that week,

and you see that he used second person about 30% of the time,

which means about 30% of the time he was saying you or you all

or some form of that second person in his tweets.

We want to test that against a distribution of 80 and 20%.

The likelihood ratio is significant.

The chi-square value is 6.4, almost 6.5,

and the p-value, or the significance level, is 0.01. The test

suggests that he used the second person

more often than most of the candidates

who were running during the primary season in January 2016.

This is a way to test each one of those rows.

At the beginning of the 2016 primary season,

we see that Hillary Clinton and Marco Rubio used the second person

about as much as everybody else did in the electoral season.

Bernie Sanders used the second person significantly less,

and Donald Trump used the second person significantly more.
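The pattern of follow-up tests can be sketched as a small Python loop. The counts below are assumptions, picked only to be consistent with the tweet totals and approximate percentages mentioned in the talk (96, 150, and 105 tweets); the actual counts are not given. Marco Rubio is left out because his number of tweets is not stated. The function names and the 0.05 cutoff are choices made for this example.

```python
import math

def g_test(observed, expected_probs):
    """Likelihood-ratio (G) goodness-of-fit test for two categories (df = 1)."""
    n = sum(observed)
    g = 2.0 * sum(
        o * math.log(o / (n * p))
        for o, p in zip(observed, expected_probs)
        if o > 0
    )
    g = max(g, 0.0)  # guard against tiny negative rounding error
    return g, math.erfc(math.sqrt(g / 2.0))  # upper tail for df = 1

def compare_to_benchmark(absent, present, benchmark_present=0.2, alpha=0.05):
    """Label a candidate as using the second person more, less, or about the same."""
    _, p = g_test([absent, present], [1.0 - benchmark_present, benchmark_present])
    if p >= alpha:
        return "about the same"
    share = present / (absent + present)
    return "more" if share > benchmark_present else "less"

# Assumed (absent, present) counts; not taken from the data set itself.
assumed_counts = {
    "Hillary Clinton": (80, 16),
    "Bernie Sanders": (143, 7),
    "Donald Trump": (73, 32),
}

results = {name: compare_to_benchmark(*counts) for name, counts in assumed_counts.items()}
for name, verdict in results.items():
    print(f"{name}: {verdict}")
```

The two-tailed test only says whether a candidate differs from 80/20; the direction is read from the candidate's observed share, as in the talk.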

This is a way to do a follow-up test on a chi-square

when you need to test the distribution of individual rows.

You can do this using the By button.

You use this to subdivide.

The option to test the probability of a distribution allows us to set

a benchmark or comparison or reference group to something other than 50/50

or generally whatever we might be looking at.

In this case, we set it to 80/20.

This is a way to do follow-up tests

on a significant chi-square by testing the probability of a distribution.

I'm Robert McGee at the University of Mississippi,

and if you have any questions, there's my email address, feel free to reach out.

Thank you very much.

Published on ‎03-25-2024 04:53 PM by | Updated on ‎07-07-2025 12:12 PM



