Our paper “Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter” appears in PLoS ONE this week. Their blog encourages you to tweet for the sake of science!
Among other findings, in this paper we demonstrate that human ratings of the happiness of an individual word correlate very strongly with the average happiness of the words that co-occur with it. This implies that tweets containing particular keywords can be used as an unsolicited public opinion poll.
For example, tweets containing “Tiger Woods” became decidedly less positive after his Thanksgiving disaster in 2009: the words ‘accident’, ‘crash’, ‘scandal’, and ‘cheating’ became more abundant, while the word ‘love’ appeared less often.
Happiness is measured relative to the ambient background of all tweets.
Sad words are blue, happy words are yellow. Up (down) arrow indicates that the word appeared more (less) frequently in tweets containing "Tiger Woods".
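To make the keyword measurement concrete, here is a minimal Python sketch of happiness relative to the ambient background. It is an illustration under stated assumptions, not the procedure from the paper: the function names are hypothetical, tweets are split on whitespace, the keyword's own words are left out of its average, and `scores` is any dictionary mapping words to average happiness ratings.

```python
from collections import Counter


def average_happiness(tweets, scores, exclude=()):
    """Frequency-weighted mean happiness of the scored words in `tweets`.

    Returns None if no scored word is found.
    """
    counts = Counter()
    for tweet in tweets:
        # Assumption: naive lowercase whitespace tokenization.
        for word in tweet.lower().split():
            if word in scores and word not in exclude:
                counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[w] * n for w, n in counts.items()) / total


def ambient_happiness(tweets, keyword, scores):
    """Happiness of tweets containing `keyword`, minus that of all tweets."""
    keyword = keyword.lower()
    # Assumption: the keyword's own words do not count toward its average.
    keyword_tokens = set(keyword.split())
    matching = [t for t in tweets if keyword in t.lower()]
    keyword_avg = average_happiness(matching, scores, exclude=keyword_tokens)
    background_avg = average_happiness(tweets, scores)
    if keyword_avg is None or background_avg is None:
        return None
    return keyword_avg - background_avg
```

A negative result means that the words appearing alongside the keyword are, on average, less happy than the words in tweets overall.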
Generally, tweets containing personal pronouns tell a positive prosocial story with ‘our’ and ‘you’ outranking ‘I’ and ‘me’ in happiness. The least happy pronoun on our list is the easily demonized ‘they’.
Emoticons in increasing order of happiness are ‘:(’, ‘:-(’, ‘;-)’, ‘;)’, ‘:-)’, and ‘:)’. In terms of increasing information content (diversity of words co-occurring with each emoticon), the order is ‘:(’, ‘:-(’, ‘:)’, ‘:-)’, ‘;)’, and ‘;-)’. We see that happy emoticons co-occur with words of higher levels of both happiness and information, but the ordering changes in a way that appears to reflect a richness associated with cheekiness and mischief: the two emoticons involving semicolon winks are third and fourth in terms of happiness but first and second for information.
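One simple way to put a number on the diversity of words co-occurring with an emoticon is the Shannon entropy of their frequency distribution. The sketch below uses that as a stand-in and is not necessarily the information measure used in the paper; the function name `word_entropy` and the whitespace tokenization are assumptions made for illustration.

```python
import math
from collections import Counter


def word_entropy(tweets, token):
    """Shannon entropy (in bits) of the words co-occurring with `token`."""
    token = token.lower()
    counts = Counter()
    for tweet in tweets:
        # Assumption: naive lowercase whitespace tokenization.
        words = tweet.lower().split()
        if token in words:
            # Count every word in the tweet except the token itself.
            counts.update(w for w in words if w != token)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A higher value means the token appears alongside a more varied vocabulary; a token that always appears with the same single word scores zero.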
A list of the happiness ratings of tweets containing some interesting keywords can be seen here.
And not surprisingly, the happiness of all tweets appearing on a given day of the week correlates well with the happiness ratings humans give each day.
Happiness of tweets appearing on a given day
Human ratings of the happiness of each day of the week
You can download the language assessment by Mechanical Turk (labMT 1.0) word list here. It is a text file containing the set of 10,222 most frequently occurring words in the New York Times, Google Books, music lyrics, and tweets, as well as their average happiness evaluations according to users on Mechanical Turk. See the paper for details.
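If you would like to read the word list into Python, something along the following lines may be a starting point. The layout assumed here (tab-separated columns, one header line, the word in the first column and its average happiness in the third) is a guess for illustration only, so check the downloaded file and adjust the hypothetical `word_col`, `score_col`, and `skip_lines` parameters to match.

```python
def load_happiness_scores(text, word_col=0, score_col=2, skip_lines=1):
    """Parse tab-separated text into a {word: average happiness} dictionary.

    The default column positions and header length are assumptions,
    not a description of the actual file.
    """
    scores = {}
    for line in text.splitlines()[skip_lines:]:
        fields = line.split("\t")
        if len(fields) <= max(word_col, score_col):
            continue  # skip blank or short lines
        try:
            scores[fields[word_col]] = float(fields[score_col])
        except ValueError:
            continue  # skip rows whose score is not a number
    return scores
```

Pass in the contents of the file as a string, for example from `open(path, encoding="utf-8").read()`, where `path` is wherever you saved the download.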
Much more to come regarding sociotechnical phenomena…