Last week, news broke of a paper published in the Psychonomic Bulletin and Review by Kyle Jasmin and Daniel Casasanto claiming to observe a positive relationship between the “right-handedness” of a word and its emotional valence. This is being called the ‘QWERTY effect’. (You may recall that ‘valence’ is psych-speak for the ‘happiness’ associated with a word. What I call right-handedness they call the right side advantage of a word, \text{RSA} = (\text{\# of right side letters}) - (\text{\# of left side letters}), counted as the word is typed on the ubiquitous QWERTY keyboard.) You can read the original paper here, and there’s a Wired article that explains their conclusions.
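To make the RSA definition concrete, here is a minimal Python sketch. It assumes the usual touch-typing split of the QWERTY letter keys (left hand: qwertasdfgzxcvb; right hand: yuiophjklnm), and the function name `rsa` is just an illustrative choice. It is not taken from the paper or from the R script linked further down.

```python
# Assumed left/right split of the QWERTY letter keys (standard touch typing).
LEFT_KEYS = set("qwertasdfgzxcvb")
RIGHT_KEYS = set("yuiophjklnm")


def rsa(word):
    """Right side advantage: (# right-side letters) - (# left-side letters).

    Characters that are not plain a-z letters are ignored.
    """
    letters = word.lower()
    right = sum(1 for c in letters if c in RIGHT_KEYS)
    left = sum(1 for c in letters if c in LEFT_KEYS)
    return right - left


for w in ["look", "sad", "happy"]:
    print(w, rsa(w))
```

Under that split, "look" is typed entirely with the right hand and "sad" entirely with the left, so they sit at opposite ends of the RSA range for short words.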

Particularly interesting for the group here in Vermont was Jasmin and Casasanto’s use of the Affective Norms for English Words (ANEW, from Bradley and Lang (1999)) dataset, along with comparable data for Spanish and Dutch, in their analysis. The hedonometric work we’ve done on blogs, music lyrics, Twitter, etc. was initially based on the happiness scores from the ANEW study. The 1034 ANEW words were handpicked to represent the emotional spectrum, and as such don’t represent a uniform selection of words found in English-language texts. We merged the 5,000 most common words from 4 corpora (Twitter, Google Books, the New York Times, and music lyrics) and had Mechanical Turk users evaluate their valence in the same way as was done for ANEW, producing a list of ~10,000 words and their associated happiness scores. We’re calling this dataset LabMT-1.0, for Language assessment by Mechanical Turk. Since LabMT words were picked by frequency of usage, they provide much better coverage (i.e. the percent of words identified in a text) than ANEW.

When Jasmin and Casasanto’s paper appeared and achieved the impressive press coverage that it did, it also attracted the scrutiny of other language researchers who weren’t so sure of the significance of the QWERTY effect. A public debate has taken place between Mark Liberman of the Language Log blog and the authors of the study. See post1, post2, the response from J&C, and the response back. After being informed by (our) Peter Dodds of the LabMT data, Liberman wrote the second post, in which he calculated the RSA of our LabMT words and again found little or no QWERTY effect.

As we’ve explored in our hedonometrics papers, the hedonometer can be thought of as a tunable instrument when you remove neutral-valence stop words, effectively increasing the sensitivity of happiness measurements for texts. I wanted to see if tuning \Delta h_{\text{avg}} changed anything. In the process, I also repeated Liberman’s analysis of the LabMT data and am making available the R scripts and data that went into that.

analyze_rsa_labmt.R – script for the analysis and plots
labMT.rsa.txt – Liberman’s computation of RSA for the subset of LabMT-1.0 words containing only alphabetic characters

We haven’t seen any more evidence than Liberman did when looking simply at the relationship between RSA and h_{\text{avg}}. If the QWERTY effect is real, then it is exceedingly small, and in the LabMT data it is indistinguishable from zero. It’s useful to look at the raw data, binned in both variables.

Raw data binned (RSA spacing of 1, h_{\text{avg}} spacing of 0.1) and plotted.

No correlation is visually apparent.

Now, if you take the average happiness of words for each RSA value, you can do a linear regression on that data, weighting each point by the number of words for that RSA value.

Data binned by RSA with the line indicating the linear regression weighted by the number of words for that RSA. Note that this is the same as a linear fit of the unbinned data, but the resulting plot is less cluttered and easier to read.
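The equivalence between the count-weighted fit on the binned means and the plain fit on the unbinned words can be checked with a small sketch. The version below is in Python, not the R of the linked script, and the RSA and happiness values in it are made-up toy numbers, not LabMT data. The helper names `weighted_linear_fit` and `bin_by_rsa` are hypothetical.

```python
from collections import defaultdict


def weighted_linear_fit(x, y, w):
    """Weighted least-squares line y = slope * x + intercept."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bin_by_rsa(rsa_values, h_values):
    """Group happiness scores by RSA; return RSA values, mean scores, counts."""
    groups = defaultdict(list)
    for r, h in zip(rsa_values, h_values):
        groups[r].append(h)
    keys = sorted(groups)
    means = [sum(groups[k]) / len(groups[k]) for k in keys]
    counts = [len(groups[k]) for k in keys]
    return keys, means, counts


# Toy data (invented for illustration only).
toy_rsa = [-2, -2, -1, 0, 0, 0, 1, 2, 2]
toy_h = [5.1, 4.7, 6.0, 5.5, 5.2, 4.9, 5.8, 4.4, 5.6]

keys, means, counts = bin_by_rsa(toy_rsa, toy_h)
binned_fit = weighted_linear_fit(keys, means, counts)           # weights = words per bin
raw_fit = weighted_linear_fit(toy_rsa, toy_h, [1] * len(toy_rsa))  # every word weighted 1

print("binned:", binned_fit)
print("raw:   ", raw_fit)
```

The two fits agree in slope and intercept because each bin mean, multiplied by its word count, contributes the same sums as the individual words it summarizes. Only the fitted coefficients are compared here; the sketch does not compute standard errors or p-values.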

The trend actually runs in the negative direction, but with a p-value of 0.74 the slope is statistically indistinguishable from zero: these data provide no evidence of an effect in either direction. Jasmin and Casasanto controlled for more variables in a different dataset, and independent evaluation of the significance of the correlations they observed, controlling for these other attributes, would be possible if all the original data were released. Sure, the data sources are listed, but it would be a significant effort to recreate the entire set. I’d also be curious to see if similar correlations could be observed in the other affective variables measured in ANEW (arousal and dominance).

Final note: Changing \Delta h_{\text{avg}}, our tuning knob, does change the magnitude of the correlation. (Imagine removing a horizontal band from the binned plot above; this changes the correlation.) However, it is still impossible to conclude that the effect is significant. Also, analyzing positive and negative words separately shows opposite trends for \Delta h_{\text{avg}} = 1. The code for this is all included in the script above.
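As a rough sketch of what turning the knob means, the Python snippet below drops words whose score lies within \Delta h_{\text{avg}} of neutral and recomputes the RSA–happiness correlation on what remains. It assumes a neutral score of 5 on the 1 to 9 valence scale and the exclusion band 5 - \Delta h_{\text{avg}} < h_{\text{avg}} < 5 + \Delta h_{\text{avg}}. The rows are invented placeholders, not LabMT entries, and `tune` and `pearson` are hypothetical helper names, not functions from the R script.

```python
import math

NEUTRAL_H = 5.0  # assumed midpoint of the 1-9 happiness scale


def tune(rows, delta_h):
    """Keep rows (word, rsa, h_avg) whose h_avg is at least delta_h from neutral."""
    return [row for row in rows if abs(row[2] - NEUTRAL_H) >= delta_h]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Placeholder rows: (word, RSA, h_avg). Values are invented for illustration.
toy_rows = [
    ("w1", 3, 7.8),
    ("w2", -2, 5.3),
    ("w3", 0, 4.6),
    ("w4", 1, 2.9),
    ("w5", -1, 6.4),
    ("w6", 2, 5.0),
]

for delta_h in (0.0, 1.0):
    kept = tune(toy_rows, delta_h)
    r = pearson([row[1] for row in kept], [row[2] for row in kept])
    print(delta_h, len(kept), r)
```

Splitting the kept rows into those above and below neutral before calling `pearson` would give the separate positive-word and negative-word trends mentioned above.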