Rock It Science: New Study Equates Musical Tastes to Personality Traits

"wbeem", Image by Scott Diussa

Many people have had the experience of hearing a new song for the first time and instantly being thunderstruck by it. They ask out loud or else think to themselves “Who and what was that I just heard?!” Next, they quickly race off to start Googling away in an attempt to identify the tune that just so totally captured their senses.

But what was it about the listeners’ musical tastes that led to this? Moreover, are their affinities for certain musical styles and artists a reliable indicator of their personalities, and vice versa?

Researchers at the University of Cambridge in the UK and Stanford University in the US have recently devised a new method for predicting musical tastes. Their study was published in PLOS One in a fascinating paper entitled Musical Preferences Are Linked to Cognitive Styles, by David M. Greenberg, Simon Baron-Cohen, David J. Stillwell, Michal Kosinski and Peter J. Rentfrow.

These findings were written up in an interesting article in the August 8, 2015 edition of The Wall Street Journal entitled If You’re Empathetic, You Probably Aren’t Into AC/DC by Daniel Akst. I will sum up, annotate and, well, orchestrate a few questions of my own.

Until the introduction of this new method, researchers traditionally tried to correlate musical tastes with the Big Five personality traits. These are:

  • Extroversion
  • Agreeableness
  • Conscientiousness
  • Neuroticism
  • Openness to new experiences

For instance, extroverts tend to prefer pop and funk. However, test subjects were asked to rate their preferences according to genre which, in turn, can have many gradations and variances. The article mentioned that rock music can include everyone from Elton John to AC/DC, the latter of whom appear in an accompanying photo to the story. (For that matter, who would have ever expected Springsteen to cover “Highway to Hell” during his tour of Australia in 2014?)

The five traits were also previously covered in a different context in the March 20, 2015 Subway Fold Post entitled Studies Link Social Media Data with Personality and Health Indicators.

Using this new methodology, the researchers turned to whether a person is an empathizer “who detects and responds to other people’s mental states” or a systemizer “who detects and responds to systems by analyzing their rules”. The article includes a link for readers to test themselves to assess their own musically influenced leanings towards one personality type or the other.

Leading the research was David M. Greenberg, himself a sax player, who is currently pursuing a doctorate in psychology at the University of Cambridge. The combination of his academic and musical interests is what led him to this line of research. He and his team were seeking more precise and measurable “sonic and psychological” factors in their efforts to develop a system to predict musical tastes.

The data was gathered from 4,000 volunteers who were tested for empathy and then were asked to rate 50 songs. The findings showed that:

  • Empathizers preferred R&B and soft rock (“mellow” music), folk and country (“unpretentious” music), Euro pop (“contemporary” music), but not heavy metal. Within genres, they preferred “gentler jazz”, as well as “sadder, low energy music”.
  • Systemizers preferred “more intense music” including “punk and heavy metal”.

In the future, this research might help music streaming services like Spotify to further improve their song recommendation engines. (See also the August 14, 2014 Subway Fold post entitled Spotify Enhances Playlist Recommendations Processing with “Deep Learning” Technology.)

Mr. Greenberg is also interested in researching the converse of his findings, namely whether particular types of music can raise empathy or systemizing levels.

My questions are as follows:

  • Might this research also be helpful to a startup like Reify which is developing augmented reality apps for music as covered in the July 21, 2015 Subway Fold post entitled Prints Charming: A New App Combines Music With 3D Printing?
  • Is this research applicable to the composition of music scores for movies, plays and TV shows as storytellers and producers seek to heighten the emotional impacts of certain scenes? (The December 19, 2014 update to the Subway Fold post entitled Applying MRI Technology to Determine the Effects of Movies and Music on Our Brains discussed Flicker: Your Brain on Movies, a book by Dr. Jeffrey Zacks that, among many other things, covered this type of effect.)
  • Is this research applicable to marketers in developing their ad campaigns aimed at specific demographic groups?

Studies Link Social Media Data with Personality and Health Indicators

[This post was originally uploaded on January 27, 2015. It has been updated below with new information on March 20, 2015 and February 26, 2018.]

Reports of two new studies were issued recently: the first describes meaningful connections between Facebook Likes and personality types, and the second parses the language in Tweets to forecast the likelihood of heart disease. This presents us with an opportunity to examine two highly similar human health indicators that were identified by sophisticated analytics applied to massive troves of data generated by two of the world’s leading social media platforms. Where is all of this leading and what issues arise as a result? I will first summarize some parts of these two reports, add some links and annotations, and then pose some questions. I also highly recommend clicking through for a full read of both pieces.

The first report was posted on NewScientist.com on January 12, 2015 with the concise title of What You ‘Like’ on Facebook Gives Away Your Personality by Hal Hodson. According to this article, researchers working at Stanford University and Cambridge University have developed an algorithm that, based completely upon what people “Like” on Facebook, can determine a user’s personality. The data for this was gathered in a survey of 86,000 people who filled out personality questionnaires that were then matched against their activity on Facebook. Indeed, the results showed that this new method was more accurate than the determinations of the test subjects’ family and friends.

These characteristics are called the Big Five personality traits and include (as explored in detail in the preceding Wikipedia link):

  • Openness to experience
  • Conscientiousness
  • Extraversion
  • Agreeableness
  • Neuroticism

The article includes comments from David Funder of the University of California, Riverside, a researcher on personality, who says that while this study is “impressive”, it still does not provide a truly deep understanding of an individual’s personality. Funder’s work looks at 100 dimensions, a far larger number than the five used by the researchers in the Facebook study.

Nonetheless, two of the researchers on this new study, Youyou Wu of Cambridge and Michal Kosinski of Stanford, believe their work is applicable on a global scale and could be applied in several areas. For instance, they foresee that their new Like algorithm could be used in hiring operations to search large data files of candidates and identify those who might be most suitable for a particular job. Other possibilities include health and education. Kosinski also acknowledges that this approach would further require appropriate policy and technology considerations in order to address issues such as its potential invasiveness.

(In a similar application of Facebook Likes and other data from social media sites, universities in the US are now using such information and analytics to locate and pitch to alumni as potential donors, as reported in a most interesting article in the January 25, 2015 edition of The New York Times entitled Your College May Be Banking on Your Facebook Likes, by Natasha Singer. Among other things, this story reports on the work and methods of two startups in this area called EverTrue and Graduway.)

The second report linking social media data to a health indicator was Scientists Say Tweets Predict Heart Disease and Community Health by Derrick Harris, posted on Gigaom.com on January 22, 2015. In a study entitled Psychological Language on Twitter Predicts County-Level Heart Disease Mortality, researchers at the University of Pennsylvania, working as part of their Well-Being Project, concluded that the vocabulary used by individuals in their Tweets can predict “the rate of heart disease deaths in the counties where they live”. Specifically, Tweets concerning more upbeat topics and expressed in more positive terms correlated with lower mortality rates when compared to rates reported by the Centers for Disease Control and Prevention (CDC). Conversely, mortality rates were higher in areas “with angry language about negative topics”.

The accompanying side-by-side graphics of the Twitter data and the CDC data, covering the upper right quarter of the US and its constituent 1,300 counties, dramatically illustrate these findings. The pool of data was drawn from 148 million Tweets with geotags.

These results also provide further support for the accuracy and predictive validity of data from Twitter, notwithstanding any “inherent geographical biases”, and exceeding that of more “traditional polls or surveys”. Indeed, language in Tweets turns out to have a comparatively higher predictive value than other economic or health-related data. The researchers further believe that their findings might be more helpful when applied to “community-scale policies or interventions” rather than to assisting specific people.
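For readers who like to see the arithmetic, the county-level relationship described above is, at bottom, a correlation between two columns of numbers: a language score per county and a death rate per county. Here is a minimal sketch in Python of how such a correlation could be computed. To be clear, this is not the researchers' method or data: the function name `pearson`, the variable names and all of the figures are hypothetical, invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL figures for five imaginary counties (not from the study):
# share of tweets scored as positive, and heart disease deaths per 100,000.
positive_share = [0.62, 0.55, 0.48, 0.40, 0.33]
deaths_per_100k = [150, 165, 180, 200, 215]

r = pearson(positive_share, deaths_per_100k)
print(f"correlation: {r:.3f}")  # negative: more positive language, fewer deaths
```

A strongly negative value in a toy example like this one says only that the two columns move in opposite directions. It says nothing about why, which is exactly the correlation versus causation question raised below.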

My follow-up questions include:

  • Would mapping a statistically significant number of Twitter networks in counties with higher and/or lower mortality rates, a process described in the February 5, 2015 Subway Fold post entitled Visualization, Interpretation and Inspiration from Mapping Twitter Networks, provide additional insights that would be helpful to medical professionals and local policy planners? For example, are many of the negative Twitter posters in each other’s networks such that they become self-reinforcing? Are there recognizable network effects occurring that can somehow be corrected with regards to the degree of negativity and, in turn, public health? Would this pose any legal, policy or privacy issues?
  • For both of these articles, do these types of findings require more rigorous and wider-scale mathematical and scientific analysis before applying them to such critically important mental and physical health matters? If so, should such testing be done by public or private institutions, universities and/or government agencies?
  • As first expressed in this November 22, 2014 Subway Fold post entitled Minting New Big Data Types and Analytics for Investors, how are the differences in correlation and causation being factored into these studies? Given the skepticism expressed above about Facebook Likes being so indicative about personality, are there other effects and influences that need to be identified and filtered out of these types of conclusions?
  • If the usage and analysis of social media data continues to grow in areas, well, like employment, education and health, what protections, if any, should people be given, by law and/or the social media companies, to protect themselves or opt out in advance of any potentially negative consequences?

March 20, 2015 Update:

Providing some very worthwhile additional insight and analysis of the University of Pennsylvania study covered in the initial post above, Maria Konnikova has written a very engaging article entitled What Your Tweets Say About You that was posted on The New Yorker website on March 17, 2015. I highly recommend clicking through and reading the entire text. I will sum up just some of the key points, add some links and pose several additional questions.

The research study (linked to above) was conducted by a team led by psychologist and Professor Johannes Eichstaedt. Their main conclusion was that the collection and subsequent linguistic analysis of tweets proved to be validly predictive of locations with higher concentrations of fatalities from cardiovascular disease. The inverse was also true: geographic clusters of tweets with more positive content had lower death rates from the same cause. It was not that the population tweeting had heart disease, but rather that there is a discernible correlation between angrier content and a higher incidence of heart disease within an area.

This “correlation is especially strange” due to the fact that Twitter users are generally younger than individuals who perish from heart ailments. The article cites a January 9, 2015 study from the Pew Research Center entitled Demographics of Key Social Networking Platforms (also, imho, well worth a click-through and full reading), which, among other things, tabulates the ages of the users of all of the leading social media platforms. Just 22% of US Twitter users are more than 50 years old. However, the relative risk of heart disease does not begin to rise until decades later.

How, then, to analytically connect younger people in a particular area who are posting negative tweets with their older neighbors who face higher chances of developing heart disease? The researchers theorize that the tweets “may be a window into the aggregated and powerful effects of the community context”. The overall health of people living in areas that are “poorer, more fragmented” is not as good as that of people residing in “richer, integrated ones”. As a result, the angrier tweets of someone in their twenties are likely reflective of an area with higher life stressors that, in turn, later result in more heart-related deaths.

Nonetheless, another renowned expert in this field of linguistic analysis of text, James Pennebaker, recommended caution in drawing any connection based upon this data. He urges further study of the data and posing additional questions about causation. Currently, in his own work, he is examining Twitter data to see how family and religious factors evolve.

There is also value in studying social media content of individuals. For example, Microsoft has previously studied 70,000 tweets of people with depression and then used this data to construct a “predictive index” to identify “other users who were likely depressed based on their social-media posts”.

Eichstaedt’s team is continuing their work by looking at Twitter data for individuals and communities over time periods, rather than a “snapshot” data set. They are also adding Facebook profiles to their work.

Finally, Pennebaker believes that social media may also generate positive effects on mental health based on his previous studies on the benefits of keeping a personal journal. This may be so despite the private nature of a journal and the very public access of social media and its interactivity.

My additional questions are as follows:

  • Will additional discrete language patterns be discovered and validated that will indicate concentrations of other medical conditions within communities? Are we only at the beginning of using textual analysis of tweets as a metric of the states of local health?
  • Given that there is a lag time of years between negative tweets and the appearance of heart disease, should interventions be undertaken within a community at higher risk and, if so, by whom and at what cost?
  • Are other negative online behaviors such as cyberbullying indicative of some form of identifiable illness that can be treated on a community-wide basis, or must this be dealt with on an individual, case-by-case basis?

February 26, 2018 Update: Using social media activity data to diagnose and treat possible health conditions has advanced in a number of new systems and studies as reported in today’s New York Times in an article entitled How Companies Scour Our Digital Lives for Clues to Our Health, by Natasha Singer, dated February 26, 2018.