#Soda or #Pop? Regional Language Quirks Get Examined on Twitter

Brice Russ / Ohio State University

Brice Russ maps out how people refer to carbonated beverages. Yellow dots indicate “pop,” red dots “Coke,” and blue dots “soda.”

Sneakers or tennis shoes? Hoagie or hero? Dust bunny or house moss? These differences in regional speech are thriving in an unlikely place — Twitter.

A study presented by Brice Russ, a graduate student at Ohio State University, at the American Dialect Society’s annual meeting in January demonstrates how Twitter can be used as a valuable and abundant source for linguistic research. With more than 200 million posts each day, the site has allowed researchers to predict moods, study the Arab Spring and now map out regional dialects.

According to the New York Times, Russ waded through nearly 400,000 Twitter posts to analyze three different linguistic variables. He started by mapping the regional distribution of “Coke,” “pop” and “soda” based on 2,952 tweets from 1,118 identifiable locations. As has been documented in the past, “Coke” predominantly came from Southern tweets, “pop” from the Midwest and Pacific Northwest and “soda” from the Northeast and Southwest.
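The tallying step described above can be sketched in a few lines of Python. This is only an illustration, not Russ’ actual method: the sample tweets, the region labels and the function names `count_terms_by_region` and `dominant_term` are all hypothetical, and a real study would also need to filter out unrelated uses of the words (“pop music,” for instance) and resolve each tweet to a location.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample data: (region, tweet text) pairs invented for illustration.
SAMPLE_TWEETS = [
    ("South", "Grabbing a Coke before the game"),
    ("Midwest", "Who wants a pop?"),
    ("Northeast", "This soda is flat"),
    ("Midwest", "pop or soda, I don't care"),
    ("South", "nothing about drinks here"),
]

# The three variants discussed in the article, matched as whole words.
TERMS = ("coke", "pop", "soda")
TERM_PATTERN = re.compile(r"\b(" + "|".join(TERMS) + r")\b", re.IGNORECASE)


def count_terms_by_region(tweets):
    """Count how often each variant appears in tweets from each region."""
    counts = defaultdict(Counter)
    for region, text in tweets:
        for match in TERM_PATTERN.findall(text):
            counts[region][match.lower()] += 1
    return counts


def dominant_term(counts):
    """Return the most frequent variant for each region that has any matches."""
    return {region: c.most_common(1)[0][0] for region, c in counts.items()}


if __name__ == "__main__":
    counts = count_terms_by_region(SAMPLE_TWEETS)
    for region, term in dominant_term(counts).items():
        print(region, term, dict(counts[region]))
```

Plotting each region’s dominant term as a colored dot would give a map along the lines of the one shown with this story.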


Russ then analyzed the migration of “hella,” meaning “very” as in “hella cool,” the Times notes. The qualifier first sprouted in California but, to Russ’ surprise, has since made its way to the Midwest. He also took a look at a common Midwest and Pittsburgh-area syntactical construction, “needs X-ed” as in “the sink needs fixed.” This phrase seems to have moved toward the South since the mid-1990s.

Though Russ’ study is the latest attempt to harness Twitter users’ data to analyze regional language variation, it’s not the first. As reported by the Chronicle of Higher Education, researchers at Carnegie Mellon University developed a computer program that tracks the geographical location of Twitter users. According to the report, presented at last year’s American Dialect Society conference, users’ locations can be pinpointed to within 300 miles based on their language usage and patterns.

“I really think that the availability of data like Twitter is a real game-changer for how people study language,” Jacob Eisenstein, one of the researchers on the CMU team, told the BBC. “What surprised us was that, in addition to the sorts of words and names that we expected to see, there was a whole other class of words that seemed to have a very strong geographical affinity that we had never known about before.”

Additionally, the New York Times notes that Lexicalist has been analyzing and mapping word use on the Internet according to demographics since 2010.

Twitter may enable large-scale, worldwide studies and cut out the fieldwork typically demanded of linguistic scholars, but it can also limit studies to a generally younger, more urban base of users and most likely won’t replace face-to-face interviews.

“There’s a lot of richness you can get by doing a personal interview in terms of finding out someone’s life story, and exactly how their location affects the way that they speak,” David Crystal, language expert and honorary professor of Linguistics at Bangor University, told the BBC.
