
Monday, June 11, 2012

Looking for the Perfect Tweet


Somewhere out there, the Acme of Twitter-Friendly News Articles is lurking. It's probably a story about Apple, Google, or Facebook. It may also include a mention of Barack Obama or Adele. And it's just waiting for a social media jockey at a popular tech blog to calmly post it without much emotional baggage, right before a jillion retweets and link clicks ensue.

At least that's the idea behind an algorithm devised by Roja Bandari, Sitaram Asur, and Bernardo Huberman as part of a joint research project by UCLA and Hewlett-Packard's HP Labs. The three researchers claim their algorithm, which weighs factors like an article's subject matter and source to determine its likely popularity on Twitter, is 84 percent accurate in guessing which news tweets will hit and which won't. Crucially, and most impressively, it makes that guess before the news item itself is actually published.

That's a pretty good rate of success and one that will likely get news organizations' social media teams to take notice (the researchers also said they were able to determine whether a news article will receive zero retweets with 66 percent accuracy). Here's how the team describes what it did in a paper published on the HP Labs site:

The news data for our study was collected from Feedzilla—a news feed aggregator—and measurements of the spread are performed on Twitter, an immensely popular microblogging social network. Social popularity for the news articles are measured as the number of times a news URL is posted and shared on Twitter.

To generate features for the articles, we consider four different characteristics of a given article. Namely:

- The news source that generates and posts the article

- The category of news this article falls under

- The subjectivity of the language in the article

- Named entities mentioned in the article

We quantify each of these characteristics by a score making use of different scoring functions. We then use these scores to generate predictions of the spread of the news articles using regression and classification methods. Our experiments show that it is possible to estimate ranges of popularity with an overall accuracy of 84 percent considering only content features. Additionally, by comparing with an independent rating of news sources, we demonstrate that there exists a sharp contrast between traditionally popular news sources and the top news propagators on the social Web.
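
To make that pipeline a little more concrete, here is a minimal Python sketch of the general idea: score a feature by how its past articles performed, combine the scores into a predicted tweet count, and sort the result into a popularity range. This is not the researchers' code. The sample data, the weighted average (which stands in for their regression and classification methods), the 0.8 source weight, and the range cutoffs are all assumptions made up for illustration, and the sketch covers only two of the four features (source and category).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical history: (source, category, tweet_count) for past articles.
HISTORY = [
    ("tech_blog", "technology", 300),
    ("tech_blog", "technology", 180),
    ("tech_blog", "business", 120),
    ("local_paper", "technology", 12),
    ("local_paper", "sports", 4),
    ("local_paper", "business", 8),
]


def average_tweets_by(history, index):
    """Score each feature value by the mean tweet count of its past articles."""
    counts = defaultdict(list)
    for row in history:
        counts[row[index]].append(row[2])
    return {key: mean(values) for key, values in counts.items()}


SOURCE_SCORE = average_tweets_by(HISTORY, 0)
CATEGORY_SCORE = average_tweets_by(HISTORY, 1)
OVERALL_MEAN = mean(row[2] for row in HISTORY)


def predict_tweets(source, category, source_weight=0.8):
    """Blend the two scores; unseen values fall back to the overall mean.

    The weight is an assumed value that favors the source, echoing the
    finding that the source mattered most. It is not a fitted parameter.
    """
    source_score = SOURCE_SCORE.get(source, OVERALL_MEAN)
    category_score = CATEGORY_SCORE.get(category, OVERALL_MEAN)
    return source_weight * source_score + (1 - source_weight) * category_score


def popularity_range(tweets):
    """Map a predicted tweet count to a coarse range (illustrative cutoffs)."""
    if tweets < 20:
        return "low"
    if tweets < 100:
        return "medium"
    return "high"


if __name__ == "__main__":
    for source, category in [("tech_blog", "technology"), ("local_paper", "sports")]:
        estimate = predict_tweets(source, category)
        print(source, category, popularity_range(estimate))
```

Because the prediction uses only facts known in advance (who is publishing and under which category), it can be made before the article goes live, which is the property the researchers emphasize.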

More loosely organized outfits like Mashable, Mark Cuban's Blog Maverick, and Google's official blog are often more successful in generating Twitter traffic than the biggest and most respected news organizations, like The New York Times, The Wall Street Journal, or Reuters. The source of a news article was a very strong predictor of its social media strength, they said, with the most popular sources in their Twitter dataset producing the most tweets.

No big surprise there. Also not too shocking was the researchers' conclusion that technology news is the most social media friendly news category. That may come with a caveat, however—the team notes that articles about the biggest celebrities with the most rabid fans (think Justin Bieber) generate an outsized number of tweets. Tech as a general category may outperform entertainment, but it's difficult to think of a single tech topic that could ever challenge a Bieber or Lady Gaga on Twitter.

Indeed, the researchers found that the news category classification in their algorithm "did not perform well" as a predictor on its own. What's more, the language of an article and even the "named entities" (the team compiled a list of 40,000) weren't nearly as important as the source.

So that hypothetical article from a top tech blog about Apple, Google, or Facebook that mentions Barack Obama in measured tones may generate "the Platonic version of the news tweet," as The Atlantic's Megan Garber put it in her review of the research team's findings.

Or it may not. As accurate as the team's algorithm is, the unfortunate takeaway from all this might simply be, "If you want to have runaway social media success, be the Google Blog."

For more from Damon, follow him on Twitter @dpoeter.

For the top stories in tech, follow us on Twitter at @PCMag.