Standing at the Tip of the Iceberg

“Who Says What to Whom on Twitter” is an absolutely fantastic piece of scholarship. It received a fair amount of visibility in the tech press for one of its findings: that 0.05 percent of Twitter users account for 50 percent of “tweets consumed,” i.e., tweets that appear in people’s timelines. This was actually the least interesting finding of the study as far as I was concerned. Anyone who had read Clay Shirky’s essay on power laws would have anticipated that kind of disproportion.
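
To see why a power law makes that kind of disproportion unsurprising, here is a rough sketch in Python. It does not use the paper's data. It assumes a made-up population of one million users in which the user ranked r receives attention proportional to 1/r, and the `top_share` helper is my own name for the calculation.

```python
def top_share(weights, fraction):
    """Return the share of total weight held by the top `fraction` of items."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(weights, reverse=True)
    # Always count at least one item, even for tiny fractions.
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


# Assumption for illustration only: attention falls off as 1/rank (a Zipf-like law).
n_users = 1_000_000
attention = [1.0 / rank for rank in range(1, n_users + 1)]

share = top_share(attention, 0.0005)  # the top 0.05 percent of users
print(f"Top 0.05% of users hold {share:.0%} of all attention")
```

Under that assumed distribution, the top 0.05 percent of users end up with a little under half of all attention, which is in the same neighborhood as the headline number. The real distribution on Twitter need not follow this exact curve.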

What interested me about the study was first and foremost the methodology. The authors thought very carefully about how to gather data, how to classify the different users, and what the flaws of their approaches were. They then took measures to adjust for those flaws. For example, they used two separate methods for categorizing the most followed users in their sample. Each method had completely distinct biases, so the fact that they both yielded very similar results made a strong case that the authors were on the right track.

The study put the “elite” (highly followed) Twitter accounts into four categories: celebrities, media, organizations, and bloggers. Interestingly, they found that members of each category mostly followed and retweeted members of their own category. I immediately began to think of the economics blogosphere, which involves a handful of very widely read bloggers who all read and engage with one another. As Paul Krugman recently put it:

Twenty years ago it was possible and even normal to get research into circulation and have everyone talking about it without having gone through the refereeing process – but you had to be part of a certain circle, and basically had to have graduated from a prestigious department, to be part of that game. Now you can break in from anywhere; although there’s still at any given time a sort of magic circle that’s hard to get into, it’s less formal and less defined by where you sit or where you went to school.

Emphasis added by me.

Most bloggers these days also have Twitter accounts, and this holds for the economics blogosphere too. I would love to use the methodology adopted by the authors of “Who Says What to Whom on Twitter” to find and categorize the members of Krugman’s “magic circle” and study the dynamics of that circle more closely. The economics blogosphere is a microcommunity, a subset of the larger web and Twitter community. Within that microcommunity are people who follow some members of the circle more closely than others: those who are into Austrian economics, or monetary economics, or development economics. Observing how the online economics microcommunity coheres and fragments, in comparison to the larger view that the paper takes, would be very interesting indeed.

Part of what is so exciting about reading a paper like “Who Says What to Whom on Twitter” is both the sense of a larger media literature that spans decades, and the sheer newness of the field of social media research in particular. There is still so much unexplored territory. In attempting to answer the “what” of the titular question, the authors are forced to limit their focus to the New York Times’ Twitter account, since the content it tweets has already been categorized by the section of the newspaper it appears in. This is a novel approach, but one the authors admit is quite limited.

How could one possibly categorize content on a larger scale? The authors mention another study that uses Amazon’s Mechanical Turk to this end, but argue that this approach does not scale well. I think that the companies in a position to do this right are the big search engines. Google and Microsoft get millions or even billions of queries a day, and they record the specifics of what keywords people put in and what links they subsequently clicked. Moreover, they have methods for determining whether the websites people clicked were actually what they were looking for.

Someone working in the Google or Microsoft research departments could take the data from Twitter and categorize the content being linked to by looking at what terms people used when they clicked through to that content from a search engine (and were satisfied with the result).
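
To make the idea a bit more concrete, here is a minimal sketch in Python of what that categorization might look like. Everything in it is hypothetical: the click-log format of (query, URL, satisfied) records, the tiny `KEYWORD_CATEGORIES` lookup, and the `categorize_urls` function are my own inventions for illustration, not anything a search engine is known to expose.

```python
from collections import Counter, defaultdict

# Hypothetical keyword-to-category lookup; a real one would be far richer.
KEYWORD_CATEGORIES = {
    "election": "politics",
    "senate": "politics",
    "playoffs": "sports",
    "score": "sports",
    "recipe": "food",
}


def categorize_urls(click_log, keyword_categories=KEYWORD_CATEGORIES):
    """Assign each URL the category its satisfied searchers' terms point to most often.

    `click_log` is an iterable of (query, url, satisfied) tuples, an assumed format.
    """
    votes = defaultdict(Counter)
    for query, url, satisfied in click_log:
        if not satisfied:
            continue  # ignore clicks where the searcher did not find what they wanted
        for term in query.lower().split():
            category = keyword_categories.get(term)
            if category is not None:
                votes[url][category] += 1
    # URLs with no recognized terms are left out rather than guessed at.
    return {url: counts.most_common(1)[0][0] for url, counts in votes.items()}


# Made-up example records.
sample_log = [
    ("senate election results", "example.com/a", True),
    ("election night", "example.com/a", True),
    ("playoffs score", "example.com/a", False),
    ("playoffs score tonight", "example.com/b", True),
    ("weather", "example.com/c", True),
]
url_categories = categorize_urls(sample_log)
print(url_categories)
```

Any tweet linking to one of those URLs could then inherit the URL's category. The hard parts, of course, are the ones the sketch waves away: building the keyword lookup and deciding what counts as a satisfied click.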

The breadth of the knowledge represented by the existing literature on media generally, and social media and the web in particular, is far larger than any one individual could possibly cover comprehensively. And yet, we’ve only just begun to scratch the surface of what can be known in these areas.

I urge those of you reading the sea of blogs on social media out there to take the time, every so often, to read a paper like “Who Says What to Whom on Twitter.” As Tyler Cowen once said of the economics blogosphere, “We’re just a small number of apes sitting at computers, relative to the overall literature.” There are people out there, like Duncan Watts and his co-authors, who are thinking very hard about the challenges of trying to understand social media and investing time and resources into tackling them.