Social curation of web content is a practice you engage in every day, sometimes even without being aware of it. Posting a link to your Twitter, Facebook and Tumblr (or Digg, Diigo or Delicious) is in essence you curating web content, i.e. you making a recommendation to your peers. You don’t even have to actively post links to be a curator: a click on the Facebook Like button below a link that has been shared by your friend is little more than “you recommending what your friend recommended”.
The holy grail of social curation is to find a way to present the most “relevant” stories to the user. Several companies are trying to solve this problem. In this post, I’ll try to describe and comment on the approach of one of these companies: Summify.
Summify is a “social news reader” to which you can connect your Twitter, Facebook and Google Reader accounts. Their approach is to analyze the links posted by the people you follow on Twitter and by your friends on Facebook (I’m not sure how Google Reader is integrated into it) and then make a daily selection of the five stories they believe are most interesting to you.
So how do the selection mechanisms work? Summify is at times quite transparent in telling the user how they select the five stories. The first story for March 29 is “Jacob Barnett, 12, with higher IQ than Einstein develops his own theory of relativity” from The Daily Mail. Here, they disclose that they take into account (1) the number of tweets and (2) the number of Facebook likes, shares and comments. They also show two Twitter users who have shared the story, @PeterDeYoe and @briankotts (see image below).
Why are they highlighting these two Twitter users? Are they “highly influential” (according to whatever ranking mechanism…)? Perhaps; they have 743 and 2454 followers respectively, numbers that could be considered high enough to qualify as influential, but given that 720 people have tweeted the story, there ought to be users with more followers than @PeterDeYoe and @briankotts. Are they Twitter users that @medeamalmo (the Twitter account linked to Summify) communicates with often? The answer is no. Were these two users the first to tweet the story? Answer: hard to tell without digging deeper into the data, but probably not. One can conclude that Summify gives no explanation of what @PeterDeYoe and @briankotts have to do with this story, other than that they have evidently shared it in the last 24 hours.
The selection of the second story is less transparent. Summify presents four randomly (?) chosen users who have “shared this story”. Hovering over their images reveals that they shared the story on Twitter (hovering over the image below does not work).
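To make the disclosed signals concrete, here is a minimal sketch of how tweet counts and Facebook likes, shares and comments could be combined into a daily top five. This is purely my own illustration, not Summify’s actual method: the `Story` class, the equal default weights and the function names are all assumptions, and the sketch deliberately ignores who shared the story, which is exactly the part Summify does not explain.

```python
from dataclasses import dataclass


@dataclass
class Story:
    """Hypothetical record of the four signals Summify says it counts."""
    title: str
    tweets: int
    fb_likes: int
    fb_shares: int
    fb_comments: int


def engagement_score(story, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four signals. Equal weights are an assumption."""
    w_tweet, w_like, w_share, w_comment = weights
    return (
        w_tweet * story.tweets
        + w_like * story.fb_likes
        + w_share * story.fb_shares
        + w_comment * story.fb_comments
    )


def daily_selection(stories, limit=5, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return the `limit` highest-scoring stories, best first."""
    ranked = sorted(
        stories,
        key=lambda s: engagement_score(s, weights),
        reverse=True,
    )
    return ranked[:limit]


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    candidates = [
        Story("Story A", tweets=720, fb_likes=100, fb_shares=50, fb_comments=30),
        Story("Story B", tweets=10, fb_likes=5, fb_shares=1, fb_comments=0),
        Story("Story C", tweets=300, fb_likes=900, fb_shares=10, fb_comments=5),
    ]
    for story in daily_selection(candidates):
        print(story.title, engagement_score(story))
```

Even in a toy like this, the choice of weights decides which story comes first, and that choice is invisible to the reader unless the service discloses it.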
The logic of the news media
So, is transparency important? For a company, being too transparent about how the selection works might amount to revealing industry secrets and inviting people to game the system. This is why you can’t easily find information on how Google’s PageRank algorithm(s) work. This is why Facebook won’t disclose how the Top News of your Facebook Newsfeed are selected. The consequence is that we, as users, face the same issue we did in the pre-web era, when our newspaper or daily television news show made a non-transparent selection of which stories were deemed newsworthy and which were not. The “logic of the news media” has been replaced by “the logic of the algorithms” or “the logic of engineers”. Who knows which is worse.
Google promises us that the results of a Google search are unbiased; sponsored search results are clearly indicated as links that some organization or company has purchased. On Facebook, people have been trying to figure out how the selection of Top News works by experimenting and observing the results over a period of weeks; see e.g. The Daily Beast’s Cracking the Facebook Code. Selection of stories on Digg is/was straightforward: users “digg” or “bury” the links that are posted, and the stories most “dugg” float to the top.
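The Digg-style mechanism is simple enough to sketch in a few lines. The following is a minimal illustration under my own assumptions: each “digg” counts as +1 and each “bury” as -1 (how Digg actually weighed buries is not something I know), and `rank_by_votes` is a hypothetical name.

```python
from collections import Counter


def rank_by_votes(votes):
    """Rank stories by net votes, highest first.

    `votes` is an iterable of (story, vote) pairs where vote is
    "digg" or "bury". Counting a bury as -1 is an assumption.
    """
    scores = Counter()
    for story, vote in votes:
        if vote == "digg":
            scores[story] += 1
        elif vote == "bury":
            scores[story] -= 1
        else:
            raise ValueError(f"unknown vote: {vote!r}")
    # sorted() is stable, so ties keep the order in which stories first appeared.
    return sorted(scores, key=lambda story: scores[story], reverse=True)


if __name__ == "__main__":
    sample_votes = [
        ("story-x", "digg"),
        ("story-y", "digg"),
        ("story-x", "digg"),
        ("story-y", "bury"),
        ("story-z", "digg"),
    ]
    print(rank_by_votes(sample_votes))
```

The point of the sketch is that every input to the ranking is a visible user action, which is what makes this kind of selection easy to understand (and, admittedly, easy to game).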
But what about companies like paper.li, Summify, My6sense or the soon-to-come curation functions on Flipboard that might, if they attract enough users, become agenda-setters in the selection of socially curated content? Will they adhere to Google’s “don’t be evil” motto and refrain from allowing sponsored content to enter your “Daily Summify” without disclosing that it is just that, sponsored content? Probably, they will. The United States has strong rules on sponsored content. For example, the FTC has updated its Guides Concerning the Use of Endorsements and Testimonials in Advertising with a requirement that “bloggers who make an endorsement must disclose the material connections they share with the seller of the product or service.” Fines for violating the new rule will run up to $11,000 per post (Source: Mashable). Whether this also applies to aggregators of curated content, I don’t know.
What is probably worse is that we are being bullied by “the logic of the engineers” and algorithms pretending to be smart. The companies mentioned above have an opportunity to be more transparent about how their algorithms select stories, and hopefully this can be done without opening too many loopholes for gaming the system. For a service like this to really take off and not remain a tool for early adopters and Twitter nerds, people will probably want to know what is happening behind the scenes: why this particular piece of content is deemed newsworthy by the engineers and algorithms. Opening the curtain might be the holy grail.
Update May 6th, 2011. Eli Pariser has this to say in his recently published TED Talk:
And the thing is that the algorithms don’t yet have the kind of embedded ethics that the editors did [the human gatekeepers]. So if algorithms are going to curate the world for us, if they’re going to decide what we get to see and what we don’t get to see, then we need to make sure that they’re not just keyed to relevance. We need to make sure that they also show us things that are uncomfortable or challenging or important […] other points of view.