Imagine yourself in a world where nanotechnology has made scarcity, and the traditional form of money that goes with it, a thing of the past. In this world, the only currency is the goodwill that people give electronically to one another, and everyone’s resulting overall reputation score is accessible by anyone in real time. This reputation is called Whuffie; Cory Doctorow coined the term and imagined the world in his sci-fi novel, Down and Out in the Magic Kingdom.
Fast rewind to the present. We live in a world where people increasingly publish their lives digitally, i.e. they are “life streaming”: they publish pictures, blog posts, tweets, videos, wikis, etc. Other people subscribe to these life streams (RSS/friendfeed), give attention to the ones they find most relevant, and sometimes comment positively or negatively on these life stream items. These comments are themselves life stream items, subject to views and positive/negative comments from others.
One thing is missing to get us closer to Cory’s vision: real-time computation of anyone’s Whuffie, the Web 2.0 equivalent of your FICO score. How do we compute it?
I have found only one blog post so far on the problem of the so-called Whuffie algorithm, but I was not convinced by the arbitrary number of points won or lost for specific actions, nor by how difficult it would be to track some of these actions:
Trash talk somebody: -1000
For every conference you attend: +200 (Plus bonus +5 for each #tweet)
I know that Jeff Ward wrote that he was just posting for fun on this one, but since there seemed to be interest in the comments for an actual implementation, I decided tonight on BART to take a stab at what such an algorithm would look like.
Here are the basic principles:
- The algorithm should take into account how many positive/negative comments or citations your life stream items have got from other people, weighted by the Whuffie score of each of these people.
- The weighting is important because it makes it possible to remove the arbitrary point amounts completely: for instance, instead of “For every conference you speak at: +10,000”, speaking at a conference would essentially be equivalent to posting a summary of your speaking engagement and having the conference organizers or the conference itself comment on it or cite you on their Web site, with the Whuffie value of the comment being a function of the Whuffie of the conference or its organizers.
- The positive/negative nature of the comment would be determined via semantic analysis or microformats votelinks or voting nanoformats (vote:for:this article, +1/-1).
- If the positive/negative nature of a comment cannot be determined, a smaller positive number of Whuffie points would be attributed, weighted by the Whuffie of the entity issuing the comment.
- If no comments are available, views should be used (the number of times a video was viewed), again weighted by the Whuffie of the viewers if possible. Views should contribute fewer Whuffie points than comments.
- In all cases, each published item should earn a number of points multiplied by the number of followers the person/entity has on the site where the life stream item is posted (# of subscribers to the RSS feed, # of Twitter followers, # of Flickr contacts, etc.).
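As a rough illustration of the explicit-vote case only, here is a minimal Python sketch that reads a comment's polarity from a standalone +1 or -1 token. The function name and the exact token rule are assumptions made for illustration; the sketch does not parse votelinks markup and does no semantic analysis.

```python
import re

# Hypothetical rule: a vote is a "+1" or "-1" token delimited by whitespace
# (or the start/end of the comment). This is an assumption, not a standard.
VOTE_TOKEN = re.compile(r"(?<!\S)([+-])1(?!\S)")


def comment_polarity(text: str) -> str:
    """Return 'positive', 'negative', or 'unknown' for a comment."""
    signs = {match.group(1) for match in VOTE_TOKEN.finditer(text)}
    if signs == {"+"}:
        return "positive"
    if signs == {"-"}:
        return "negative"
    # No vote token, or conflicting tokens: polarity cannot be determined.
    return "unknown"
```

A comment with no vote token, or with both tokens, comes back as 'unknown', which corresponds to the case where a smaller positive amount is attributed.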
I don’t really have a precise idea of what these point amounts should be. Let’s say +10 for a positive comment, -10 for a negative comment, +5 for a comment whose positive/negative nature cannot be determined, +3 for a view, and +1 for a published item.
Let’s also say that these points would be weighted by 1/100 of the Whuffie of the person commenting on, viewing or following the publisher/life stream item. So, if my Whuffie is 1,000,000 and I view someone’s image but do not comment on it, the weight is 10,000, which at +3 per view gives 30,000 Whuffie points to the person who posted the image.
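One way to read these numbers is: points for an action = base points × (actor's Whuffie / 100). Here is a minimal Python sketch under that reading. The names (Event, item_points, follower_whuffies) are hypothetical, and treating the publishing bonus as +1 per follower, weighted by each follower's Whuffie, is my assumption about how the follower rule and the weighting combine.

```python
from dataclasses import dataclass

# Base points suggested in the post.
BASE_POINTS = {
    "positive_comment": 10,
    "negative_comment": -10,
    "comment": 5,  # polarity could not be determined
    "view": 3,
}
PUBLISH_POINTS = 1    # per follower, for each published item
WEIGHT_DIVISOR = 100  # weight = actor's Whuffie / 100


@dataclass
class Event:
    kind: str             # one of the keys of BASE_POINTS
    actor_whuffie: float  # Whuffie of the person commenting or viewing


def weight(whuffie: float) -> float:
    """Weight contributed by someone with the given Whuffie."""
    return whuffie / WEIGHT_DIVISOR


def item_points(events, follower_whuffies=()):
    """Whuffie points earned by one published life stream item."""
    reactions = sum(BASE_POINTS[e.kind] * weight(e.actor_whuffie) for e in events)
    # Assumed reading: +1 per follower, weighted by that follower's Whuffie.
    publishing = sum(PUBLISH_POINTS * weight(w) for w in follower_whuffies)
    return reactions + publishing
```

Under this reading, a single view from someone whose Whuffie is 1,000,000 is worth 3 × 10,000 = 30,000 points, i.e. item_points([Event("view", 1_000_000)]). Note that the scores feed into each other, so a real system would have to recompute them repeatedly or settle on some fixed point; the sketch takes the actors' Whuffie as given.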
Of course this algorithm reduces the number of arbitrary constants to a few, but these are still arbitrary. So, the next question that came to my mind is whether there is a set of constant values that would be better than another, better for instance at achieving the goal of a Whuffie system.
What is that goal? Do we want a bell-curve distribution of Whuffie scores, a very spiky curve, or a very flat one? Do we want Whuffie to last indefinitely, or to self-destruct over time (with the objective of preventing social capital from becoming too concentrated among too few people)? I think this is where I should have started, but that will hopefully be the subject of another post. In the meantime, I hope to get good ideas and suggestions from you.
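To make the "self-destruct over time" option a bit more concrete, here is a minimal Python sketch that assumes exponential decay with a half-life. This is only one possible choice: the function names and the 30-day default are illustrative assumptions, not a recommendation.

```python
def decayed_points(points: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Value today of points earned age_days ago, halving every half_life_days."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return points * 0.5 ** (age_days / half_life_days)


def current_whuffie(contributions, half_life_days: float = 30.0) -> float:
    """Sum decayed contributions given as (points, age_days) pairs."""
    return sum(decayed_points(p, age, half_life_days) for p, age in contributions)
```

With a 30-day half-life, 100 points earned 30 days ago count for 50 today and 25 after 60 days. A very long half-life approximates Whuffie that lasts indefinitely; a short one forces people to keep earning it.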
Another interesting problem is how we fight spam and reputation hacking in such a system. I think one partial answer would be to allow Internet hosts to have their own Whuffie, and to use that as an additional weighting factor. Ideas here are welcome as well.
Very interesting post. Hope to see more posts on this topic.
Interesting assumptions on the algorithm. I wonder if you could extract an actual dataset from twitter, friendfeed and a few other websites (or even the websites that track the top users from one of these sites like twitterholic) and start playing around with it.
In any case, pagerank seems like a good abstraction of whuffie.
You’ve raised some interesting points there. Calculating whuffie is a very challenging idea; I think it requires a group of statisticians, social media experts and programmers to experiment with actual data in order to get anywhere near a workable system. Until then, all of this blue-sky thinking is fun.