Thoughts on the future of the real-time web
[Aug. 12th, 2010|03:03 am]
I've been doing a lot of thinking over the last few months about the real-time web as part of a project I'm working on. Here are a few things about it that I think are underappreciated:
0. There's an item 0 that I wanted to get out of the way first just so we're on the same page. Real-time doesn't automatically mean "Twitter."
1. Real-time content is going to look a lot prettier once good filtering tools are in place. Publish-then-filter has become the standard paradigm for new web communication tools, and the real-time web is no exception. Just as the web was full of random crap before we had Google to filter it, real-time content is full of spam these days, but the filtering tools are soon going to catch up.
2. Related to the previous point: the PageRank model doesn't work for real-time content for an obvious reason -- there isn't enough time to wait and see what other people say about a piece of content. Instead, what is needed is a measure of the reputation of sources, so that when a source emits some content you immediately have a way to rank it. A new class of algorithms is going to become as important as PageRank, or more so, and user identities and reputation are going to be key components of it.
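To make the idea of ranking by source reputation a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `ReputationRanker` class, its method names, the "good/bad feedback" signal, and the prior values are all hypothetical, and a real system would need a far richer notion of reputation. The point is only that an item can be scored the instant it arrives, using what is already known about its source.

```python
from dataclasses import dataclass, field


@dataclass
class ReputationRanker:
    """Hypothetical sketch: score new items by their source's track record."""

    prior: float = 0.5         # assumed score for a source we know nothing about
    prior_weight: float = 2.0  # how many "virtual" observations the prior counts for
    good: dict = field(default_factory=dict)   # source -> count of well-received items
    total: dict = field(default_factory=dict)  # source -> count of all rated items

    def record_feedback(self, source, was_good):
        """Update a source's history after one of its items has been judged."""
        self.total[source] = self.total.get(source, 0) + 1
        if was_good:
            self.good[source] = self.good.get(source, 0) + 1

    def score(self, source):
        """Smoothed fraction of good items; unknown sources get the prior."""
        g = self.good.get(source, 0)
        n = self.total.get(source, 0)
        return (g + self.prior * self.prior_weight) / (n + self.prior_weight)

    def rank(self, items):
        """Order (source, content) pairs by source reputation, best first."""
        return sorted(items, key=lambda item: self.score(item[0]), reverse=True)


ranker = ReputationRanker()
for _ in range(3):
    ranker.record_feedback("alice", True)
    ranker.record_feedback("spammer", False)

incoming = [("spammer", "buy now"), ("newcomer", "hello"), ("alice", "breaking news")]
ranked = ranker.rank(incoming)
```

Note that no waiting is involved: `rank` never looks at how other people reacted to the new items themselves, only at the history of whoever emitted them.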
3. There are two distinct reasons why low latency is valuable to users. Some types of events/information are inherently time-sensitive. The street-food-location-via-Twitter phenomenon is a good example, as are various other time-sensitive offers.
For other types of information, the value of timeliness is relative. For example, I follow Susan Polgar's chess blog, where she posts puzzles. To have a prayer of being the first to post a solution you have to be among the first to be notified of new posts (especially since the puzzles are kinda lame ;-).
4. A more serious and better-known example of the value of relative timeliness is stock prices. The competitive advantage of low latency in high-frequency trading is so high that incredibly complex and powerful infrastructures have been set up in which the speed of light is a bottleneck. The important thing to realize is that this is driven not by the speed of events in the external world but purely by competition within the trading network. A similar phenomenon is happening for information on the web, although less dramatically.
5. From the system design point of view, the biggest benefit of real-time isn't so much the low latency it delivers as the fact that it uses a publish-subscribe model as opposed to polling periodically for updates. For a large-scale system, the efficiency gains are incredible.
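A rough way to see the efficiency gap between publish-subscribe and polling is to count messages. The sketch below is a toy, in-process illustration in Python, not a real protocol: the `Hub` class, `polling_requests`, `simulate`, and all the numbers (1000 subscribers, 3 updates a day, one poll every 5 minutes) are assumptions picked for the example.

```python
from collections import defaultdict


class Hub:
    """Hypothetical in-process pub-sub hub that counts deliveries."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks
        self.deliveries = 0

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Work is done only when there is something new to say.
        for callback in self.subscribers[topic]:
            callback(message)
            self.deliveries += 1


def polling_requests(num_subscribers, polls_per_subscriber):
    """Requests made under polling, whether or not anything changed."""
    return num_subscribers * polls_per_subscriber


def simulate(num_subscribers=1000, updates=3, polls_per_day=288):
    """Return (push deliveries, polling requests) for one day of one feed."""
    hub = Hub()
    inbox = []
    for _ in range(num_subscribers):
        hub.subscribe("feed", inbox.append)
    for i in range(updates):
        hub.publish("feed", f"update {i}")
    return hub.deliveries, polling_requests(num_subscribers, polls_per_day)


pushed, polled = simulate()
print(pushed, polled)
```

Under these made-up numbers, pushing costs 3,000 deliveries while polling costs 288,000 requests, almost all of which return nothing new. The push cost scales with the number of actual updates; the polling cost scales with how impatient the subscribers are.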
6. The above two factors -- the competitive value of low latency and the efficiency gains of publish-subscribe -- are together leading to the real-time-ification of update propagation on the web at an accelerating rate. Within a few years we will be able to build an "uberhose" -- a real-time stream that aggregates essentially all human activity on the web. Combined with effective routing and filtering tools, the applications will be limited only by imagination.
7. My regular readers and/or those who know me IRL may have noticed that I am as usual childishly optimistic about the future of technology. I happened to write something yesterday about the problem of hoaxes exacerbated by real-time meme propagation, so I thought I'd throw in a link to that "for balance."