XMPP comments

Rob Kaye has a good write-up of Kellan Elliott-McCrae and Evan Henshaw-Plath (Rabble)'s interesting XMPP talk at OSCON '08. There were a couple of things worth commenting on:

Rabble presented using XMPP for FireEagle, Yahoo!'s new personal geolocation service that allows users to provide their current location to other users. For a few users and a few updates you can paginate the data stream into RSS/atom feeds. But once you have more than a few users and frequent updates a paginated stream cannot keep up. What if a user publishes more updates than can an RSS feed can capture? Updates get lost -- and for applications using FireEagle missing an update presents a critical flaw. Using a system like XMPP, FireEagle can rely on Jabber to deliver all the updates -- exactly what Jabber was meant to do.

Pagination of feeds is worth a mention here: Google Reader, for example, uses this for delivering a continuous list of items to the browser, and so do OAI and PubMed's EUtils for sequentially retrieving chunks of items from a long list of results. All it takes is a key that fixes the initial query, and 'start' and 'end' parameters that determine which items to retrieve. This solves the "What if a user publishes more updates than can an RSS feed can capture?" problem nicely.
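A minimal sketch of that pattern, assuming a hypothetical `fetch_page` call that stands in for whatever HTTP request a real service would expose. The in-memory `ITEMS` list, the function names and the `query_key` value are illustrative only, not any of these services' actual APIs:

```python
from typing import Iterator, List

# Hypothetical stand-in for the server-side result list fixed by a query key.
ITEMS = [f"update-{i}" for i in range(1, 26)]


def fetch_page(query_key: str, start: int, end: int) -> List[str]:
    """Return items [start, end) of the result list identified by query_key."""
    # A real service would look up the stored query by its key;
    # this sketch has only one list, so the key is not used.
    return ITEMS[start:end]


def fetch_all(query_key: str, page_size: int = 10) -> Iterator[str]:
    """Walk the whole list one chunk at a time until a page comes back empty."""
    start = 0
    while True:
        page = fetch_page(query_key, start, start + page_size)
        if not page:
            break
        yield from page
        start += len(page)


if __name__ == "__main__":
    updates = list(fetch_all("example-key"))
    print(len(updates), updates[0], updates[-1])
```

Because the key pins down the result list, a client that falls behind just keeps asking for the next chunk rather than missing whatever scrolled off a fixed-size feed.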

As for "FireEagle can rely on Jabber to deliver all the updates": as I understand it, XMPP (unlike HTTP, where each request gets a response) has no mechanism for confirming delivery of a message. There might be an XEP for this that I haven't seen, though.

Kellan also applied XMPP/PubSub to Flickr and described how a Flickr update "Firehose" might work. If Flickr sends a ~2k Atom-enriched packet for each new public picture posted, at a rate of 60 updates a second, it would take roughly a megabit per second of traffic. Even a normal DSL line can handle one megabit per second, so the network effects are manageable on this level, compared to the polling system that FriendFeed uses. (Kellan also points out that FriendFeed is not doing anything wrong at all -- the current web service centric model is simply insufficient for this type of service.)

That's fine if Flickr's only sending a firehose stream of XMPP packets to one listener (FriendFeed), but what happens when there are 1000 listeners, or more? Flickr's XMPP server has to send a copy of the packets to each of the listeners' XMPP servers, each of which has to be able to handle the incoming data. It's not unmanageable, but it does present some problems.
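A back-of-the-envelope check of those numbers, assuming "~2k" means about 2000 bytes per packet and ignoring XMPP stream and TCP overhead. The function name is just for illustration:

```python
def stream_mbit_per_s(packet_bytes: int, packets_per_s: int, listeners: int = 1) -> float:
    """Outbound traffic in megabits per second when every listener gets its own copy."""
    bits_per_s = packet_bytes * 8 * packets_per_s * listeners
    return bits_per_s / 1_000_000


if __name__ == "__main__":
    # One listener: the "roughly a megabit" figure from the talk.
    print(stream_mbit_per_s(2000, 60))
    # 1000 listeners: the sender's outbound traffic grows linearly with the audience.
    print(stream_mbit_per_s(2000, 60, listeners=1000))
```

Under these assumptions a single stream comes to a little under 1 Mbit/s, while 1000 listeners push the sending side towards 1 Gbit/s. Each listener still only has to absorb its own ~1 Mbit/s, so most of the extra load lands on Flickr's side.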

The open and mature infrastructure that Rabble and Kellan found to use for this service is Jabber. Jabber has 10 years of experience of passing messages around the internet and has been embraced by many companies including Google.

Except Google might well be moving away from XMPP to a binary protocol for Google Talk on mobile devices, in Android, to reduce the communication overhead. While XMPP support will probably get back into Android, maybe a binary protocol (not this one) is better suited to high-volume notifications?