Metadata in feeds (again)


Two little shifts in the blogoplate today (already noted by Phil) suddenly make the prospect of embedding metadata in Atom feeds much more promising again.

Google's Blogsearch reads the metadata in feeds to identify information such as the author and publishing date, while SixApart's AtomStream makes formatted feeds available to any aggregators that want to subscribe. Together these mark a shift away from Technorati's approach of scraping HTML for metadata, and away from the need to embed microformats in the XHTML of a post (though I think there's probably still a place for those). This reliance on feed data also gives bonus points to those who include the full text of their posts in their feeds, as they're more likely to be included in search engine results.
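To make the idea concrete, here is a rough sketch of how any feed consumer might pull the author and publishing date out of an Atom entry. It says nothing about how Blogsearch or AtomStream actually work internally; the sample feed, the `read_entries` function and the `has_full_text` flag are all made up for illustration, and the element names assume Atom 1.0.

```python
import xml.etree.ElementTree as ET

# Atom 1.0 namespace; the feed below is an invented sample, not a real one.
NS = {"atom": "http://www.w3.org/2005/Atom"}

SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example weblog</title>
  <entry>
    <title>Metadata in feeds (again)</title>
    <id>tag:example.org,2005:post-1</id>
    <author><name>Example Author</name></author>
    <published>2005-09-14T12:00:00Z</published>
    <updated>2005-09-14T12:00:00Z</updated>
    <content type="text">Full text of the post.</content>
  </entry>
</feed>"""


def read_entries(feed_xml):
    """Return one dict per entry with the metadata a search engine might use."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", NS):
        content = entry.findtext("atom:content", default="", namespaces=NS)
        entries.append({
            "title": entry.findtext("atom:title", default="", namespaces=NS),
            "author": entry.findtext("atom:author/atom:name", default="", namespaces=NS),
            "published": entry.findtext("atom:published", default="", namespaces=NS),
            # Hypothetical flag: True when the entry carries body text at all.
            "has_full_text": bool(content.strip()),
        })
    return entries


if __name__ == "__main__":
    for item in read_entries(SAMPLE_FEED):
        print(item)
```

Nothing here needs the HTML page: everything comes from the feed itself, which is the whole point of the shift.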

I'm thinking about review data again - about the inadequacies of my previous attempts and the possibilities of reconciling multiple formats aimed at the same purpose.

Things that are needed:

  • A way to denote the object of a post (the thing being written about). This could be either a URI (an http: link or an identifier with a urn: or info: prefix, for example) or a free text or XHTML description. Microformat markup would be well suited to the latter, and in that case might as well stay in the body of the post (?). This approach avoids blossoming trees of hierarchical XML nodes for every possible aspect of the item under review (which works well in RDF, but that's a different story).
  • A rating score. That's easy (a floating point percentage, in my opinion).
  • A way to create and store this data. This is the difficult bit. I don't know how to add extra fields to Movable Type and other weblogging systems (it probably requires a plugin for each system), and I still can't see how weblogging tools like Ecto will be able to add extra input fields for arbitrary metadata.
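As a thought experiment, the first two items on that list could look something like the sketch below: an Atom entry carrying a subject URI and a percentage rating in an extension namespace. The namespace URI, the `subject` and `rating` element names and the `build_review_entry` function are all hypothetical; no such format is defined anywhere that I know of. The entry is also only a fragment, leaving out the `id` and `updated` elements that a real Atom entry needs.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Hypothetical extension namespace, invented for this sketch.
REVIEW_NS = "http://example.org/ns/review#"

ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("review", REVIEW_NS)


def build_review_entry(title, subject_uri, rating_percent):
    """Build an Atom entry fragment with a review subject and a rating."""
    if not 0.0 <= rating_percent <= 100.0:
        raise ValueError("rating must be a percentage between 0 and 100")

    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title

    # The object of the post, identified by URI (http:, urn:, info:, ...).
    subject = ET.SubElement(entry, f"{{{REVIEW_NS}}}subject")
    subject.set("href", subject_uri)

    # The rating as a floating point percentage.
    rating = ET.SubElement(entry, f"{{{REVIEW_NS}}}rating")
    rating.text = f"{rating_percent:.1f}"

    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    # The ISBN URN is just a placeholder identifier.
    print(build_review_entry("A book I read", "urn:isbn:0451450523", 87.5))
```

Two flat elements are enough to say what is being reviewed and how it scored, with no nested tree of descriptive nodes. The third item on the list, getting weblogging tools to produce this in the first place, is not something a code sketch can solve.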