Zotero and compound documents


Zotero, a Firefox extension that automatically extracts and organises bibliographic metadata for items from web pages (Amazon, PubMed, CiteSeer, New York Times, lots of library catalogues...), had a public beta release a few days ago. It now supports HubMed, presumably by recognising the embedded COinS.

When you save an item, Zotero adds a snapshot of the page that can be viewed later on, which ties in with something Jon Udell mentioned earlier this week: compound documents. A compound document would probably be a zip archive containing the main item and all the associated items needed to display it (in the case of a web page, that would be images, CSS, etc.), along with a meta-document describing all the pieces. What I'd also want to include in this kind of snapshot is anything marked with rel="enclosure": that way, if you saved a paper from a journal website, all the supplementary information, such as movies and results data, would be included as well.
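As a rough sketch of what such a bundle might look like, here is some Python that zips up a main page, its supporting files and an enclosure, along with a JSON meta-document listing the pieces. The manifest layout, the file names and the function names are all my own invention for illustration; this is not how Zotero, the Mozilla Archive Format or OpenOffice actually store things.

```python
import io
import json
import zipfile


def build_compound_document(main_name, parts):
    """Bundle parts into a zip archive with a manifest describing them.

    parts maps a file name to (media_type, data_bytes, rel).
    The manifest format here is hypothetical, not any existing standard.
    """
    manifest = {
        "main": main_name,
        "parts": [
            {"name": name, "type": media_type, "rel": rel}
            for name, (media_type, _data, rel) in parts.items()
        ],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The meta-document goes in first so a reader can find it easily.
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
        for name, (_media_type, data, _rel) in parts.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def read_manifest(archive_bytes):
    """Return the parsed manifest from a bundle built above."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return json.loads(archive.read("manifest.json"))


# Made-up example content: a page, its stylesheet and one enclosure.
example_parts = {
    "index.html": ("text/html", b"<html><body>A paper</body></html>", "main"),
    "style.css": ("text/css", b"body { margin: 0 }", "stylesheet"),
    "supplementary/movie.mov": ("video/quicktime", b"\x00\x01\x02", "enclosure"),
}
bundle = build_compound_document("index.html", example_parts)
```

Anything that opens the bundle can read manifest.json first to find the main item and then decide what to do with the rest, such as offering the enclosures as downloads.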

The Mozilla Archive Format seems like a good place to start, but lots of other applications, such as OpenOffice, store their data in this kind of bundle as well.

On the other hand, a compound document could be a single file, as envisaged by the XIPF project, where all the pieces are embedded in one HTML page as data: URIs.
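To give an idea of the single-file approach, this Python sketch rewrites src and href attributes in a page so that they point at base64-encoded data: URIs instead of external files. It is only an illustration under my own assumptions: the function names are made up, the regular expression is a crude stand-in for a proper HTML parser, and I have no idea whether XIPF does it anything like this.

```python
import base64
import re


def to_data_uri(media_type, data):
    """Encode raw bytes as a base64 data: URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


def inline_resources(html, resources):
    """Replace known src/href URLs in html with data: URIs.

    resources maps a URL, exactly as written in the page, to
    (media_type, data_bytes). URLs not in the mapping are left alone.
    """
    def replace(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        if url not in resources:
            return match.group(0)
        media_type, data = resources[url]
        return f"{attr}={quote}{to_data_uri(media_type, data)}{quote}"

    # Matches src="..." or href='...' with matching quotes.
    return re.sub(r'\b(src|href)=(["\'])(.*?)\2', replace, html)


# Made-up example: one stylesheet is inlined, the ordinary link is not.
page = '<link rel="stylesheet" href="style.css"><a href="http://example.org/">x</a>'
single_file = inline_resources(page, {"style.css": ("text/css", b"body{}")})
```

The obvious trade-off is size: base64 adds roughly a third to every embedded file, so large enclosures like movies would probably still be better off in a zip.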