RDF interoperability for social bookmarking tools

Quite a few people have been talking about interoperability between social bookmarking tools, generally based around RDF and particularly RSS 1.0. The RDF export from HubMed and its Tag Storage seems to be working well now, so I thought I'd write out the reasoning behind it and see if interested parties can agree on a best-practice way to present this kind of data.

Here's an annotated example RSS 1.0 feed, containing a journal article that's been bookmarked and tagged by one user. Note that the bibtex:Article section doesn't necessarily have to be included in the feed - in fact, HubMed provides this data separately.

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rss="http://purl.org/rss/1.0/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:dcterms="http://purl.org/dc/terms/"
  xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
  xmlns:tags="http://www.holygoat.co.uk/owl/redwood/0.1/tags/"
  xmlns:bibtex="http://purl.oclc.org/NET/nknouf/ns/bibtex#"> <!-- namespace declarations (note: these are the URIs commonly bound to these prefixes; the bibtex one is the URI used in the rdf:type below) -->
  <rss:channel rdf:about="http://www.hubmed.org/tags/users/alf/tags/text"> <!-- an RSS feed of items tagged with 'text' by a particular user -->
    <dc:date>2005-12-01T16:48Z</dc:date> <!-- date the feed was generated -->
    <rss:link>http://www.hubmed.org/tags/users/alf/tags/text</rss:link> <!-- HTML version of the items in the feed -->
    <rss:title>HubMed Tag Storage: items tagged with 'text' by alf</rss:title> <!-- title of the feed -->
    <rss:description>RDF/XML version of HubMed Tag Storage: items tagged with 'text' by alf</rss:description> <!-- description of the feed -->
    <rss:items>
      <rdf:Seq> <!-- ordered list of items in the feed -->
        <rdf:li rdf:resource="info:pmid/15473905" /> <!-- identifier of an item in the feed (note: this could just as easily be an http: URL) -->
      </rdf:Seq>
    </rss:items>
  </rss:channel>
  <rss:item rdf:about="info:pmid/15473905"> <!-- identifier of the subject of this feed item (a journal article). Matches the identifier used in the list of items above. -->
    <rdf:type rdf:resource="http://purl.oclc.org/NET/nknouf/ns/bibtex#Article"/> <!-- what kind of thing the subject of this item is (a journal article) -->
    <dc:title>Content-rich biological network constructed by mining PubMed abstracts.</dc:title> <!-- title of the journal article -->
    <rss:title>Content-rich biological network constructed by mining PubMed abstracts.</rss:title> <!-- title of the journal article and of the RSS item (for aggregators which don't display the dc:title) -->
    <dc:date>2004-10-08</dc:date> <!-- date the journal article was published (note: misused in RSS 1.0 to be the date the RSS item was generated) -->
    <rss:link>http://www.hubmed.org/display.cgi?uids=15473905</rss:link> <!-- a related link for more information -->
    <rss:description>stored by alf</rss:description> <!-- description of the journal article -->
    <dc:subject>text</dc:subject> <!-- topic of the journal article -->
    <dc:subject>text_mining</dc:subject> <!-- topic of the journal article -->
  </rss:item>
  <tags:Tagging rdf:about="http://www.hubmed.org/tags/users/alf/item/15473905"> <!-- identifier for the 'tagging' (note: the HTML version of this URL displays the information about this particular 'tagging') -->
    <tags:taggedBy rdf:resource="http://www.hubmed.org/tags/users#alf"/> <!-- identifier for the 'tagger' (note: http://www.hubmed.org/tags/users is a list of all users) -->
    <tags:taggedOn>2005-12-01</tags:taggedOn> <!-- date of the 'tagging' -->
    <tags:taggedResource rdf:resource="info:pmid/15473905"/> <!-- identifier for the 'tagged' resource -->
    <tags:taggedWithTag> <!-- an unordered list of tags used (note: perhaps this should be a parseType="Collection") -->
      <tags:Tag rdf:about="http://www.hubmed.org/tags/users/alf/tags#text"> <!-- identifier for a tag (important note: this tag is specific to the user, i.e. the same tags from different users are not equivalent. http://www.hubmed.org/tags/users/alf/tags is a list of all tags used by this user.) -->
        <tags:tagName>text</tags:tagName> <!-- name of the tag -->
      </tags:Tag>
    </tags:taggedWithTag>
    <tags:taggedWithTag> <!-- another tag -->
      <tags:Tag rdf:about="http://www.hubmed.org/tags/users/alf/tags#text_mining">
      </tags:Tag>
    </tags:taggedWithTag>
  </tags:Tagging>
  <bibtex:Article rdf:about="info:pmid/15473905"> <!-- identifier of the journal article (note: all this section could be inside the RSS item above, it's just that HubMed leaves this extra metadata out of the RSS feed and can supply it separately) -->
    <dc:title>Content-rich biological network constructed by mining PubMed abstracts.</dc:title> <!-- title of the article -->
    <dcterms:abstract>BACKGROUND: The integration of the rapidly expanding corpus of information about the genome, transcriptome, and proteome, engendered by powerful technological advances, such as microarrays, and the availability of genomic sequence from multiple species, challenges the grasp and comprehension of the scientific community ... Chilibot distills scientific relationships from knowledge available throughout a wide range of biological domains and presents these in a content-rich graphical format, thus integrating general biomedical knowledge with the specialized knowledge and interests of the user.</dcterms:abstract> <!-- abstract of the article -->
    <dc:creator> <!-- creator of the article -->
      <rdf:Seq> <!-- an ordered list of the creators of the article (note: some people say the content of dc:creator should be presented as a literal string, but this is apparently the way XMP does it and it seems to make sense when the order of the authors is important) -->
        <rdf:li>H Chen</rdf:li> <!-- first author of the article -->
        <rdf:li>BM Sharp</rdf:li>
      </rdf:Seq>
    </dc:creator>
    <foaf:maker> <!-- author of the article -->
      <foaf:Person>
        <foaf:name>H Chen</foaf:name> <!-- author's name as a string, in any order -->
        <rdf:value>H Chen</rdf:value> <!-- author's name as above, as a node value for RDF tools that don't use foaf:name -->
        <foaf:givenname></foaf:givenname> <!-- author's given name (Western = first name) (note: in this case it's empty as only the initial is available) -->
        <foaf:surname>Chen</foaf:surname> <!-- author's surname (aka family name) (note: I've used lowercase, no underscores for all the FOAF fields, but this is unclear and inconsistent in the FOAF specifications and in usage) -->
      </foaf:Person>
    </foaf:maker>
    <foaf:maker> <!-- author of the article -->
      <foaf:Person>
        <foaf:name>BM Sharp</foaf:name>
        <rdf:value>BM Sharp</rdf:value>
      </foaf:Person>
    </foaf:maker>
    <dc:identifier>doi:10.1186/1471-2105-5-147</dc:identifier> <!-- an identifier for the article -->
    <prism:publicationName>BMC Bioinformatics</prism:publicationName> <!-- name of the journal in which the article was published (note: this could be a weblog, for example) -->
    <prism:publicationDate>2004-10-08</prism:publicationDate> <!-- date the article was published -->
    <prism:volume>5</prism:volume> <!-- volume of the journal -->
    <prism:number></prism:number> <!-- issue number of the journal -->
    <prism:startingPage>147</prism:startingPage> <!-- first page number -->
    <prism:endingPage>147</prism:endingPage> <!-- end page number -->
    <prism:isPartOf rdf:resource="urn:issn:1471-2105"/> <!-- identifier for the journal in which the article was published -->
  </bibtex:Article>
</rdf:RDF>
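
To show how another tool might consume this kind of data, here is a minimal Python sketch that reads the 'tagging' and the ordered author list from a trimmed copy of the example. It is only an illustration, not how HubMed does it: the function names are made up, the rdf:RDF wrapper and the namespace URIs for the rdf, dc and tags prefixes are assumptions (only the bibtex URI appears in the example itself), and it walks the XML as a plain tree rather than parsing it as RDF, so it only works for this particular serialisation.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions, apart from the bibtex one used in the feed.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "tags": "http://www.holygoat.co.uk/owl/redwood/0.1/tags/",
    "bibtex": "http://purl.oclc.org/NET/nknouf/ns/bibtex#",
}
RDF = "{%s}" % NS["rdf"]  # prefix for namespaced attributes such as rdf:about

# Trimmed copy of the example feed, wrapped in an assumed rdf:RDF root.
FEED = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:tags="http://www.holygoat.co.uk/owl/redwood/0.1/tags/"
  xmlns:bibtex="http://purl.oclc.org/NET/nknouf/ns/bibtex#">
  <tags:Tagging rdf:about="http://www.hubmed.org/tags/users/alf/item/15473905">
    <tags:taggedBy rdf:resource="http://www.hubmed.org/tags/users#alf"/>
    <tags:taggedOn>2005-12-01</tags:taggedOn>
    <tags:taggedResource rdf:resource="info:pmid/15473905"/>
    <tags:taggedWithTag>
      <tags:Tag rdf:about="http://www.hubmed.org/tags/users/alf/tags#text">
        <tags:tagName>text</tags:tagName>
      </tags:Tag>
    </tags:taggedWithTag>
    <tags:taggedWithTag>
      <tags:Tag rdf:about="http://www.hubmed.org/tags/users/alf/tags#text_mining"/>
    </tags:taggedWithTag>
  </tags:Tagging>
  <bibtex:Article rdf:about="info:pmid/15473905">
    <dc:creator>
      <rdf:Seq>
        <rdf:li>H Chen</rdf:li>
        <rdf:li>BM Sharp</rdf:li>
      </rdf:Seq>
    </dc:creator>
  </bibtex:Article>
</rdf:RDF>"""


def taggings(root):
    """Yield one dict per tags:Tagging element found directly under the root."""
    for t in root.findall("tags:Tagging", NS):
        yield {
            "tagger": t.find("tags:taggedBy", NS).get(RDF + "resource"),
            "resource": t.find("tags:taggedResource", NS).get(RDF + "resource"),
            "date": t.findtext("tags:taggedOn", namespaces=NS),
            # Tags are identified by their user-specific URI, not by name.
            "tags": [
                tag.get(RDF + "about")
                for tag in t.findall("tags:taggedWithTag/tags:Tag", NS)
            ],
        }


def authors(root, about):
    """Return the dc:creator names, in document order, for one article."""
    for article in root.findall("bibtex:Article", NS):
        if article.get(RDF + "about") == about:
            return [li.text for li in article.findall("dc:creator/rdf:Seq/rdf:li", NS)]
    return []


root = ET.fromstring(FEED)
for t in taggings(root):
    print(t["tagger"], "tagged", t["resource"], "on", t["date"])
    print("  tags:", ", ".join(t["tags"]))
    print("  authors:", ", ".join(authors(root, t["resource"])))
```

Because the tag URIs are specific to the user, a consumer that wants to merge tags across users would have to compare the tags:tagName values (or the URI fragments) rather than the identifiers themselves.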