James Howison has made a workflow diagram comparing metadata management for academic papers with the systems available for music files. I've made a different version of the academic part of this, because there are a lot of gaps that can already be filled in (and I don't need much incentive to play with OmniGraffle) - click for the PDF:
The main point is that you don't need metadata attached to the PDF when you have a fulltext index, a simple filing system and a bibliographic manager (and if you did, it's not too hard to write a script that will fetch the metadata from PubMed and put it into the PDF Info Dictionary).
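For what it's worth, here is a minimal sketch of what such a script might look like. It is not the script I use: it assumes the NCBI E-utilities ESummary endpoint with JSON output, it assumes the third-party pypdf library is installed for the writing step, and the function names and the choice of Info Dictionary keys (journal name in /Subject, PubMed ID in /Keywords) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: NCBI E-utilities ESummary, asked for JSON output.
ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def esummary_url(pmid):
    """Build the ESummary request URL for one PubMed ID."""
    query = urllib.parse.urlencode({"db": "pubmed", "id": pmid, "retmode": "json"})
    return f"{ESUMMARY_URL}?{query}"


def fetch_summary(pmid):
    """Download the ESummary record for one PubMed ID (needs network access)."""
    with urllib.request.urlopen(esummary_url(pmid)) as response:
        payload = json.load(response)
    return payload["result"][str(pmid)]


def info_dictionary(record):
    """Map an ESummary-style record onto PDF Info Dictionary keys.

    The field names (title, authors, fulljournalname, uid) are assumed
    to match the ESummary JSON output; the key mapping is illustrative.
    """
    info = {
        "/Title": record.get("title", ""),
        "/Author": ", ".join(a["name"] for a in record.get("authors", [])),
        "/Subject": record.get("fulljournalname", ""),
    }
    if record.get("uid"):
        info["/Keywords"] = "PMID:" + str(record["uid"])
    # Leave out keys for which the record had no value.
    return {key: value for key, value in info.items() if value}


def write_info(src_path, dst_path, info):
    """Copy a PDF page by page and set its Info Dictionary (assumes pypdf)."""
    from pypdf import PdfReader, PdfWriter  # third-party, imported lazily

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)
    writer.add_metadata(info)
    with open(dst_path, "wb") as out:
        writer.write(out)
```

You would call it along the lines of write_info("paper.pdf", "paper-tagged.pdf", info_dictionary(fetch_summary(pmid))), where the file names are placeholders and pmid is the PubMed ID of the paper.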
Comments


Metadata is always good to have when you share files with other people. I don't believe that everyone can follow your workflow. However, I found that this workflow works for me. :-D
I like it ;) The key thing for me is that the current situation doesn't let others leverage the efforts you make ... I'm hoping for a metadata lookup service that can take the file and return the metadata ... ideally adding it to the file in a standard manner. Like CDDB for academic papers ... (without the ripping off the contributors bit!)
And I found that it really isn't as easy as one would think to add metadata to PDF files. I blogged my (admittedly unskilled) efforts here: http://shangorilla.syr.edu/themp/
The good long-term approach, XMP, currently seems impossible without the for-pay Adobe libraries, while the good hack of putting BibTeX into the Document Info Dictionary can be done in Perl (although it is slow because it loads the whole PDF) but not, AFAIK, in C for free ... but I'd love to be proven wrong!
James, I was going to say that I used Win32::OLE to add to the Info Dictionary and that you can manage that metadata using PDF Explorer (on Windows), but I found that I had already made that comment on your post in October :)
http://shangorilla.syr.edu/archives/themp/000206.html
PDF::API2 does seem like a better way to do it, anyway. However, I'm still not convinced that a centralised metadata database, keyed perhaps on the DOI number of a particular document, is necessary. Wouldn't it be easier to contact the publishers and ask them to add the metadata in bulk to all their PDFs?
Perhaps they could even use XMP, but then you'd have to find a program that could read and organise the data.