I've long needed a way to deal with directories full of metadata-free PDFs, and I've finally got one working. The steps are:
- Download the source for xpdf and apply the patch to remove the check for permissions. Compile and install.
- Email me and ask for the Perl and AppleScript scripts (they still need testing, and so far only work with papers that are found in PubMed).
- Install a few Perl modules through CPAN.
- Run the Perl script, which will, for each PDF in a specified directory: convert the PDF to text, try to recognise the title and authors, search PubMed for the title, ask you to confirm the match, fetch the BibTeX from HubMed, write the BibTeX to the Keywords field of the PDF's Info Dictionary (so the file now carries its own metadata), and run an AppleScript to add the item to BibDesk and attach the PDF to that entry.
- When all the PDFs have been analysed, you can use BibDesk's "Consolidate Linked Files" command to move the PDFs into a specified directory and rename them according to a template.
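As a rough illustration of the first half of that per-PDF loop (text extraction, title guessing and the PubMed lookup), here is a minimal Python sketch. It is not the Perl script itself: the function names and the title heuristic are assumptions made up for the example, and it only builds the search URL and parses a response rather than fetching anything. It assumes the `pdftotext` tool from xpdf is on your path and uses NCBI's public ESearch endpoint.

```python
import subprocess
import urllib.parse
import xml.etree.ElementTree as ET

# NCBI E-utilities search endpoint for PubMed.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pdf_to_text(pdf_path):
    """Return the text of the first two pages, using xpdf's pdftotext."""
    # "-l 2" stops after page 2; "-" as the output file means stdout.
    result = subprocess.run(
        ["pdftotext", "-l", "2", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def guess_title(text, min_words=4):
    """Naive heuristic (an assumption): first line with enough words."""
    for line in text.splitlines():
        candidate = " ".join(line.split())  # collapse runs of whitespace
        if len(candidate.split()) >= min_words:
            return candidate
    return None


def pubmed_search_url(title, max_results=5):
    """Build an ESearch URL that looks for the title in the Title field."""
    params = {
        "db": "pubmed",
        "term": f"{title}[Title]",
        "retmax": str(max_results),
    }
    return ESEARCH + "?" + urllib.parse.urlencode(params)


def parse_pmids(esearch_xml):
    """Pull the PubMed IDs out of an ESearch XML response."""
    root = ET.fromstring(esearch_xml)
    return [el.text for el in root.findall("./IdList/Id")]
```

In practice you would fetch the URL, show the candidate matches to the user for confirmation, and only then go on to fetch the BibTeX and write it into the PDF; those later steps are left out here because they depend on the details of the actual scripts.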
Ideally this utility needs some kind of Cocoa GUI, but that's outside my reach at the moment.
Update: files are attached to this later post.