Server-side scraping with JavaScript

Update 2007-12-03: Runs in Zotero now as well. Moved source to Google Code repository.

Update 2007-11-14: Rewritten to remove the E4X dependency, so it is more likely to run in WebKit, and to bring the functions more in line with Zotero's.

The code, available through svn, contains:

What this does:

When you run

Rhino should load the test.js file, which pulls in the other .js files. It then fetches an item page from Amazon, converts it to XHTML using the Tidy proxy, loads the result, and calls two functions loosely based on Zotero translators. The first function detects the type of item ("Book" in this case). The second function detects the ASIN, looks up the metadata from Amazon's E-Commerce Service (ECS), parses the returned XML, and produces a metadata object that can then be passed to a bibliographic manager.
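As a rough illustration of that two-step shape (detect the type, then build a metadata object), here is a minimal sketch. It is not the code from the repository: the function names (detectType, extractAsin, buildItem), the URL patterns and the field names are all assumptions made for the example, and the ECS lookup and XML parsing are left out, so buildItem simply takes fields that have already been parsed.

```javascript
// Hypothetical sketch only: names, URL patterns and field names are
// assumptions for illustration, not the actual scraper code.

// Assumed pattern: a 10-character ASIN after /dp/, /gp/product/ or /ASIN/.
var ASIN_PATTERN = /\/(?:dp|gp\/product|ASIN)\/([A-Z0-9]{10})(?:[\/?#]|$)/;

// Assumed pattern for recognising an Amazon host.
var AMAZON_HOST = /^https?:\/\/(?:www\.)?amazon\.[a-z.]+\//i;

// Pull the ASIN out of an item URL, or return null if none is found.
function extractAsin(url) {
  var match = ASIN_PATTERN.exec(url);
  return match ? match[1] : null;
}

// Step 1: decide whether this page is something we can scrape, and what
// type it is. This sketch only knows about books and only looks at the URL.
function detectType(url) {
  if (AMAZON_HOST.test(url) && extractAsin(url) !== null) {
    return "book";
  }
  return null;
}

// Step 2: turn already-parsed fields (for example, values read from a
// lookup response) into a plain metadata object.
function buildItem(asin, fields) {
  var creators = [];
  var authors = fields.authors || [];
  for (var i = 0; i < authors.length; i++) {
    creators.push({ name: authors[i], creatorType: "author" });
  }
  return {
    itemType: "book",
    asin: asin,
    title: fields.title || "",
    creators: creators
  };
}
```

A caller would first check that detectType(url) returns a type, then pass extractAsin(url) and the parsed fields to buildItem. Keeping both steps as plain functions over strings and objects, with no browser-specific APIs, is what would let the same file load in Firefox, WebKit or Rhino.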

The point of this is to try to make JavaScript scrapers that will run in Firefox (for Zotero), WebKit (for BibDesk and Papers) and Rhino (server-side, for Connotea, CiteULike, Bibsonomy, etc.).