Scraping with YQL Execute

Another attempt at hosted, server-side scraping in Javascript, this time using YQL Execute, which is mostly based around E4X and XPath.

It takes the idea from one of the Execute demo tables to use a CSS2XPath library to convert CSS selectors into XPath (the library handles most selectors well, though not the very newest, like nth-of-type). This allows the selectors to be written using CSS or XPath, which is enough for a lot of cases (but might still have to expanded to allow regular expressions).

Here's an example definition file, and its output.

Comments

All fields are optional, email address will not be shown; no HTML, URLs are automatically hyperlinked.