Scraping with YQL Execute


Another attempt at hosted, server-side scraping in Javascript, this time using YQL Execute, which is mostly based around E4X and XPath.

It takes the idea from one of the Execute demo tables to use a CSS2XPath library to convert CSS selectors into XPath (the library handles most selectors well, though not the very newest, like nth-of-type). This allows the selectors to be written using CSS or XPath, which is enough for a lot of cases (but might still have to expanded to allow regular expressions).

Here's an example definition file, and its output.