Google has an official Feed API and several other methods that let you retrieve historical items from a feed.
The easiest way to access the items for one feed is to log in to your Google account in a browser, load, then save the resulting Atom file.
To access the items programmatically, the following options are available:
If you don't need all the original metadata for each item, you can fetch a JSON representation of each item, as used in Google Reader's UI:
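$feed = urlencode('');
$continuation = '';
do {
    $url = sprintf('', $feed, $continuation);
    $json = file_get_contents($url);
    $data = json_decode($json);
    // The last page has no continuation token, which ends the loop
    $continuation = isset($data->continuation) ? $data->continuation : '';
    print "Continuation: $continuation\n";
    if (isset($data->items)) {
        foreach ($data->items as $item) {
            print_r($item); // placeholder: process each item here
        }
    }
} while ($data && $json && $continuation);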
This approach has several advantages: it returns every item from the feed (more than 6500, at least), doesn't require you to be logged in, is easy to parse, and includes elements like enclosures. However, it doesn't include some elements, such as the original id for each entry.
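The continuation loop above can be separated from the HTTP request, which makes the paging logic easier to check on its own. The following is a minimal sketch, not the API's own client: `fetch_all_items`, its `$fetchPage` callback and the `$maxPages` safety cap are hypothetical names introduced here, and the canned JSON pages only imitate the `continuation` and `items` fields used in the code above.

```php
<?php
// Collect items across pages by following continuation tokens.
// $fetchPage is a hypothetical callback: it receives the current
// continuation token and returns the raw JSON string, or false on failure.
function fetch_all_items(callable $fetchPage, $maxPages = 100)
{
    $items = array();
    $continuation = '';

    for ($page = 0; $page < $maxPages; $page++) {
        $json = $fetchPage($continuation);
        if ($json === false) {
            break; // request failed
        }

        $data = json_decode($json);
        if (!is_object($data) || !isset($data->items)) {
            break; // not the structure we expect
        }

        foreach ($data->items as $item) {
            $items[] = $item;
        }

        // The last page carries no continuation token.
        if (!isset($data->continuation) || $data->continuation === '') {
            break;
        }
        $continuation = $data->continuation;
    }

    return $items;
}

// Canned pages standing in for real HTTP responses (made-up data).
$pages = array(
    ''    => '{"continuation":"abc","items":[{"title":"one"},{"title":"two"}]}',
    'abc' => '{"items":[{"title":"three"}]}',
);
$all = fetch_all_items(function ($token) use ($pages) {
    return isset($pages[$token]) ? $pages[$token] : false;
});
print count($all) . " items\n";
```

In real use the callback would wrap the `file_get_contents` call; the page cap is only a guard against a server that keeps returning the same token.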
The official method is to use the AJAX Feed API:
$feed = '';
$params = array(
    'q' => $feed,
    'v' => '1.0', // API version
    'num' => -1, // maximum entries (limited)
    'output' => 'json_xml', // mixed content: JSON for feed, XML for full entries (json|xml|json_xml)
    'scoring' => 'h', // include historical entries
);
$result = file_get_contents('' . http_build_query($params));
$json = json_decode($result);
$data = $json->responseData;
// json version
foreach ($data->feed->entries as $entry) {
    print_r($entry); // placeholder: process each entry here
}
// xml version
$xml = simplexml_load_string($data->xmlString);
foreach ($xml->channel->item as $item) { // only matches RSS2 - need namespace for Atom
    print_r($item); // placeholder: process each item here
}
This way you get the full, original XML version of the feed, but it's not normalised (which makes it harder to parse; the JavaScript API has a parsing function built in) and it only contains a limited number of entries (the limit seems to be 250, and json_decode has problems).
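Because the XML is not normalised, the RSS2-only loop above misses Atom feeds. One way to cover both is sketched below. `feed_entries` is a hypothetical helper, the sample documents are made up, and the sketch only handles RSS 2.0 and Atom 1.0 (whose namespace is `http://www.w3.org/2005/Atom`); other formats such as RSS 1.0 would need their own branch.

```php
<?php
// Return the entry elements of a feed, whether it is RSS 2.0 or Atom 1.0.
function feed_entries($xmlString)
{
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        return array(); // not well-formed XML
    }

    // RSS 2.0: <rss><channel><item>, core elements carry no namespace.
    if ($xml->getName() === 'rss') {
        $items = array();
        foreach ($xml->channel->item as $item) {
            $items[] = $item;
        }
        return $items;
    }

    // Atom: <feed><entry> live in the Atom namespace, so XPath needs a prefix.
    $xml->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
    $entries = $xml->xpath('/atom:feed/atom:entry');
    return $entries ? $entries : array();
}

// Made-up sample documents for illustration.
$rss = '<rss version="2.0"><channel><title>t</title>'
     . '<item><title>a</title></item><item><title>b</title></item>'
     . '</channel></rss>';
$atom = '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</title>'
      . '<entry><title>c</title></entry></feed>';

print count(feed_entries($rss)) . " RSS items\n";
print count(feed_entries($atom)) . " Atom entries\n";
```

Checking the root element name keeps the two cases apart without guessing from the content type.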
If you're logged in, you can fetch a normalised Atom representation of the feed:
$feed = urlencode('');
// Log in and read the SID from the first line of the response
$params = array('Email' => 'YOUR_GOOGLE_EMAIL', 'Passwd' => 'YOUR_GOOGLE_PASSWORD');
$context = stream_context_create(array('http' => array('method' => 'POST', 'content' => http_build_query($params))));
$result = file_get_contents('', false, $context);
// array_shift() and array_pop() take their argument by reference, so use variables;
// the limit of 2 keeps any '=' inside the SID value intact
$lines = explode("\n", $result);
$parts = explode('=', trim(array_shift($lines)), 2);
$sid = array_pop($parts);
$cookie = array(
    'SID=' . $sid,
);
$header = sprintf("Cookie: %s\r\n", implode('; ', $cookie));
$context = stream_context_create(array('http' => array('method' => 'GET', 'header' => $header)));
$continuation = '';
do {
    $url = sprintf('', $feed, $continuation);
    $data = file_get_contents($url, false, $context);
    $xml = simplexml_load_string($data);
    if (!$xml) {
        break; // request failed or response was not valid XML
    }
    $xml->registerXPathNamespace('atom', '');
    $xml->registerXPathNamespace('gr', '');
    $tokens = $xml->xpath('/atom:feed/gr:continuation');
    $continuation = $tokens ? (string) array_shift($tokens) : '';
    print "Continuation: $continuation\n";
    $items = $xml->xpath('/atom:feed/atom:entry');
    foreach ($items as $item) {
        print_r($item); // placeholder: process each entry here
    }
} while ($continuation);
This returns an apparently unlimited number of normalised Atom entries (> 6500, at least).
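The login step above relies on the response body being a series of `KEY=value` lines with `SID` on the first line. A slightly more defensive way to read it is to parse every line into an array and look up `SID` by name. This is a minimal sketch under that assumption: `parse_key_value_lines` is a hypothetical helper, and the sample body, including the `LSID` and `Auth` keys and all values, is made up for illustration.

```php
<?php
// Parse a body of "KEY=value" lines into an associative array.
function parse_key_value_lines($body)
{
    $pairs = array();
    foreach (preg_split('/\r\n|\r|\n/', trim($body)) as $line) {
        // A limit of 2 keeps any "=" inside the value intact.
        $parts = explode('=', $line, 2);
        if (count($parts) === 2) {
            $pairs[$parts[0]] = $parts[1];
        }
    }
    return $pairs;
}

// Sample body; keys other than SID and all values are assumptions.
$body = "SID=abc123==\nLSID=def456\nAuth=ghi789\n";
$pairs = parse_key_value_lines($body);

if (isset($pairs['SID'])) {
    $header = sprintf("Cookie: SID=%s\r\n", $pairs['SID']);
    print $header;
} else {
    print "Login failed: no SID in response\n";
}
```

Looking the key up by name also gives a clear failure case when the login is rejected, instead of sending an empty cookie.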