Neighbourhood Statistics API

The UK Neighbourhood Statistics sites provide data collected from, among other sources, the most recent Census (2001). The England and Wales website is a bit awkward to use, but through the 'create a custom table, chart or map' wizard you can get some basic data plotted.

For more complex analysis, though, you need access to the raw data. Happily (as previous email requests went straight into the black hole of bureaucracy), NeSS has opened up its API as part of the government's 'Show Us A Better Way' initiative.

Unfortunately, it's a SOAP API, and documented not as functions and their parameters, but as example chunks of raw XML [PDF]. Argh. Also there's an error in one of the WSDL files, so it doesn't actually work. And there's another error somewhere, because it doesn't parse the data responses properly. Isn't SOAP wonderful?

Anyway, I managed to get something working, so thought I'd write up how the API works, as it uses some bits of PHP's SOAP functions that aren't extensively documented either.

First of all, contact the ONS by email for an API key. Once that's arrived, define a SOAP client that will connect to the web service:

<?php
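// $wsdl must be set by the including script before this file is included
global $client;
// 'trace' keeps the last request and response so they can be inspected later
$client = new SoapClient($wsdl, array('trace' => 1));
// WSSE authentication header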
$auth = new SoapVar(sprintf(
  '<Security xmlns="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd">
    <UsernameToken><Username>%s</Username><Password>%s</Password></UsernameToken>
  </Security>',
  htmlspecialchars('YOUR_NESS_USER'), htmlspecialchars('YOUR_NESS_PASSWORD')
), XSD_ANYXML);
$client->__setSoapHeaders(array(new SoapHeader('http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd', 'Security', $auth)));
//var_dump($client->__getFunctions());
and a function that will handle errors in SOAP requests:
<?php
function soapy($method, $params){
  //print_r(array($method, $params));
  global $client;
  try {
    return call_user_func(array($client, $method), $params);
  } catch (SoapFault $e) {
    print_r($e);
    var_dump($client->__getLastRequest());
    var_dump($client->__getLastRequestHeaders());
    exit();
  }
}
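The dispatch line in soapy() works because SoapClient routes method names it doesn't define through PHP's __call() magic method, so call_user_func() can invoke any operation listed in the WSDL by name. Here's a minimal sketch of that mechanism that runs without a network connection. FakeClient is a hypothetical stand-in for illustration, not part of PHP or of the NeSS API:

```php
<?php
// Hypothetical stub standing in for SoapClient: like SoapClient, it receives
// calls to undefined methods through __call().
class FakeClient {
  public function __call($name, $args) {
    // a real SoapClient would send the SOAP request here
    return array('method' => $name, 'args' => $args);
  }
}

$client = new FakeClient();
// same dispatch as in soapy(): method name as a string, one parameter array
$result = call_user_func(array($client, 'GetAreaDetail'), array('AreaId' => 554954));
print_r($result);
```

With a real SoapClient, $client->__soapCall($method, array($params)) should be an equivalent way to write the same call.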
For the Discovery phase, set the WSDL file location, include the code above (saved here as ness.php), then call functions as needed:
<?php
$wsdl = 'https://www.neighbourhood.statistics.gov.uk/interop/NeSSDiscoveryBindingPort?WSDL';
include_once('ness.php');
$items = array(
  // fetch the list of area hierarchies
  array('GetHierarchies', NULL),
  // fetch the list of topics
  array('GetSubjectTree', NULL),
  // fetch the list of levels under a particular hierarchy
  array('GetLevelTypesByHierarchy', array('HierarchyId' => 11)), // 2004 Administrative Hierarchy
  // fetch the list of sections under a particular topic
  array('GetVariableFamilies', array('DSFamilyId' => 91)), // Age (UV04) topic
  );
  
foreach ($items as $item)
  print_r(soapy($item[0], $item[1]));
You can then make more specific queries:
<?php
// all the cities (level 12) matching the search term 'London'
print_r(soapy('SearchAreaByNameLevelType', array('AreaNamePartWithLevelType' => array('AreaNamePart' => 'London', 'LevelTypeId' => 12))));
// details for Greater London (area 554954)
print_r(soapy('GetAreaDetail', array('AreaId' => 554954)));
// all the 2004 administrative wards (level 14) within Greater London (area 554954)
$wards = soapy('GetAreaAtLevel', array('AreaIdWithLevelType' => array('AreaId' => 554954, 'LevelTypeId' => 14)));
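One thing to watch when looping over results like $wards: by default PHP's SoapClient decodes a repeating element that occurs only once as a single object rather than a one-element array, so a foreach over it would walk the object's properties instead of a list of areas. Here's a minimal sketch of a workaround. as_list() is a hypothetical helper of mine, and the sample objects are made-up stand-ins that only mimic the Areas->Area shape:

```php
<?php
// Hypothetical helper (not part of the NeSS API or of ness.php):
// always return a list, whether SoapClient decoded one element or many.
function as_list($value) {
  if ($value === null) {
    return array();
  }
  return is_array($value) ? $value : array($value);
}

// Made-up stand-ins for decoded responses with one and with several areas
$one = (object) array('Areas' => (object) array(
  'Area' => (object) array('AreaId' => 1, 'Name' => 'Example ward'),
));
$many = (object) array('Areas' => (object) array(
  'Area' => array(
    (object) array('AreaId' => 1, 'Name' => 'Example ward A'),
    (object) array('AreaId' => 2, 'Name' => 'Example ward B'),
  ),
));

foreach (array($one, $many) as $result) {
  foreach (as_list($result->Areas->Area) as $area) {
    print $area->AreaId . ' ' . $area->Name . "\n";
  }
}
```

Alternatively, passing 'features' => SOAP_SINGLE_ELEMENT_ARRAYS in the SoapClient options should make the client return arrays in both cases.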
Once you have the codes for all the desired areas and topics, you can move to the Delivery phase and fetch the datasets. For this you need to set a new WSDL file, and post a chunk of XML as the query:
<?php
$wsdl = 'https://www.neighbourhood.statistics.gov.uk/interop/NeSSDeliveryBindingPort?WSDL';
include_once('ness.php');
$DataCubeQueryMessage = '<DataCubeQueryMessage version="0.5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://neighbourhood.statistics.gov.uk/dissemination/resources/schemas/datacubequery" xsi:schemaLocation="https://www.neighbourhood.statistics.gov.uk/interop/NeSSDataCubeQueryMessage_v0_4.xsd">
  <Query repository="NeSS" type="custom">
    <!-- dataset -->
    <Dimension name="dataset">
      <!-- DSFamilyId -->
      <Item code="%d"/>
    </Dimension>
    <!-- variable (GetVariableFamilies) -->
    <Dimension name="variablefamily" isMeasuredDimension="true">
      <!-- all variables for this DSFamilyID -->
      <Group dimension="dataset" code="%d" type="all"/>
    </Dimension>
    <!-- area (GetAreaAtLevel) -->
    <Dimension name="area" isSpatialDimension="true">
      <Item>
        <HierarchyArea>
          <AreaId>%d</AreaId>
        </HierarchyArea>
      </Item>
    </Dimension>
    <!-- time (should be optional, but is not yet) -->
    <Dimension name="time" isTimeDimension="true">
      <Item>
        <Period>
          <Start>2001-04-29</Start>
          <End>2001-04-29</End>
        </Period>
      </Item>
    </Dimension>
    <SeriesGrouping>
      <SeriesGroup dimension="dataset"/>
    </SeriesGrouping>
  </Query>
</DataCubeQueryMessage>';
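$dataset = 91; // Age (UV04)
// $wards comes from the GetAreaAtLevel call in the Discovery script; note that
// include_once will not rebuild $client if ness.php was already included in the same run
foreach ($wards->Areas->Area as $area){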
  print $area->Name . "\n";
  $input = sprintf($DataCubeQueryMessage, $dataset, $dataset, $area->AreaId);
  $response = soapy('getDataCube', array('DataCubeQueryMessage' => new SoapVar($input, XSD_ANYXML)));
  
  // response isn't parsing properly (bug in WSDL?), so parse response XML instead
  $xml = simplexml_load_string($client->__getLastResponse());
  $xml->registerXPathNamespace('lgdx', 'http://www.esd.org.uk/LGDX');
  
  // topic IDs for this request don't seem to match up to global topic IDs, so have to read them here
  $topics = $xml->xpath('//lgdx:Topics/lgdx:Topic');
    
  $items = $xml->xpath('//lgdx:DatasetItems/lgdx:DataItem');
  print_r($items);
}
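To turn those XPath results into something usable, the per-response topic IDs can be mapped to their labels and then attached to each data item. Here's a minimal sketch run against a made-up sample document. Only the lgdx namespace and the Topics/Topic and DatasetItems/DataItem paths come from the code above. The child element names (TopicId, Title, Value) and the figures are assumptions for illustration, so check them against a real response:

```php
<?php
// Made-up sample response; child element names and values are assumptions.
$sample = <<<XML
<Envelope xmlns:lgdx="http://www.esd.org.uk/LGDX">
  <lgdx:Topics>
    <lgdx:Topic><lgdx:TopicId>1</lgdx:TopicId><lgdx:Title>All people</lgdx:Title></lgdx:Topic>
    <lgdx:Topic><lgdx:TopicId>2</lgdx:TopicId><lgdx:Title>Aged 0</lgdx:Title></lgdx:Topic>
  </lgdx:Topics>
  <lgdx:DatasetItems>
    <lgdx:DataItem><lgdx:TopicId>1</lgdx:TopicId><lgdx:Value>100</lgdx:Value></lgdx:DataItem>
    <lgdx:DataItem><lgdx:TopicId>2</lgdx:TopicId><lgdx:Value>7</lgdx:Value></lgdx:DataItem>
  </lgdx:DatasetItems>
</Envelope>
XML;

$ns = 'http://www.esd.org.uk/LGDX';
$xml = simplexml_load_string($sample);
$xml->registerXPathNamespace('lgdx', $ns);

// map the per-response topic IDs to their titles
$titles = array();
foreach ($xml->xpath('//lgdx:Topics/lgdx:Topic') as $topic) {
  $fields = $topic->children($ns); // child elements in the lgdx namespace
  $titles[(string) $fields->TopicId] = (string) $fields->Title;
}

// label each value with the title of its topic
$values = array();
foreach ($xml->xpath('//lgdx:DatasetItems/lgdx:DataItem') as $item) {
  $fields = $item->children($ns);
  $values[$titles[(string) $fields->TopicId]] = (int) $fields->Value;
}
print_r($values);
```

The children($ns) call matters because SimpleXML's property access only sees elements in the default namespace unless told otherwise.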
Here's a bundle of work-in-progress files that might be useful.