The Office for National Statistics produces a regularly-updated dataset - “Deaths registered weekly in England and Wales, provisional” - as an Excel file each week, plus a summary Excel file per year.
There is also an API which provides the same data as CSV.
To make this data accessible to web applications, I’m going to combine the historical data (assembled manually) in a git repository with the latest dataset from the ONS API, using a GitHub Action to keep it up to date (as described by Simon Willison).
- In GitHub, create a new repository.
- Add a
metadata.ymlfile describing the data source and license.
- Add a
- Create a "fetch" workflow with a
scheduletrigger that uses cURL to fetch the latest version of the dataset to a file. If the data has been updated when the scheduled workflow runs, it will be committed and pushed back to the repository. Note: use at least
ubuntu-20.04to get a version of
curlthat supports the
- Create a "deploy" workflow which uses
sqlite-utilsto insert the data from the CSV file(s) into a SQLite database, then uses
datasetteto publish the SQLite database as a web service with Cloud Run.
- Optionally, add a CNAME for a subdomain and associate the subdomain with the deployed service in Cloud Run.
For Cloud Run, you need to create a project, enable the Cloud Build and Cloud Run APIs, create and download a JSON private key for the service agent and add it to the repository’s secrets, and allow unauthenticated access to the deployed service.
Here's the git repository containing the workflows and data and the resulting datasette-powered web service.