The Office for National Statistics produces a regularly updated dataset, “Deaths registered weekly in England and Wales, provisional”, as an Excel file each week, plus a summary Excel file for each year.
There is also an API which provides the same data as CSV.
To make this data accessible to web applications, I’m going to combine the historical data (assembled manually) in a git repository with the latest dataset from the ONS API, using a GitHub Action to keep it up to date (as described by Simon Willison).
- In GitHub, create a new repository.
- Add a `metadata.yml` file describing the data source and license.
- Add a `requirements.txt` file containing `datasette` and `sqlite-utils`. (Sketches of both files appear after this list.)
- Create a "fetch" workflow with a `schedule` trigger that uses cURL to fetch the latest version of the dataset to a file. If the data has been updated when the scheduled workflow runs, it will be committed and pushed back to the repository. Note: use at least `ubuntu-20.04` to get a version of `curl` that supports the `--no-progress-meter` option. (A sketch of this workflow is below.)
- Create a "deploy" workflow which uses `sqlite-utils` to insert the data from the CSV file(s) into a SQLite database, then uses `datasette` to publish the SQLite database as a web service with Cloud Run. (Also sketched below.)
- Optionally, add a CNAME for a subdomain and associate the subdomain with the deployed service in Cloud Run.
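For reference, a minimal `metadata.yml` might look like the following. The field names (`title`, `source`, `license`, and so on) are the ones Datasette displays on the site's index page; the values here are illustrative rather than copied from the actual repository, and the source URL is a placeholder for the dataset's landing page:

```yaml
# metadata.yml — illustrative values, not the exact contents of the repo
title: Deaths registered weekly in England and Wales, provisional
source: Office for National Statistics
source_url: https://www.ons.gov.uk/  # link to the dataset's landing page
license: Open Government Licence v3.0
license_url: https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
```

The `requirements.txt` is just the two packages named above, one per line: `datasette` and `sqlite-utils`.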
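The fetch workflow follows the git-scraping pattern: run on a schedule, download the CSV, and commit only if something changed. A minimal sketch, in which the cron schedule, file names, and the `DATA_URL` value are all placeholders to be replaced with the real ONS API endpoint:

```yaml
# .github/workflows/fetch.yml — a sketch; schedule, file names and
# DATA_URL are placeholders
name: Fetch latest data

on:
  workflow_dispatch:
  schedule:
    - cron: '0 10 * * *'  # daily at 10:00 UTC; match the ONS release cadence

jobs:
  fetch:
    runs-on: ubuntu-20.04  # curl here is new enough for --no-progress-meter
    steps:
      - uses: actions/checkout@v2
      - name: Fetch the latest CSV from the ONS API
        env:
          DATA_URL: https://api.beta.ons.gov.uk/v1/...  # the dataset's CSV endpoint
        run: curl --no-progress-meter --output ons-deaths.csv "$DATA_URL"
      - name: Commit and push if the data changed
        run: |
          git config user.name "github-actions"
          git config user.email "actions@users.noreply.github.com"
          git add ons-deaths.csv
          # exit cleanly when there is nothing new to commit
          git commit -m "Latest data: $(date -u)" || exit 0
          git push
```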
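The deploy workflow rebuilds the database from the committed CSV and publishes it; triggering it on pushes means the fetch workflow's commits kick off a redeploy automatically. Another sketch — the database, table, service, and secret names are assumptions, though `sqlite-utils insert … --csv` and `datasette publish cloudrun` are the actual commands those tools provide:

```yaml
# .github/workflows/deploy.yml — a sketch; names and secrets are placeholders
name: Deploy to Cloud Run

on:
  workflow_dispatch:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: '3.9'
      - run: pip install -r requirements.txt
      - name: Load the CSV into a SQLite database
        run: sqlite-utils insert deaths.db weekly_deaths ons-deaths.csv --csv
      - name: Authenticate to Google Cloud
        uses: google-github-actions/setup-gcloud@v0
        with:
          project_id: ${{ secrets.GCP_PROJECT }}
          service_account_key: ${{ secrets.GCP_SA_KEY }}
      - name: Publish with Datasette
        run: |
          datasette publish cloudrun deaths.db \
            --service ons-weekly-deaths \
            --metadata metadata.yml
```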
For Cloud Run, you need to create a project, enable the Cloud Build and Cloud Run APIs, create and download a JSON private key for the service account and add it to the repository’s secrets, and allow unauthenticated access to the deployed service.
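Roughly, that one-time setup looks like this from the `gcloud` CLI — the project, account, service, and region names here are all hypothetical, and the final step can equally be done from the Cloud Run console:

```bash
# one-time Google Cloud setup — all names are hypothetical
gcloud projects create ons-deaths-example
gcloud config set project ons-deaths-example
gcloud services enable cloudbuild.googleapis.com run.googleapis.com

# create a service account and download a JSON key for it; paste the
# key's contents into the repository's Actions secrets. The account also
# needs roles that let it build and deploy (e.g. Cloud Run Admin,
# Service Account User, Cloud Build Editor).
gcloud iam service-accounts create github-actions
gcloud iam service-accounts keys create key.json \
  --iam-account github-actions@ons-deaths-example.iam.gserviceaccount.com

# after the first deploy, allow unauthenticated access to the service
gcloud run services add-iam-policy-binding ons-weekly-deaths \
  --member=allUsers --role=roles/run.invoker \
  --region=us-central1 --platform=managed
```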
Here's the git repository containing the workflows and data, and the resulting Datasette-powered web service.