Ten years of HubMed

It's been 10 years, last weekend, since HubMed first went online. Inspired by TouchGraph's Google Browser, I learned a bit of Perl (mostly because PubCrawler was the closest example code I could find) and had a go at making something similar using PubMed's "Related Articles" API.

Inspired by Mark Pilgrim, I made a script to convert PubMed's EUtils output to RSS, and later Atom. It was great to be able to have new papers arrive in a feed aggregator, and a couple of years later it became possible to do so using PubMed itself. In the meantime I played around with the web interface to HubMed, adding features - some of which are still there, some of which have decayed. Gunther Eysenbach/JMIR kindly supported some part-time work on the site in 2005-6, and I wrote a paper (which - like most papers - obviously wanted to be a blog post) about some of HubMed's features.

So, 10 years later, it's time for a new version of HubMed. It's linked from the front page, but hasn't replaced the old version yet.

There have been some changes to the web over the last 10 years, mostly coming from innovations in web standards and browser support for those standards. These days, there's no need for any server-side scripting to run HubMed: all the data comes directly from the EUtils server (as allowed by CORS), gets turned into JavaScript objects, and is then rendered as views using Backbone. Attention metrics for each article get pulled in, client-side, from various sources, particularly Altmetric. You can bookmark articles in Mendeley, or download them directly as RIS or BibTeX thanks to bibutils.
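To give a rough idea of what "no server-side scripting" means in practice, here is a minimal sketch of the first step: building an ESearch request URL and turning the reply into plain objects. It is not HubMed's actual code. It assumes ESearch's JSON output (`retmode=json`), and the names `buildSearchUrl`, `parseSearchResult` and `SearchResult` are made up for illustration.

```typescript
// Sketch only: assumes ESearch's retmode=json output. The function and
// interface names here are hypothetical, not taken from HubMed.

const EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils";

interface SearchResult {
  total: number;   // total number of matching records
  pmids: string[]; // PubMed IDs on this page of results
}

// Build the URL for one page of PubMed search results.
function buildSearchUrl(term: string, offset: number = 0, pageSize: number = 20): string {
  const params: Array<[string, string]> = [
    ["db", "pubmed"],
    ["term", term],
    ["retstart", String(offset)],
    ["retmax", String(pageSize)],
    ["retmode", "json"],
  ];
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `${EUTILS_BASE}/esearch.fcgi?${query}`;
}

// Turn the JSON text of an ESearch reply into a plain object.
function parseSearchResult(body: string): SearchResult {
  const result = JSON.parse(body).esearchresult;
  return {
    total: Number(result.count),
    pmids: result.idlist.map((id: unknown) => String(id)),
  };
}
```

In a browser, the URL would be requested directly (CORS permitting), the response text passed to `parseSearchResult`, and the resulting objects handed to whatever model and view layer is in use.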

Many of these new features are still works in progress, which is exciting: browsers are adding support for native search inputs; back buttons and list offsets are still an unsolved problem when using infinite scroll; there's still a need for an extensible way to ask the browser how it would prefer to handle various actions (bookmark, save, etc.; Web Intents is trying to solve this); it's still too difficult to get a URL for the PDF of each article, and still too expensive to read those articles (let alone delegate the reading to software). Clicking an author's name in HubMed shows you other articles they wrote, but also articles by people of the same name - a problem which ORCID is trying to solve by assigning each researcher a unique identifier.

As for features that still survive, one of my favourite things that databases can do is "More Like These", and HubMed now makes that even easier: hold down Ctrl/Cmd while clicking subsequent "Related" links, and the related articles will be merged; the search gets more and more focused, and hopefully achieves an equivalent goal to the original PubMed TouchGraph, even if it doesn't visualise all of the connections between similar articles.
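One way to picture the merging step is below. This is an assumption about how merging could work, not a description of what HubMed does: each "Related" list is taken to be a set of PMIDs with similarity scores, and scores for the same article are summed so that articles related to several of the chosen seeds rise to the top. The `RelatedArticle` shape and `mergeRelated` function are hypothetical.

```typescript
// Hypothetical shape: one related article with a similarity score.
interface RelatedArticle {
  pmid: string;
  score: number;
}

// Merge several "Related" lists into one ranked list. Scores for the same
// PMID are summed, so an article that appears in more lists ranks higher.
function mergeRelated(lists: RelatedArticle[][]): RelatedArticle[] {
  const totals: Record<string, number> = {};
  for (const list of lists) {
    for (const item of list) {
      totals[item.pmid] = (totals[item.pmid] ?? 0) + item.score;
    }
  }
  return Object.keys(totals)
    .map((pmid) => ({ pmid, score: totals[pmid] ?? 0 }))
    // Highest combined score first; ties broken by PMID for a stable order.
    .sort((a, b) => b.score - a.score || (a.pmid < b.pmid ? -1 : 1));
}
```

Summing is only one possible choice; intersecting the lists, or weighting the most recently clicked article more heavily, would focus the results in different ways.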

There's also an update to HubMed's Citation Finder, though it's still missing a bulk export option, and I think I can make finding articles even easier now that free text citation parsing is working again in EUtils.

Most importantly, a next step: each researcher's collection of saved articles needs to move out of silos and become available to the web. When searching in Metatato, the list of search results knows which articles you already have saved, as it can query your local database, and can add articles directly; I think this needs to be expanded so that any database or search index can be overlaid with information about your existing collection. It may be synced and stored on your local computer, or it may be somewhere else behind an API, but allowing third-party tools to query and analyse that collection as you build it will hopefully turn out to be really useful for searching and filtering.
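As a small illustration of that overlay idea, the sketch below marks each search result according to whether its PMID is already in a saved collection. Everything here is hypothetical: the `SearchHit` shape, the `overlayCollection` name, and the assumption that the collection can be reduced to a list of PMIDs, wherever it happens to be stored.

```typescript
// Hypothetical shapes for a search result and its annotated form.
interface SearchHit {
  pmid: string;
  title: string;
}

interface AnnotatedHit extends SearchHit {
  saved: boolean; // true if the article is already in the user's collection
}

// Overlay a saved collection (given as a list of PMIDs) on search results.
function overlayCollection(hits: SearchHit[], savedPmids: string[]): AnnotatedHit[] {
  // Build a lookup table once so each hit is checked in constant time.
  const saved: Record<string, boolean> = {};
  for (const pmid of savedPmids) {
    saved[pmid] = true;
  }
  return hits.map((hit) => ({ ...hit, saved: saved[hit.pmid] === true }));
}
```

The list of saved PMIDs could come from a local database or from a remote API; the search interface only needs some way to ask for it.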

There's one more thing related to search and discovery, which is your social graph: the people that you have chosen as reliable sources of information. Information filters through them to you, and search results in Google and Twitter already incorporate social cues to highlight useful information. It should get easier to follow not only the research produced by the most knowledgeable people in any particular area, but also the research that they find the most interesting.