The Guardian launches 3 million linked data music album pages

by Martin Belam, 4 August 2011

This week at the Guardian we launched something like 3 million album pages, allowing users to rate, review and buy just about anything that has ever been released. Well, provided it has a MusicBrainz ID anyway.

On the site Lisa Van-Gelder has explained a bit about how we built them, and music editor Caspar Llewellyn Smith has explained how to use the new pages. You can also find Alexis Petridis advising on how to write the perfect album review.

He also admits that the Bee Gees made his favourite album of all time, a choice that has been slagged off for being too populist, for being wilfully obscure, and for deliberately picking something populist in order to avoid being wilfully obscure.

(Mine would be “Spirit of Eden” by Talk Talk.)

Release early, release often

“Release early, release often” could be a mantra urging bands to emulate The Beatles and keep up a rapid album release rate, but I’m talking about software. We’ve put the album pages live on the site before some of the surrounding search and discovery infrastructure, and our discussion platform plans, are in place. Sometimes that is the nature of agile - I’d rather have them out there now than wait for perfection. That approach hasn’t, it must be said, met with total approval from our users.

We are the robots

This is another set of pages on the Guardian website featuring our little robot illustration, coupled with the disclaimer:

“This page has been automatically assembled and may not be entirely accurate. If you spot any problems with the page email userhelp@guardian.co.uk”

It only took the pages being live for a few minutes for people to start spotting data parsing errors.

My view, though, is that I would rather have the 3 million pages live with the opportunity to correct mistakes, than spend the time and money auditing them in advance. It doesn’t matter a great deal if the School of Seven Bells page doesn’t have much detail on it if nobody ever visits it.

An honest mistake?

“Music is my life. I dislike having conversations about, listening to or reading other peoples opinions or being asked for mine. Music approaches a zero-degree of meaning dependent entirely on listener response: talking about it is redundant, and so is a music press.” - OldOwl

I am genuinely baffled why you would take the time to register and post that comment if you didn’t want to share your opinion, and thought talking about music on a music website was redundant. Genuinely.

If you build it right, they will improve it

You can dynamically generate a page for any album based on the MusicBrainz ID. Matt Cox has already written a nifty little bookmarklet that will automatically switch you from looking at an artist on MusicBrainz to looking at them on the Guardian. Isn’t the lazyweb great sometimes?

The serious point being that this can be done because we have made our URL schema REST-ish, consistent, and obvious, and the URLs are easily hackable in a good and predictable way. If we have a keyword tag for an artist, then the URL is human readable. If we don’t, then the URL features the raw MusicBrainz ID in it. A mapping ensures that guardian.co.uk/music/artist/5700dcd4-c139-4f31-aa3e-6382b9af9032 redirects to guardian.co.uk/music/kraftwerk.
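To make the idea concrete, here is a minimal Python sketch of that kind of lookup. It is not the Guardian's actual code: the `KEYWORD_SLUGS` dictionary, the function names and the ID validation are all assumptions made for illustration, and only the two Kraftwerk URLs come from the example above.

```python
import re

# Hypothetical mapping from MusicBrainz artist ID to a keyword-tag slug.
# The single entry mirrors the Kraftwerk example; the real mapping is
# held by the Guardian and is not reproduced here.
KEYWORD_SLUGS = {
    "5700dcd4-c139-4f31-aa3e-6382b9af9032": "kraftwerk",
}

# MusicBrainz IDs are UUIDs: 8-4-4-4-12 hexadecimal digits.
MBID_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

BASE = "guardian.co.uk/music"


def raw_artist_url(mbid: str) -> str:
    """Build the predictable URL that carries the raw MusicBrainz ID."""
    mbid = mbid.strip().lower()
    if not MBID_PATTERN.match(mbid):
        raise ValueError(f"not a MusicBrainz ID: {mbid!r}")
    return f"{BASE}/artist/{mbid}"


def canonical_artist_url(mbid: str) -> str:
    """Return the human-readable URL if a keyword tag exists,
    otherwise fall back to the raw-ID URL."""
    raw = raw_artist_url(mbid)  # also validates the ID
    slug = KEYWORD_SLUGS.get(mbid.strip().lower())
    return f"{BASE}/{slug}" if slug else raw


print(canonical_artist_url("5700dcd4-c139-4f31-aa3e-6382b9af9032"))
```

Because the raw-ID form can be built from nothing but the MusicBrainz ID, a third party can construct a link without knowing whether a keyword tag exists, and leave it to the site to redirect to the friendlier address.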

“The antithesis of rock‘n’roll”

I have to confess that the first time the Guardian’s music website linked to this blog, it was because the sadly-missed legendary writer Swells was describing my music list-making habit as “the antithesis of rock‘n’roll”. He would no doubt be appalled that you can make music lists on the Guardian as well now. Amongst other things you can conjure up your fantasy festival line-up (here is mine) and lists of your favourite artists, or the worst acts you’ve ever seen.

It is another step along the way to “mutualising” our culture coverage, which we started earlier this year with our automatic book pages and list-making tools.

Of course, the Guardian isn’t the only place on the Internet where you can review albums, but we hope we have provided a space where our already very active music community can help each other to find and discover great new music, which is one of the main reasons for having a music website at all.

If you’d like to find out more about our “culture” projects, I’ll be presenting “The IA of /culture” at EuroIA in Prague in September, and “A new chapter for the Guardian Books site” at Enterprise Search Europe in London in October.

3 Comments

'Spirit of Eden' would be mine too, without question!

Greg Nisbet | 4 August 2011

pity you neglected to do the basics ie set up a decent browse structure and the titles and lack of h1 suck as well.

neuromancer | 5 August 2011

Hi,
Helpful article - thanks.
However, I think the title is very misleading - as far as I can tell, although I will be delighted if I am wrong.
The Guardian is not actually publishing any "linked data album pages".
I can't find any machine-readable results from the Guardian's URIs, never mind RDF.
It is of course exciting to find the Guardian using linked data, in the sense of consuming, which is what I believe it is doing.
It would be even better to find it publishing, as implied by the title of your article.
As I say, I would be pleased for you to tell me I am wrong, and how to get the machine-readable data from one of the Guardian's URIs.
Best
Hugh

Hugh Glaser | 5 August 2011