Why are the UK & US so far ahead with linked data and the semantic web compared to Germany?
I've just come back from a really enjoyable first Datajournalism meetup in Berlin, which will no doubt generate a spate of blog posts here and on guardian.co.uk over the next couple of weeks. I gave a talk entitled 'Datajournalism at guardian.co.uk', which I intend to publish in some format in the not too distant future - in the meantime, yesterday's linklog special edition lists the things I referenced in the talk.
The afternoon closed with a panel discussion featuring me (representing Guardian News & Media); Tom Scott, Jem Rayfield and Silver Oliver from the BBC; ex-LA Times journalist Eric Ulken; Gerd Kamp of the Deutsche Presse Agentur Newslab; and Ole Wintermann from the Bertelsmann Foundation.
From left to right: Ole Wintermann, Jem Rayfield, Silver Oliver, Gerd Kamp, Eric Ulken, Tom Scott, Martin Belam. Photo: Georgi Kobilarov
One of the themes that emerged was that the Berlin audience felt that the US and UK had taken a lead in the linked data and semantic web field. As one audience question put it:
"All the presentations from the British and US speakers seemed to be about we did this thing, and we did this thing. All the presentations from Germany seemed to be about how we struggled to do anything. Why is that?"
I'm not entirely convinced that we came up with a very good answer, but there seemed to be four main strands to it.
The common language
Dull but true - the common language between the UK and the US has made it easier to build things in English. The English language version of Wikipedia is three times as big as the German one, and DBpedia, a crucial hub in the linked data ecosphere, uses English Wikipedia entries as the basis of its common identifiers.
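To make the identifier point concrete, here is a minimal sketch of how an English Wikipedia article title can be turned into a DBpedia-style resource URI. The function name dbpedia_uri is my own invention for illustration, and the escaping is a simplification: it assumes the usual pattern of swapping spaces for underscores under http://dbpedia.org/resource/, and does not claim to reproduce DBpedia's exact encoding rules for unusual characters.

```python
from urllib.parse import quote

# Base of DBpedia resource URIs (English edition).
DBPEDIA_RESOURCE_BASE = "http://dbpedia.org/resource/"


def dbpedia_uri(wikipedia_title: str) -> str:
    """Build a DBpedia-style resource URI from an English Wikipedia title.

    Hypothetical helper for illustration only; the escaping below is an
    assumption and may differ from DBpedia's own rules in edge cases.
    """
    # Wikipedia article names use underscores in place of spaces.
    name = wikipedia_title.strip().replace(" ", "_")
    # Percent-encode anything that is not safe in a URI path segment.
    return DBPEDIA_RESOURCE_BASE + quote(name, safe="")


if __name__ == "__main__":
    for title in ("Berlin", "The Guardian"):
        print(title, "->", dbpedia_uri(title))
```

Because the identifier hangs off the English article name, two organisations that both tag a story with the same English Wikipedia entry end up pointing at the same URI without ever having to agree on anything between themselves.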
The use of the English language also drives the scope of some of the services being built. It makes sense for The Guardian's World Government Data store to include datasets from Australia, Canada, New Zealand and the US alongside the UK because they share a common language, which makes data retrieval, cross-referencing and comparison easy. Whether it would be useful to add state-published data from a German Landtag or the Austrian government remains to be seen.
'The unofficial API and data arms race'
Having a common language also means sharing a common media space. In the UK we've made some tentative steps to forge a common vision of linked data use by news organisations. However, businesses like The Guardian, The Telegraph, the LA Times and the New York Times are all feverishly looking across the Atlantic at what the others are doing, and that rivalry, no doubt destined to be dubbed with some hideous neologism like 'co-op-a-tition', drives innovation and new data-driven journalism services.
The BBC has done it
With BBC Earth and the 2010 World Cup site, the BBC remains the only big media organisation that has used semantic web technologies on the production side and then been open in blogging and presenting about what it has done. That provides a demonstration of a potential business case to other media organisations in the UK. The size and funding model of the BBC have allowed it to experiment in this sphere and carefully build things 'the right way'.
The history of 'freedom of information' court cases
Eric Ulken made the point that, loath as he was to say it as an American, maybe more Europeans needed to reach for their nearest lawyer. He stressed that open government data and freedom of information legislation in the US came about after years of lawsuits trying to force grudging state and federal departments to release information. The plethora of official data being issued in the UK and US in machine-readable and reusable formats is fuelling the development of apps and services.
So what should Germany do next?
Between us on the panel we suggested a few things that might help kick-start the process:
- Organise hacks and hackers meetups like those taking place in the US and UK
- Support and publicise existing sites and services like datenjournalist.de and the DPA Newslab
- Have more 'Web of data' meetups in Berlin and elsewhere
- Start campaigning for the release of local and national state information
As I mentioned at the beginning, I'm sure this is just the first of a flurry of blog posts to emerge out of my trip...