“If we don’t understand the financial system, we aren’t doing our jobs as journalists” - Chris Taggart of OpenCorporates at Hacks/Hackers London
The latest Hacks/Hackers London meet-up was crammed with talks from people at Bloomberg, Thomson Reuters and the Financial Times. Striking a rather different organisational note at the end of the evening was Chris Taggart. I’ve previously seen Chris talk about OpenlyLocal, but this talk was about another open data project — OpenCorporates.
“OpenCorporates” - Chris Taggart
Perhaps the first thing to say about Chris’s talk is that it provoked a strong reaction on Twitter:
“My mind is blown by opencorporate. I've thought of about 15 stories in as many minutes.” - Peter Newlands on Twitter
“Chris from opencorporate is gonna turn me into a journalist if he's not careful” - @schizdazzle on Twitter
In the glitzy surroundings of the Bloomberg office, and following talks from three of the behemoths of financial data and journalism, Chris opened by saying that compared to some of their offerings “There is nothing to see here. This is just a toy.”
This was disingenuous at best. Chris explained that OpenCorporates.com has a simple, but huge, goal: an entry for every company that exists as a legal entity in the world. That is no small undertaking, and the database has grown from 3 million companies to 50 million in the space of a couple of years.
The site gathers the data through a mixture of means, both fair and foul. Where there is the opportunity to do so, as in Norway and New Zealand, they have used APIs and data dumps from the official bodies registering companies in various territories. Where there isn’t, they’ve had to rely on screen-scraping the data.
One of the questions this revelation sparked from the floor was about how much “bad data” gets into the system. Chris said that they design the scrapers to throw alerts and warnings when they can’t parse data, rather than scrape junk into the system. He also said that often, when people complain about erroneous data on OpenCorporates, the error is in the source material, not the way that OpenCorporates has integrated it.
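For the more technically minded, here is a rough sketch in Python of what “fail loudly rather than scrape junk” can look like. This is my own illustration, not OpenCorporates’ code: the field names, the company-number pattern and the `parse_row` and `scrape` functions are all made up for the example.

```python
import logging
import re
from dataclasses import dataclass

logger = logging.getLogger("scraper")


class ParseError(ValueError):
    """Raised when a scraped row cannot be trusted."""


@dataclass(frozen=True)
class Company:
    number: str
    name: str
    jurisdiction: str


# Assumed format for the example: 6 to 10 letters or digits.
COMPANY_NUMBER = re.compile(r"^[A-Z0-9]{6,10}$")


def parse_row(row: dict, jurisdiction: str) -> Company:
    """Turn one scraped row into a Company, or refuse to."""
    number = (row.get("number") or "").strip().upper()
    name = (row.get("name") or "").strip()
    if not COMPANY_NUMBER.match(number):
        raise ParseError(f"unrecognised company number: {number!r}")
    if not name:
        raise ParseError(f"missing name for company {number}")
    return Company(number, name, jurisdiction)


def scrape(rows, jurisdiction):
    """Keep the rows that parse cleanly; warn about and set aside the rest."""
    companies, rejected = [], []
    for row in rows:
        try:
            companies.append(parse_row(row, jurisdiction))
        except ParseError as exc:
            logger.warning("skipping row %r: %s", row, exc)
            rejected.append(row)
    return companies, rejected
```

The point of the design is that a row which doesn’t look right never reaches the database: it is logged as a warning and kept to one side for a human to inspect.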
Chris outlined the main purposes of OpenCorporates. Firstly, it acts as a place where every legal entity it knows about has a unique and permanent ID and URL. My own business, Emblem Digital Consulting Ltd, has an entry there, for example. It can also help “clean” and reconcile data by illustrating the different instances of Tesco Mobile, or clarifying whether Microsoft-UK Ltd is actually the same legal entity as Microsoft Ltd in the UK. (It isn’t.)
It also offers journalists a powerful ability to search cross-jurisdiction. Chris illustrated this with an example showing how Mitt Romney’s corporate holdings could only be unravelled by tracking branches of the companies he was active in through the states that had stronger measures around disclosing information.
The site acts as a hub to aggregate additional data like health and safety violations or county court judgements against a company or company directors. And finally, it is becoming a platform itself, with an API providing developers and external businesses the opportunity to build their own products and services off the back of the data.
Chris explained why he thought what they were doing was important. “We are not about hardcore professionals, we are for people who don’t have access to all that data, NGOs and journalists in smaller organisations,” he said.
When Lehman Brothers was dissolved, it was thought they owned around 300 subsidiary legal entities. A more thorough post-mortem audit revealed over two thousand. “These companies aren’t doing this for fun, they are doing it to gain advantage and profit,” Chris said.
“The whole of the last ten years has shown us how important this information is. If we don’t understand what they are doing in the financial systems, then we aren’t doing our jobs as journalists.”