What can news organisations like news.com.au learn from the BBC’s approach to online voting fraud?

by Martin Belam, 29 January 2013

When systems fail and embarrass a news organisation, the temptation is always to blame the technology or the programmers. But no computer forces editors to commission content based on flawed sources.

I got a reminder of the voting systems I used to work on at the BBC when Neil Thackray of The Media Briefing tweeted a link to this blog post from October: “The Ubermotive Guide to Media Influence”.

In it a “hacker” describes how they were able to influence the editorial content of news.com.au by posting a lot of fake votes into online polls that the site was running. To demonstrate how much they’d compromised the system, they even took to deliberately dead-locking polls on 50%-50%. Nobody at the company noticed anything was amiss until it was picked up by Reddit, by which time the manipulated polls were being referenced in stories and forming the basis for commissioning.

Flashback to the mid-2000s, and my producer role at the BBC for “Online Voting”. Central “New Media”, where I worked, had a system called polling.pl. It was written in Perl, and its admin panel allowed site producers to set up “self-service” votes. The system worked well enough, but didn’t handle high load terribly well, and it wasn’t very hard for determined users to “vote stuff”.

Factual & Learning meanwhile had a system called log2results. This burnt the casting of a vote into the server logs, and a harvester script came along and counted them later. It also analysed votes for odd patterns. It was hard to influence the outcome of a vote, but this system couldn’t display dynamic counts back to the user.
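To make the log-and-harvest idea concrete, here is a minimal sketch in Python. It is not the BBC's code: the log format (timestamp, IP address, poll ID and choice separated by spaces) and the function name `harvest` are assumptions made purely for illustration. The point it shows is that casting a vote only appends a line to a log, and the totals are worked out afterwards in a separate pass.

```python
from collections import Counter


def harvest(log_lines, poll_id):
    """Count votes for one poll from log lines.

    Assumed (hypothetical) line format, space separated:
        <timestamp> <ip> <poll_id> <choice>
    """
    totals = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines rather than guess at them
        _timestamp, _ip, poll, choice = parts
        if poll == poll_id:
            totals[choice] += 1
    return totals
```

Because nothing is tallied at the moment of voting, a design like this cannot show a live running total, which matches the limitation described above.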

News Online naturally had their own system as well. It handled load really well, but couldn’t, for architectural reasons, be served on pages away from the news.bbc.co.uk sub-domain. It also had a fatal flaw: if two votes were submitted at exactly the same time, the counter would sometimes reset to zero. This used to annoy programme makers and journalists a lot.
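The actual cause of that reset isn't recorded here, so what follows is only one plausible mechanism, sketched in Python under stated assumptions: the count lives in a plain file, each request reads it, adds one and writes it back, and nothing locks the file. The function names and the file-based storage are hypothetical. The sketch replays a single unlucky interleaving step by step, so it is deterministic rather than relying on real concurrency.

```python
def read_count(path):
    """Read the counter; an empty or missing file is treated as zero."""
    try:
        with open(path) as f:
            text = f.read().strip()
    except FileNotFoundError:
        return 0
    return int(text) if text else 0


def simulate_unsafe_collision(path):
    """Replay one bad interleaving of two unlocked read-modify-write requests."""
    count_a = read_count(path)         # request A reads the real total
    writer_a = open(path, "w")         # A opens for writing, truncating the file
    count_b = read_count(path)         # request B now sees an empty file: 0
    writer_a.write(str(count_a + 1))   # A writes the correct new total
    writer_a.close()
    with open(path, "w") as writer_b:  # B overwrites it with 0 + 1
        writer_b.write(str(count_b + 1))
    return read_count(path)
```

Starting from a file containing 500, this interleaving leaves the counter at 1. Holding a lock around the whole read-modify-write, or using a database's atomic increment, would avoid the problem.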

For a good few months I was on the fool’s errand of trying to reconcile these three systems into a single requirements set and technical spec for one über-arching-pan-BBC solution that would scale, be dynamic, self-service to set up, and difficult to stuff with spoof votes. And which would keep three technical and editorial departments with three very different cultures happy that it was better than their system which it was replacing. And possibly shower users with glitter and unicorns into the bargain…

In the course of it, I learned an awful lot about the lengths that people will go to in order to fix relatively meaningless online votes.

Now I always tried to be clear at the BBC that online votes should be for fun and not serious matters, but that didn’t stop a myriad of producers around the BBC using them inappropriately. I’d be the person making the awkward phone call to some BBC local radio website saying “You really shouldn’t be running a competition with a prize worth £1,000 being awarded to an individual on the basis of an easily spoofed online vote that only has 52 votes.”

The vote that was targeted the most in my time there was the Today programme’s “Listener’s Lord” poll. A fringe candidate had unswerving online loyalty, and we uncovered what seemed like a whole university campus computer network repeatedly voting for them. The beauty of the log2results approach was that you could order the counting script to discount anything that matched a particular behaviour, and remove those votes from the final score. The people trying to manipulate the vote got no feedback that the results of their efforts weren’t being counted, so never felt the need to change their attack strategy.
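As a rough illustration of discounting at count time, here is a Python sketch. The rule it applies (ignore every vote from any network that cast more than a set number of votes) is an assumption chosen for the example, not the rule that was actually used, and the names `network_of` and `count_discounting_floods` are hypothetical. Grouping IPv4 addresses by their first three octets is likewise just a convenient stand-in for "looks like one campus network".

```python
from collections import Counter


def network_of(ip):
    """Hypothetical grouping rule: the first three octets of an IPv4 address."""
    return ".".join(ip.split(".")[:3])


def count_discounting_floods(votes, max_per_network):
    """Tally (ip, choice) votes, ignoring networks that voted too often.

    Returns (totals, flooded_networks). Nothing is reported back to
    voters, so a discounted network gets no signal it was excluded.
    """
    votes = list(votes)
    per_network = Counter(network_of(ip) for ip, _choice in votes)
    flooded = {net for net, n in per_network.items() if n > max_per_network}
    totals = Counter(
        choice for ip, choice in votes if network_of(ip) not in flooded
    )
    return totals, flooded
```

The design choice that matters is that the filtering happens in the counting script, after the fact. The voting page behaves identically for everyone, so the people stuffing the vote have no reason to change tactics.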

But all that was the mid-2000s, and news organisations have had an awful lot of time to get their heads around the validity of online votes since then. The BBC had, and still has, a strict set of editorial guidelines. The current text states that:

“‘Straw polls’ - including phone, text and online votes - have no statistical or numerical value.

They can be an effective form of interaction with the audience, illustrating a debate, but they should only be used with an explicit reference making it clear to audiences that they are self-selecting and not representative or scientific. Such votes cannot normally be said even to represent the audience for the programme or website, they only represent those who chose to participate. This applies even when there is a large response.

They should not be referred to in our output as a ‘poll’. The term ‘straw poll’ itself is widely misunderstood and should normally be avoided in output.”

For me, that is what has gone wrong with the process at news.com.au, and it is a set of guidelines that any news organisation would do well to adopt.

Although people will look at the technical details, and blame the website security and the programmers, in the end it is the editorial decision to take the numbers at face value, and to commission content on the back of this kind of vote that has been their downfall.
