More details on the Linux user base of the BBC, The Telegraph and The Guardian

by Martin Belam, 5 November 2007

A mistake can have unintended consequences, and a nice one following Ashley Highfield's original claim that the BBC only has 400-600 Linux users is that it has thrown a bit of a spotlight on OS statistics in the UK media landscape.

First of all, Ashley has posted on the BBC's new BBC Internet Blog to upsize the estimated Linux user base to between 36,600 and 97,600.

Ashley Highfield on the BBC Internet Blog

Secondly, Neil McIntosh of the Guardian came out with some figures for Linux usage for The Guardian. He claimed 13,000 users in a day - well, most of a day - up until 4pm actually.

Ian Douglas of The Telegraph also blogged about the OS breakdown that they see. He didn't provide a complete table of figures, but he indicated that the figures were based on a sample of 13 million users, and from those he gave, it roughly worked out to:

Usage of The Telegraph web site by OS - October 2007

| OS | % | Users |
|---|---|---|
| Windows XP | c. 77% | 10,000,000+ |
| Windows Vista | - | c. 990,000 |
| Mac | 7.31% | 973,473 |
| Linux | 0.97% | 130,024 |
| Windows 98 / ME / NT / 95 / CE | - | each >130,000 |
| Sun OS | - | 2,670 |
| Windows 3.1 | - | 453 |
| Unix | - | 368 |
| Commodore Amiga (Genius!) | - | 8 |
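As a rough sanity check on that "13 million users" sample, you can work backwards from the two rows where Ian gave both a percentage and a user count. This is just my own arithmetic on the quoted figures, not anything The Telegraph published, and the function name is made up for the example. Both rows imply a sample of roughly 13.3 to 13.4 million, which fits.

```python
# Rows from the Telegraph table where both a percentage and a count were given.
rows = {
    "Mac": (7.31, 973_473),
    "Linux": (0.97, 130_024),
}


def implied_total(percent: float, users: int) -> float:
    """Total sample size implied by one (percentage, user count) pair."""
    return users / (percent / 100)


for name, (percent, users) in rows.items():
    total_millions = implied_total(percent, users) / 1e6
    print(f"{name}: implies a sample of about {total_millions:.1f} million users")
```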

I think it is really useful to have that kind of information in the public domain, and was very impressed that The Telegraph were happy to release it.

Some of the confusion around Ashley's clarification of the figures arises because the BBC doesn't have an absolute canonical set of stats. In part I believe that is because historically the www.bbc.co.uk and news.bbc.co.uk farms were managed by different groups, so there were never any unified logfiles. Now, I think there is also probably an element of not being 100% consistent in the rulesets applied. For example, do you count the requests that hit feeds.bbc.co.uk or newsrss.bbc.co.uk equally with everything else, or do you assume that the majority of these requests are simply feed-fetching bots that would skew your true user figures?
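To show how much a ruleset choice like that can move the numbers, here is a minimal sketch. The sample records are invented and the function is hypothetical; I have no idea how the BBC's metrics system is actually implemented. The only thing taken from above is the pair of feed hostnames.

```python
from collections import Counter

# Hosts named above as likely to be dominated by feed-fetching bots.
FEED_HOSTS = {"feeds.bbc.co.uk", "newsrss.bbc.co.uk"}


def count_requests(requests, include_feeds):
    """Count requests per OS label, optionally skipping the feed hosts."""
    counts = Counter()
    for host, os_label in requests:
        if not include_feeds and host in FEED_HOSTS:
            continue
        counts[os_label] += 1
    return counts


# Invented sample records: (host, OS label reported by the client).
sample = [
    ("news.bbc.co.uk", "Windows"),
    ("www.bbc.co.uk", "Linux"),
    ("feeds.bbc.co.uk", "Linux"),
    ("newsrss.bbc.co.uk", "Linux"),
]

print(count_requests(sample, include_feeds=True))
print(count_requests(sample, include_feeds=False))
```

Same logs, two rulesets, and the Linux count differs by a factor of three in this toy data. Neither answer is wrong as such, they just answer different questions.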

There is some more information at hand though, as on the backstage.bbc.co.uk mailing list one of the BBC News technical overlords, Kevin Hinde, has posted a breakdown of the figures that he gets to see via the metrics system available to him. The figures are for September 2007, and are broken into usage across all *.bbc.co.uk domains, and usage on the specific news.bbc.co.uk domain.

(Note that they are listed in this table in order of popularity across *.bbc.co.uk)

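Usage of the BBC web site by OS - September 2007

| Operating System | Users - *.bbc.co.uk | % | Users - news.bbc.co.uk | % |
|---|---|---|---|---|
| Windows | 247,012,744 | 88.74% | 113,519,850 | 90.49% |
| Macintosh | 17,353,438 | 6.23% | 10,866,724 | 8.66% |
| Nokia | 3,675,224 | 1.32% | 150,007 | 0.12% |
| Liberate | 2,784,762 | 1.00% | 187 | - |
| SonyEricsson | 2,116,766 | 0.76% | 180,916 | 0.14% |
| BlackBerry | 1,921,066 | 0.69% | 363,497 | 0.29% |
| Motorola | 1,062,323 | 0.38% | 52,587 | 0.04% |
| Symbian | 925,465 | 0.33% | 161,462 | 0.13% |
| Samsung | 802,450 | 0.29% | 20,939 | 0.02% |
| LG | 216,972 | 0.08% | 8,218 | 0.01% |
| Orange | 134,995 | 0.05% | 54,518 | 0.04% |
| Sagem | 104,371 | 0.04% | 265 | - |
| T-Mobile | 61,687 | 0.02% | 22,694 | 0.02% |
| O2 | 39,747 | 0.01% | 11,315 | 0.01% |
| Sharp | 38,373 | 0.01% | 968 | - |
| NEC | 30,606 | 0.01% | 9,684 | 0.01% |
| Panasonic | 16,369 | 0.01% | 25 | - |
| Linux | 15,886 | 0.01% | 6,832 | - |
| Sprint | 13,175 | - | 7,338 | 0.01% |
| BenQ | 12,008 | - | 91 | - |
| DOS | 9,300 | - | 1,026 | - |
| Philips | 5,853 | - | 62 | - |
| VK | 3,926 | - | 1,052 | - |
| ZTE | 3,523 | - | 318 | - |
| Unix | 3,224 | - | 2,764 | - |
| Sanyo | 1,656 | - | 116 | - |
| Toshiba | 1,236 | - | 175 | - |
| Siemens | 1,067 | - | 77 | - |
| Sun | 539 | - | 308 | - |
| Linux-gnu | 171 | - | 86 | - |
| IRIX | 88 | - | 49 | - |
| AIX | 85 | - | 42 | - |
| HP-UX | 48 | - | 32 | - |
| Treo | 30 | - | 10 | - |
| OSF1 | 11 | - | 10 | - |
| Palm | 11 | - | 10 | - |
| Lobster | 10 | - | 8 | - |
| Nextel | 2 | - | 1 | - |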

There has been a call in the comments on the BBC Internet Blog for the BBC to officially release this data on a regular basis. I can't see any reason why they shouldn't, to be honest, except perhaps that every month it would start a chorus of "I set my user agent string to Windows XP because of badly coded sites, therefore your figures must be wrong by a factor of [insert figure from your own imagination here]".
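The complaint does have a kernel of truth to it. Any stats package has to guess the OS from whatever the browser sends, and a naive version of that guess looks something like the sketch below. The matching rules and both user agent strings are my own illustrative assumptions, not what the BBC's tools do.

```python
def classify_os(user_agent: str) -> str:
    """Naive OS guess based on substrings in the User-Agent header."""
    ua = user_agent.lower()
    if "windows" in ua:
        return "Windows"
    if "macintosh" in ua or "mac os x" in ua:
        return "Macintosh"
    if "linux" in ua:
        return "Linux"
    return "Other"


# Illustrative strings only: a Linux browser reporting honestly, and the same
# user after changing the user agent string to look like IE on Windows XP.
honest_ua = "Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1.8) Gecko/20071022 Firefox/2.0.0.8"
spoofed_ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

print(classify_os(honest_ua))
print(classify_os(spoofed_ua))
```

The second request gets filed under Windows, and nothing in the logs can tell you otherwise. What the logs also can't tell you is how many people actually do this, which is where the imaginary multiplication factors come in.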

It is interesting to note that as a proportion of users, there are more Windows and Mac users of the news.bbc.co.uk domain than across the BBC as a whole. This, of course, could lend weight to the argument that the BBC News site is most often consumed during office hours, on networks that are locked down to the two major OS variants.
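For the record, the arithmetic behind that observation is just adding the two percentage columns from Kevin's figures. The helper name is mine, and the result is only as accurate as the rounded percentages in the table: roughly 95% Windows or Mac across the whole BBC, against a little over 99% on news.bbc.co.uk.

```python
# (Windows %, Macintosh %) taken from the BBC table, September 2007.
shares = {
    "*.bbc.co.uk": (88.74, 6.23),
    "news.bbc.co.uk": (90.49, 8.66),
}


def combined_share(windows: float, mac: float) -> float:
    """Combined Windows and Mac share, rounded to two decimal places."""
    return round(windows + mac, 2)


for domain, (windows, mac) in shares.items():
    print(f"{domain}: {combined_share(windows, mac):.2f}% Windows or Mac")
```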

Or, I guess you could equally argue that it is the sheer weight of Linux, Sun and Unix users hitting the h2g2, Doctor Who and much-missed Cult site that skews the figures in the opposite direction.

Both of those, though, would be drawing broad stereotypes out of a set of data that isn't intended to provide that information - let's not forget in all of this debate that server logs simply tell you what was passed to the server at the time it received a request - not who made the request and why. Nevertheless, it is very good to have more detail out in public - it would be great to see figures from the other ex-broadsheet newspapers as well.
