Doctor Who and Britain, yes. Sarah Jane and London, no. The mystery of what makes a BBC top-level 'site'...

 by Martin Belam, 30 March 2010

Yesterday, Erik Huggers posted to the BBC Internet blog some further details of the site closures announced in the recent BBC strategic review, which, for me, served only to illustrate even more starkly how arbitrary they are.

He listed 400-odd top level BBC URLs in a text file that was by no means comprehensive. The logic of what is or isn't included as a "top level directory" is unfathomable outside of the Corporation.

Put it this way - there is a prize for anyone outside the BBC who can explain to me why bbc.co.uk/doctorwho (Sci-fi/fantasy TV show produced by BBC Wales shown on BBC One) and bbc.co.uk/torchwood (Sci-fi/fantasy TV show produced by BBC Wales shown on BBC One) are in the list, but the Sarah Jane Adventures website URL bbc.co.uk/sja (Sci-fi/fantasy TV show produced by BBC Wales shown on BBC One) isn't.

And as Tom Loosemore pointed out on Twitter, the BBC Vision site launch blog lists 242 new BBC sites deployed in the last two years alone.

Other notable absentees include anything local, like bbc.co.uk/london or bbc.co.uk/leeds, and specific sport areas like bbc.co.uk/football, which is updated daily. However, included in the list are places I used to hang out like /cult (closed 2007) and /collective (closed 2008) - and as Robert Andrews noted on PaidContent, a fifth of the list falls into this category of already-dead sites.

Mind you, who knew there was bbc.co.uk/zombies?

The list only seems to confirm that the BBC continues to commit one of the cardinal sins of IA - having a navigation and URL structure that is all about representing the internal organisation structure, and nothing about ease of use and transparency for the audience.

The intention may be, as Erik states, to produce more 'significant, coherent, regularly updated' websites in the future - but leaving lots of the sites you currently update every day off the official list of BBC top-level directories is only going to lead to confusion.

On a more positive note, in the post, Huggers asks for ideas on preserving the currently mothballed sites as they change technology platform. My tuppence:

  1. Write something that crawls and parses them client-side to capture all the assets and store flat HTML.
  2. Put those flat HTML files on a couple of specific webservers set up with the old-fashioned BBC LAMP stack, and direct requests for bbc.co.uk/some-archived-site there. Load isn't going to be a major issue for these newly static pages.
  3. Set up a generic redirect for any request to bbc.co.uk/cgi-bin/* or bbc.co.uk/cgi-perl/* made from within those directories, returning an HTML message that the site is archived, thus killing 96% of links to any interactivity that existed.
  4. ...
  5. Non-profit!

Sorry, couldn't resist the geek joke at the end of the list there...
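
For what it's worth, steps 1 to 3 could be sketched roughly as below. This is only an illustration under stated assumptions, not anything the BBC runs: the `ARCHIVED_SITES` set, the notice text and the function names are all hypothetical, and for simplicity the CGI rule here catches every /cgi-bin/ or /cgi-perl/ request rather than only those coming from archived directories.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical values for illustration only; not real BBC configuration.
ARCHIVED_SITES = {"cult", "collective"}
DYNAMIC_PREFIXES = ("/cgi-bin/", "/cgi-perl/")
ARCHIVE_NOTICE = (
    "<html><body><p>This site is archived; "
    "interactive features are no longer available.</p></body></html>"
)


class AssetCollector(HTMLParser):
    """Step 1: collect every URL a page links to or embeds."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page's own URL.
                self.urls.append(urljoin(self.base_url, value))


def collect_assets(base_url, html):
    """Return absolute URLs of the links and assets found in one page."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.urls


def route(path):
    """Steps 2 and 3: decide how a request path should be handled."""
    if path.startswith(DYNAMIC_PREFIXES):
        # Step 3 (simplified): old CGI endpoints get a flat archive notice.
        return ("notice", ARCHIVE_NOTICE)
    first_segment = path.strip("/").split("/")[0]
    if first_segment in ARCHIVED_SITES:
        # Step 2: serve the flat HTML copy from the archive webservers.
        return ("static", path)
    return ("live", path)
```

A crawler would call `collect_assets` on each fetched page and queue whatever it returns, while `route("/cult/news/index.shtml")` would send the request to the flat-file servers and `route("/cgi-bin/vote.pl")` would return the archive notice.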

4 Comments

The BBC's strategic review was widely misreported as halving the size of the site. Having bothered to read it, I never really understood the idea of halving the "top-level directories in the form bbc.co.uk/sitename".

You could fulfill this by inserting /tv/ or /radio/ in front of most of the show names (I'm not saying this is a good IA!). Obviously there were the other criteria as well, around the editorial priorities.

But it doesn't seem to me that having a fixed target for URLs of a certain format added anything to the editorial priorities, especially as the nice round number (half) wasn't actually rooted in the principles.

I suppose it gives them a target to aim at, but as it's so easy to hit just by changing folder structures, what's the point?

And as to what to do with the old ones, why not just leave them there? The Guardian has pages from 2005 still on the web such as this one. Why not just do the same and mark them as archived?

I work within the BBC, so my answering this probably doesn't count, but the reason /sja, /london etc. aren't included is that they're all redirects.

/topgear, /doctorwho and /torchwood are all sites that actually live within that top level structure.

But /sja redirects to /cbbc/sja so lives within the CBBC directory.

And as of November last year all the BBC Local sites have moved into the News CPS and now live under the news.bbc.co.uk/local/ directory.

I'll shut up now.

While Vision may have launched 242 sites since January 2008, it's worth looking at the trend in the past two-and-a-quarter years: 138 launches in 2008, 93 in 2009 and just 11 so far this year. Also, following the strategy of "every television programme will have its own website" and many of them being automatic, Vision's "site launches" include things like /larkrise, /desperateromantics, /silentwitness, /restaurant and /whodoyouthinkyouare, all of which are redirects to /programmes.

I also work for the BBC, but I see Ryan has already won the prize for answering the /sja question. :-)

I think the right approach is to use structured URIs, e.g. something in the form /programmes/doctorwho/ - with top level aliases reserved almost exclusively for redirects for a small number of sites (so, only using them where it's most appropriate, such as for news and likely more highly trafficked destinations, but in the case of the latter at least, still just as an alias that can be retired without impacting the permanent URL, which I would not change).

I think the way /sja redirects to /cbbc/sja is a sensible usage.

I think the beeb is actually broadly doing the right thing here, but should use redirects more - this would make it clear to everyone what's really "top level" and what isn't just by visiting the URL and seeing if it redirects or not, and solve the problem of what to do with older URLs.

The automated redirections are interesting, but I note that /silentwitness redirects to /programmes/b007y6k8, and that auto-generated component is... blegh!

I think if the number of shows is too many to manage easily under one URL space the idea of an /archived/ destination isn't a bad one.
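
To illustrate the "see if it redirects" test, here is a rough sketch that splits short URLs into genuine top-level sites and aliases. It assumes a hand-written redirect table built from the examples in these comments rather than live HTTP requests, and the names `REDIRECTS` and `classify` are made up for the example.

```python
# Redirects mentioned in the comments above; anything absent from this
# table is assumed to live at its top-level URL.
REDIRECTS = {
    "/sja": "/cbbc/sja",
    "/silentwitness": "/programmes/b007y6k8",
}

SHORT_URLS = ["/doctorwho", "/torchwood", "/topgear", "/sja", "/silentwitness"]


def classify(paths, redirects):
    """Split paths into genuine top-level sites and redirecting aliases."""
    genuine = [p for p in paths if p not in redirects]
    aliases = {p: redirects[p] for p in paths if p in redirects}
    return genuine, aliases


genuine, aliases = classify(SHORT_URLS, REDIRECTS)
```

Here `genuine` holds /doctorwho, /torchwood and /topgear, while `aliases` maps /sja and /silentwitness to their permanent homes.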
