A day in the life of BBCi Search - part 1

Written by Martin Belam
Published 27 March, 2003


A day in the life of BBCi Search - Introduction

Since BBCi launched in November 2001, the improved search offering has been collecting data on how BBC website users search both the BBC's own website and, through the homepage Websearch, the wider web.

Given such a mass of data, the easiest way to aggregate and make sense of it has been to measure the search terms that are most popular. Indeed, the BBCi homepage has a panel displaying the three most popular search terms of the moment, and an editorial and taxonomy team at the BBC constantly monitor the searches gaining high volume in order to match the correct content to them.

BBCi homepage, showing the top three recent search trends

The team use reports that are generated hourly, daily and weekly to monitor the activity of the users. An hourly email alert identifies developing trends in the search terms, and specialist reports focus on trends within searches that have been generated specifically on the BBC News and BBC Sport sites. Daily lists of the most popular search terms are generated for both the site as a whole and the homepage Websearch, and weekly summaries focus on searches that originate in specific content areas of the site, like Food or Cult TV.

Screenshots of hourly and daily internal BBCi Search statistical reports

However, it became clear that the searches that make the top 500 searches of the day are not necessarily representative of search behaviour as a whole. The majority of users on BBCi put something unique into the search box, and 80% of the users of the service put in search terms that never appear on any of the statistical reports, because they only happen once or twice during the course of a day.
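The gap between the top-500 reports and overall behaviour can be illustrated with a short calculation. The following is a minimal sketch in Python rather than the reporting code actually used: the function name `long_tail_summary`, the light normalisation of terms, and the sample data are all assumptions made for illustration, and it counts searches rather than users.

```python
from collections import Counter


def long_tail_summary(terms, top_n=500, rare_max=2):
    """Summarise how much search activity falls outside a top-N report.

    `terms` is any iterable of raw search strings (hypothetical input).
    """
    # Light normalisation so "Weather" and "weather " count as one term.
    counts = Counter(t.strip().lower() for t in terms if t.strip())
    total = sum(counts.values())
    if total == 0:
        return {"total": 0, "distinct": 0, "top_share": 0.0, "rare_share": 0.0}

    # Searches accounted for by the N most popular terms.
    top_total = sum(c for _, c in counts.most_common(top_n))
    # Searches whose term appears only once or twice in the period.
    rare_total = sum(c for c in counts.values() if c <= rare_max)

    return {
        "total": total,
        "distinct": len(counts),
        "top_share": top_total / total,
        "rare_share": rare_total / total,
    }


# Made-up sample data, purely for illustration.
sample = [
    "eastenders", "weather", "weather", "weather", "EastEnders",
    "eastenders", "blue peter badge", "cbbc games", "cbbc games",
    "radio 4 schedule",
]
summary = long_tail_summary(sample, top_n=2)
print(summary)
```

On a real day's log, a high `rare_share` alongside a modest `top_share` would correspond to the pattern described above, where most activity never reaches the statistical reports.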

I therefore wanted to find out what it was that this vast majority of users were actually doing on the service, and had to find a way of analysing their behaviour without relying on our existing model of aggregating popular search terms.

Methodology

One way to go about this was to isolate one individual day, and to analyse in depth the searches that had been made. The log files collected by the search service contain information not only on the terms used, but on the time the search took place, and the area of the site that the search originated from.

I chose Wednesday December 11th 2002, as it was a weekday during UK school terms, and there were no major breaking news stories or broadcast events to dominate results. Because the school calendar affects traffic to BBCi web services, a term-time weekday is the most typical kind of day in the year, and so shows the most typical use of the service.

I also know from experience that search behaviour is affected by large breaking news stories, for example the loss of the space shuttle Columbia, or major UK broadcast events, like Test The Nation or the launch of BBC3.

To analyse the search terms I took 10 separate 6-minute samples from the log files, at different times of day, from 1am to 10pm. This was still too much information to classify, so I restricted the sample to searches that had been made from the BBCi homepage at www.bbc.co.uk and from the 404 error page. These are the most context-neutral pages on the site, and restricting the sample to them reduced the amount of information I had to deal with to a considerable but manageable 15,000 search terms.
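The sampling step described above can be sketched as follows. This is an illustrative Python version, not the Perl that was used. The record layout `(timestamp, origin, term)`, the origin labels `"homepage"` and `"404"`, the evenly spaced window start times and the year in the example date are all assumptions, since the actual log format and sample times are not given here.

```python
from datetime import datetime, timedelta


def sample_windows(records, window_starts, minutes=6, origins=("homepage", "404")):
    """Keep records that fall inside any sampling window and come from a chosen origin.

    `records` is an iterable of (timestamp, origin, term) tuples (assumed layout).
    """
    length = timedelta(minutes=minutes)
    windows = [(start, start + length) for start in window_starts]
    kept = []
    for timestamp, origin, term in records:
        if origin not in origins:
            continue  # drop searches from context-heavy areas of the site
        if any(start <= timestamp < end for start, end in windows):
            kept.append((timestamp, origin, term))
    return kept


# Assumed date and spacing: ten windows, evenly spread from 1am to 10pm.
day = datetime(2002, 12, 11)
starts = [day + timedelta(hours=1) + timedelta(minutes=140 * i) for i in range(10)]

# Made-up log records, purely for illustration.
records = [
    (day + timedelta(hours=1, minutes=2), "homepage", "weather"),   # inside first window
    (day + timedelta(hours=1, minutes=7), "homepage", "news"),      # just after the window
    (day + timedelta(hours=8, minutes=3), "news", "iraq"),          # in a window, wrong origin
    (day + timedelta(hours=22, minutes=5), "404", "eastenders"),    # inside last window
]
kept = sample_windows(records, starts)
print([term for _, _, term in kept])
```

Filtering on origin before checking the time windows keeps the sketch cheap on large logs, since most records are discarded by the first test.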

I then took further 1-minute samples across the whole service and classified an additional 3,000 search terms as a control sample, to ensure that searches from the homepage and the 404 error page were representative of the usage of the service as a whole.
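One simple way such a representativeness check could be made is to compare the share of each hand-assigned category in the main sample with its share in the control sample. The sketch below is an assumption about how that comparison might look, not a description of the method actually used, and the category labels in it are hypothetical.

```python
from collections import Counter


def category_shares(labels):
    """Return each category's share of the given list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}


def largest_share_gap(main_labels, control_labels):
    """Find the category whose share differs most between the two samples."""
    main = category_shares(main_labels)
    control = category_shares(control_labels)
    gaps = {
        cat: abs(main.get(cat, 0.0) - control.get(cat, 0.0))
        for cat in set(main) | set(control)
    }
    if not gaps:
        return None, 0.0
    worst = max(gaps, key=gaps.get)
    return worst, gaps[worst]


# Hypothetical category labels, purely for illustration.
main_sample = ["tv"] * 5 + ["news"] * 3 + ["sport"] * 2
control_sample = ["tv"] * 3 + ["news"] * 4 + ["sport"] * 3
worst, gap = largest_share_gap(main_sample, control_sample)
print(worst, round(gap, 3))
```

A small largest gap would suggest the homepage and 404 searches resemble the service as a whole; how small is small enough is a judgment call that this sketch does not settle.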

I measured the search activity on the day both quantitatively, using Perl scripts and spreadsheets, and qualitatively, by hand-classifying individual search terms.

In part 2 of this article I will look at the UK and regional focus of the search terms.

