A day in the life of BBCi Search - part 4
The spelling of search terms presents perhaps the biggest challenge to the BBCi Search team, and to the process of information retrieval on the web as a whole, in bringing back relevant and targeted results to the user.
When a search is made on the BBCi site, effectively just two pieces of information are passed to the search technology - the search query itself, and the referring page. With these two pieces of information, search is able to provide contextualised results in places where this is appropriate - for example, different top results for the search term 'china' depending on whether you are on the BBC News site or on the Antiques site.
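As an illustration only, the idea of contextualising a result from the query plus the referring page could be sketched as below. This is not the BBCi implementation: the table contents, the example.org URLs, and the names `section_of` and `top_result` are all assumptions made for the example, as is the idea that the site section can be read from the first segment of the referrer's path.

```python
from urllib.parse import urlparse

# Hypothetical lookup table: (query, site section) -> description of the top result.
CONTEXTUAL_TOP_RESULTS = {
    ("china", "news"): "News coverage of China, the country",
    ("china", "antiques"): "Antiques guide to china, the ceramic",
}

# Hypothetical fallback used when the referring page gives no useful context.
DEFAULT_TOP_RESULTS = {
    "china": "General results for 'china'",
}


def section_of(referrer):
    """Return the first path segment of the referring URL, lower-cased.

    Assumes (for this sketch only) that the site section is the first
    path segment, e.g. '/news/index.html' -> 'news'.
    """
    parts = [p for p in urlparse(referrer).path.split("/") if p]
    return parts[0].lower() if parts else ""


def top_result(query, referrer):
    """Pick a top result using the only two inputs: the query and the referrer."""
    q = query.strip().lower()
    contextual = CONTEXTUAL_TOP_RESULTS.get((q, section_of(referrer)))
    if contextual is not None:
        return contextual
    return DEFAULT_TOP_RESULTS.get(q)
```

With this table, the same query 'china' yields a different top result for a referrer under `/news/` than for one under `/antiques/`, and falls back to the general result from anywhere else.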
However, an analysis of search terms shows that 1 in 12 features an incorrect spelling. On December 11th this added up to over 30,000 search queries with an incorrect spelling. The combination of having to spell words and having to type is clearly a barrier to information access on the web for a large section of the online community. To provide a useful result set when one of the only two pieces of information you have about the search is wrong is a formidable task.
BBCi Search employs two mechanisms to combat this.
Firstly, 66% of searches with misspellings on the site were offered a "Did you mean?" spelling correction. This uses a spelling dictionary to suggest to the user that they may have meant a different word. Selecting the spelling correction link re-runs the search query with the new spelling.
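A dictionary-based "Did you mean?" suggestion can be sketched in a few lines. The article does not say how BBCi matches a misspelling against its dictionary, so this example swaps in the string-similarity matching from Python's standard `difflib` module. The word list, the `cutoff` value, and the function name `did_you_mean` are assumptions for illustration.

```python
import difflib

# Hypothetical spelling dictionary; a real one would be far larger.
DICTIONARY = ["eastenders", "weather", "football", "cbeebies", "news"]


def did_you_mean(query, dictionary=DICTIONARY, cutoff=0.8):
    """Return a corrected query string, or None if no correction is offered.

    Each word not found in the dictionary is replaced by its closest
    dictionary entry, provided the similarity ratio is at least `cutoff`.
    """
    corrected = []
    changed = False
    for word in query.lower().split():
        if word in dictionary:
            corrected.append(word)
            continue
        matches = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
        if matches:
            corrected.append(matches[0])
            changed = True
        else:
            # No sufficiently close entry: leave the word as typed.
            corrected.append(word)
    return " ".join(corrected) if changed else None


suggestion = did_you_mean("eastendrs")
if suggestion is not None:
    # Selecting this link would re-run the search with the suggested spelling.
    print(f"Did you mean: {suggestion}?")
```

Returning `None` when nothing changes reflects the fact that only some misspelt searches receive a suggestion. A word with no close dictionary entry simply passes through unchanged.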
Secondly, the editorial team at BBCi have also used misspellings as synonyms within their taxonomy - and because of this 96% of the searches for BBC channels, stations, programmes and brands with incorrect spelling on December 11th got the intended search as their top 'BBCi Best Link'. The "BBCi Best Links" and "BBCi Recommended Websites" are hand-chosen, and the editorial team have the ability to add synonyms to individual URLs. This means that the Cbeebies homepage can be returned as the top result even when the search query has been "cbbebies", "cbbies" or "ceebeebies".
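The synonym approach amounts to a hand-maintained lookup from every accepted spelling, right or wrong, to a single URL. A minimal sketch follows, using the three misspellings quoted above. The data structure, the placeholder example.org URL, and the names `build_index` and `best_link` are assumptions for illustration, not the editorial tool itself.

```python
# Hypothetical editorial data: each hand-chosen URL carries its own list of
# synonyms, including known misspellings. The URL is a placeholder.
BEST_LINKS = [
    {
        "title": "Cbeebies homepage",
        "url": "http://example.org/cbeebies/",
        "synonyms": ["cbeebies", "cbbebies", "cbbies", "ceebeebies"],
    },
]


def build_index(best_links):
    """Flatten the editorial entries into a term -> entry lookup."""
    index = {}
    for entry in best_links:
        for term in entry["synonyms"]:
            index[term.lower()] = entry
    return index


INDEX = build_index(BEST_LINKS)


def best_link(query):
    """Return the URL of the hand-chosen best link for a query, or None."""
    entry = INDEX.get(query.strip().lower())
    return entry["url"] if entry else None


# Each misspelling resolves to the same page as the correct spelling.
for q in ["cbeebies", "cbbebies", "cbbies", "ceebeebies"]:
    print(q, "->", best_link(q))
```

Because the lookup is exact rather than fuzzy, it only catches misspellings the editors have listed, but it can never send a user to the wrong page.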