Amazon & AOL - utilizing a user's context to improve search results

 by Martin Belam, 25 April 2003

I mentioned previously the excellent and informative set of papers at CHI 2003 - Best Practices and Future Visions for Search UIs: A Workshop. Another one of these that has really caught my eye is Peter Gremett's paper on 'Utilizing a Users Context to Improve Search Results' [word doc 182kb], which focuses on usability work carried out by AOL in 2001 on the Amazon e-commerce interface.

Whilst there are clear differences between a retail site like Amazon and the BBCi site, there are some striking parallels in the problems posed with conveying to users the 'scope' within which they are searching. The BBCi site, like Amazon's product portfolio, is divided into broad categories, some of which have cross-references and overlaps, and some of which have implied cross-references and overlaps which are not reflected within either the site or data structure.

Additionally, I have a keen interest in the Amazon interface and the AOL user base. The interest in the Amazon interface stems from the fact that repeated user testing has shown that a 'tabbed' interface to search is problematic. Yet I also observe that many search engines (Google, Yahoo!, and All the Web for starters) use a tabbed or tab-like interface on their results pages, and that two of the most popular websites on the planet - Amazon and Hotmail - use tabs, as does the Microsoft Windows operating system within its dialogue boxes. My interest in the AOL user base stems from the fact that within the "new media world" AOL users are often perceived as inexperienced people who stumbled across the internet via a free CD. Given the size of their user base this can't hold true for all of their users, but I assume that their design process must include the realisation that their interface is going to be the first introduction to the "internet" for a lot of people. The alchemy of how we make a tabbed search interface usable for novice internet users, as part of the BBC's role as a 'trusted guide to the internet', clearly benefits from being informed by how similar issues have been dealt with by these two major websites.

Peter Gremett's study concentrates on how to leverage the contextual information from a user to inform the processes of the search technology, whilst making it transparent, coherent and intelligible to the user. It also contains several points that correspond very closely to behaviour I have observed or recorded on BBCi search. One of the first paragraphs in the paper states that:

it was observed that the majority of the time users browsed first and then searched when necessary.

On BBCi we also find similar behaviour when a user is within a 'scoped' search area. BBCi Search currently works in such a way that using the search box in the top-right of the screen on the BBCi films site will initially search solely over the content of that site. Studies of the search terms used in these scoped searches show that 85% of the terms entered are 'appropriate' for that scope. By appropriate I mean that it seems not unreasonable that a user there who searches for 'spiderman', 'dvd' or 'films on tv this weekend' would expect to get results in the context of the films site.

Amazon, and the AOL shop interface that Gremett was involved with designing, employ a system where a combination of tabs, drop-down menus and context alters the behaviour of the search. In a broad confirmation of our observation that 85% of scoped search terms are appropriate for their scope, Gremett found that for most users scoping the results in this way was effective, but also observed that:

In order for this technique to work, the system must know where the user is located within the site hierarchy at all times

and that Amazon also provides an 'all Amazon' get-out

This item [All products] is always present in the drop-down menu regardless of where the user is in the taxonomy

This is comparable to the BBCi offering of an 'all BBC' tab within all of the search result interfaces (excepting the 'walled garden' searches utilised by the areas of the site provided for children). Having said that 85% of searches are appropriate for their scope, BBCi search has to cope with the fact that the remaining 15% require a search that extends beyond that, and that within the 85% there may be searches that would be better served by extending the scope across the whole of the BBCi offering.

Gremett illustrates a similar situation with the classification of the film 'Shrek'

Now lets say a user is looking for the movie "Shrek". The user proceeds to enter the term "Shrek" into an e-commerce search box from the movies category without contextualizing the search. The system has no idea that the user is looking for the animated movie. As far as the system knows the user could be looking for Shrek toys, Shrek sound track, Shrek kids clothes, Shrek Gameboy games, etc.. By knowing where the user is located and utilizing that as a default limiter for search queries it increases the relevancy of the results. In the above example, if the location of the website is utilized, the Shrek movie will be contained in the results and will likely be the first item.

but this doesn't work all the time

there were some problems with products that were not cross referenced between categories. for example, several participants mentioned that they would expect to find the "Shrek" movie in "Kids & Toys", "Comedy" or "Children & Family" categories.

A comparable example of this would be a search for a band like "The White Stripes" on the Radio 1 site, where there may be items of interest to the user not only within the Radio 1 alt section, but elsewhere on BBCi at Radio 6 Music, Top of the Pops, the Music artist database, and BBC News entertainment sites - and possibly even from the Press Office site because "Fell in love with a girl" was used in a BBC trail last year. Here, even though the search may be 'appropriate' for the scope, limiting the results may not be the best default option for the user.
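To make the trade-off concrete, here is a minimal sketch of the idea, not a description of how Amazon, AOL or BBCi actually implement it. It uses the user's current section as the default limiter, as Gremett describes, but also counts the matches that fall outside that scope so that the interface could offer to broaden the search. The names `Item`, `search`, `scoped_search` and `ALL_SCOPE`, and the sample catalogue, are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical label for the 'all BBC' / 'All products' get-out option.
ALL_SCOPE = "all"


@dataclass(frozen=True)
class Item:
    title: str
    categories: frozenset  # an item may be classified in several places


def search(items, query, scope=ALL_SCOPE):
    """Return items whose title contains the query, limited to `scope`."""
    q = query.lower()
    return [
        item for item in items
        if q in item.title.lower()
        and (scope == ALL_SCOPE or scope in item.categories)
    ]


def scoped_search(items, query, current_section):
    """Search the user's current section by default, and report how many
    further matches exist elsewhere so the interface can offer to broaden."""
    in_scope = search(items, query, scope=current_section)
    everywhere = search(items, query)
    return in_scope, len(everywhere) - len(in_scope)


# Made-up catalogue loosely based on the 'Shrek' example in the paper.
catalogue = [
    Item("Shrek (DVD)", frozenset({"movies", "kids"})),
    Item("Shrek soundtrack", frozenset({"music"})),
    Item("Shrek Gameboy game", frozenset({"games"})),
]

results, elsewhere = scoped_search(catalogue, "shrek", "movies")
print([item.title for item in results], elsewhere)
```

Because the DVD is cross-referenced under both "movies" and "kids" in this made-up data, it is found from either section. Remove the second category and a search from "kids" returns nothing, which is the cross-referencing problem Gremett's participants ran into.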

Gremett's paper also includes some excellent observations of how a web page is viewed:

When participants arrived at a category page they would quickly scan the page looking for information that would help them complete the task. However, at this point they often did not see items that looked helpful or there were too many choices to explore. On the category pages the users tended to pay the most attention to the center content first and then would scroll down on the page looking for information. After scanning the center the users would look to the left hand navigation. The large majority of people at this point would enter a search.

This is an observation that I believe applies to both content pages and search results pages. Within search logs we see repeated use of search terms for areas of the site that the user could have navigated to had they spent time looking at every section of the page, and we see a large number of examples of another Gremett observation, that:

If the naming convention did not match their model or they had no idea how to further classify the object within the site's hierarchy, they then entered a search term to try and accomplish their goal

At the BBC we attempt to overcome the nomenclature problem with the use of synonyms and 'best bets'. However, one crucial, and extremely frustrating, finding of our usability studies on search results pages is that users see just that - the results. Features like related links, spelling correction, help, and additional navigation are invisible to many users. It is informative to ask a user to count the number of links on a search result page. The majority will say that there are 10 links if 10 results are displayed - all of the features, navigation and contextual help seem to be in a blind spot.
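As a rough illustration of what synonyms and 'best bets' involve, here is a small sketch. It is not the BBC's implementation: the tables `SYNONYMS` and `BEST_BETS`, the functions `normalise` and `results_page`, and every entry in the data are hypothetical. The query is first mapped onto the site's own vocabulary, and an editor-chosen link for that query is then placed at the top of the results list itself, on the assumption that the list is the one part of the page users reliably look at.

```python
# Hypothetical editorial data; none of these entries come from the BBC's systems.
SYNONYMS = {
    "movies": "films",
    "telly": "tv",
}

BEST_BETS = {
    "films": "BBCi Films homepage",
    "tv listings": "What's On",
}


def normalise(query):
    """Lower-case the query and swap each word for its preferred synonym."""
    words = query.lower().split()
    return " ".join(SYNONYMS.get(word, word) for word in words)


def results_page(query, organic_search):
    """Put an editor-chosen best bet, if any, ahead of the organic results."""
    normalised = normalise(query)
    results = list(organic_search(normalised))
    best_bet = BEST_BETS.get(normalised)
    if best_bet is not None:
        # Avoid listing the same link twice if the engine also returned it.
        results = [best_bet] + [r for r in results if r != best_bet]
    return results


def fake_search(q):
    # Stand-in for a real search engine, for illustration only.
    return [f"result for {q!r}"]


print(results_page("Movies", fake_search))
```

A user who types "Movies" on a site that files everything under "films" still gets the hand-picked link first, without needing to notice anything outside the list of results.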

One thing that really emerges for me is that it doesn't matter if, like Amazon, you are trying to sell products, or, like AOL, you are trying to re-skin another service to suit your audience, or, like the BBC, you are trying to provide information and entertainment - the obstacles placed in front of users trying to achieve a goal are very similar when you are dealing with a large quantity of products or content, especially those which can have ambiguous context or multiple classifications.
