Searching The Scotsman

 by Martin Belam, 7 July 2006

I've been surveying the strengths and weaknesses of site search across a number of British newspaper web sites. Today I'm going to look at a newspaper which isn't seen as one of the major nationals south of the border, but which has a significant and well-developed online presence: The Scotsman.

The Scotsman homepage

The user can reach the site's search facility by following a link on the navigational toolbar.

The Scotsman homepage

Selecting this link takes the user to the search homepage. This couples a search box with an extensive directory of content, which supports users who prefer to browse for information rather than search for it. A set of radio buttons allows the user to choose between searching 'News & features', or searching an archive of material that covers 1817 through to 1950.

The Scotsman's search page

Selecting 'News & features' allows the user to perform a search over the last archived year of stories. Each result comes with a title, the name of the original publication (as The Scotsman site also covers content from the Sunday edition of the paper and the 'Evening News'), a date stamp, and an extract of the article.

The Scotsman's search results

The Scotsman's search is powered by Convera, and offers an extremely comprehensive, and complicated, advanced search form - probably the most sophisticated one offered by any of the newspapers that I have looked at so far. Searches can be restricted to article headlines or bylines. There are options to select a single 'zone' or multiple 'zones' of the site, and restrict the results to pages from these zones. The date cannot be fine-tuned precisely, but a drop-down menu does offer some options for restricting the results by date - 1 week, 2 weeks, 1 month, 6 months or 1 year.

The Scotsman's search results

The advanced search form also allows users to select a 'style' of search and a 'mode' of search. The Scotsman's help page describes the 'styles' thus:

A narrow search style doesn't mean that you'll get a small number of search returns - it means that your query is interpreted in a highly specific way. The system will be very particular about the way you've spelled a word and it interprets your request very strictly. The pages it returns should have a strong relevance to your search term and because of this, bear in mind that very few of them will match precisely enough to be returned.

An Average search style, as the name implies, defaults to what is called an average expansion level. This means that it is a middle of the road option which interprets your search term reasonably strictly and provides more returns than the Narrow search style. Spelling is treated slightly more loosely. As you'll see, this is the default option.

A Broad search style has what is called a high expansion level. It treats your query as a very general one. Even pages which only have a fairly remote relevance to your query will be returned and this means that many web pages will match and so the downside of this tolerant approach is that you'll have a lot more material to wade through.

The help site goes on to suggest that:

One good tactic is to start with a Broad search and then proceed through Average and Narrow in order to refine what is returned to you.

I couldn't really see what this advice was getting at. Surely, if you are not varying your keywords, the results of a 'narrow' search will be the best ones, even if you haven't looked at the 'average' or 'broad' search results first?
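To make the three 'styles' a little more concrete, here is a rough Python sketch that treats each style as a spelling-similarity threshold. This is purely illustrative: the threshold values, the function names and the sample headlines are all invented for the example, and nothing here reflects how Convera actually implements its expansion levels.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds: how closely a word must resemble the query to count.
# These numbers are assumptions for illustration, not Convera's real settings.
STYLE_THRESHOLDS = {"narrow": 1.0, "average": 0.8, "broad": 0.6}

# Invented sample data, including two misspellings of "Edinburgh".
SAMPLE_HEADLINES = [
    "Edinburgh festival opens",
    "Edinburg tram plans",
    "Edinboro council row",
    "Glasgow derby",
]


def word_matches(query, word, style):
    """Return True if `word` is similar enough to `query` for the given style."""
    ratio = SequenceMatcher(None, query.lower(), word.lower()).ratio()
    return ratio >= STYLE_THRESHOLDS[style]


def search(query, headlines, style="average"):
    """Return the headlines containing at least one word that matches the query."""
    return [
        headline
        for headline in headlines
        if any(word_matches(query, word, style) for word in headline.split())
    ]


if __name__ == "__main__":
    for style in ("narrow", "average", "broad"):
        print(style, search("Edinburgh", SAMPLE_HEADLINES, style))
```

Under these assumptions, everything a narrow search returns is also returned by the average and broad searches, since each looser style only lowers the bar for a match.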

The different search modes really are a very advanced feature, offering concept, pattern or Boolean searching. Concept uses semantic indexing to find pages about the 'concept' of your keywords. Pattern evaluates your search and finds terms that resemble what you have entered, which the help suggests is particularly useful where you are unsure of the spelling of words. Boolean search allows complex logical queries to be built. These features seem very much aimed at the professional researcher rather than the casual user. The help is full of search and information retrieval jargon - and I'd love to see some figures on just how often these features are actually used.
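For readers unfamiliar with the jargon, a Boolean query simply combines terms with AND, OR and NOT. The Python sketch below shows the general idea on a few invented article extracts. The function and parameter names are hypothetical, and it makes no attempt to reproduce The Scotsman's actual query syntax.

```python
import re

# Invented article extracts, used only to demonstrate the matching logic.
SAMPLE_ARTICLES = [
    "Parliament debates new ferry contract",
    "Ferry strike hits island tourism",
    "Tourism figures rise in the Highlands",
]


def tokenize(text):
    """Split text into a set of lower-case words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def boolean_match(text, all_of=(), any_of=(), none_of=()):
    """Apply AND (all_of), OR (any_of) and NOT (none_of) conditions to a text."""
    words = tokenize(text)
    if not all(term.lower() in words for term in all_of):
        return False
    if any_of and not any(term.lower() in words for term in any_of):
        return False
    return not any(term.lower() in words for term in none_of)


def boolean_search(articles, **conditions):
    """Return the articles that satisfy every supplied condition."""
    return [article for article in articles if boolean_match(article, **conditions)]


if __name__ == "__main__":
    # Roughly: ferry AND NOT strike
    print(boolean_search(SAMPLE_ARTICLES, all_of=["ferry"], none_of=["strike"]))
```

Even this toy version hints at why such a feature suits the professional researcher: the user has to think in terms of logical conditions rather than simply typing a few keywords.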

Here is a summary of the features of The Scotsman's search facility.

The Scotsman - feature summary
Results per page: 10
Article excerpt or abstract: Yes
Date stamp (day/month/year): Yes
Time stamp (hours/minutes): No
Article word count: No
Navigational or Section information: No
Specifies original publication: Yes
Specifies original edition: No
Specifies original edition page number: No
Results display colour-coded: No
Search terms highlighted in results: No
Relevancy score (%): No
Destination URL displayed: No
Sponsored links featured in results: No
Site offers web search: No
Default search: Site search of last archived year
RSS feed of search results: No
Advanced search options: Yes - comprehensive, including different modes and styles of search
Search by date-range: Yes - generic ranges from advanced search

<added 10th July>

Thanks to Sober and Industrious, who pointed out that I had suffered from an horrendous case of banner and promo blindness when I originally posted this entry - The Scotsman does have a search box on the front page, which can clearly be seen in this screenshot of the page.

The Scotsman front page search box

It is just that somehow I didn't manage to notice it.

