Guess which Jan Moir article is missing from the Daily Mail's search results?

by Martin Belam, 19 October 2009

Funny old world, the Internet, eh? If you search the Daily Mail website today for the most recent articles by or about Jan Moir, there seems to be one missing. I wonder if you can guess which one it is?

Daily Mail search for Jan Moir with one key result missing

I think it must just be one of those weird coincidences that looks more suspicious than it is when your site is under intense scrutiny on the web. If you click the 'All by this author' link the notorious Stephen Gately article appears, so it hasn't been removed from the index. It looks to be the only web version of one of her columns which doesn't have a headline preceded by "JAN MOIR", which is perhaps why it isn't listed when you search for her name. Maybe that happened when the original online headline was hastily revised on Friday?

Daily Mail screengrab showing both headlines

I wonder, for once in this whole affair, if this is cock-up rather than conspiracy. I'd be interested to know which bits of the page their search engine indexes: does it look at comments (I think not), 'read more' boxes, and so on?

Neither the original nor the subsequent versions of this story include the words 'Jan Moir' in the main body.

Usually they put this in the headline ("JAN MOIR: spouts rubbish" as you can see from your screenshot) - without it in the headline, there is nowhere else on the page for it to appear (there's also the author attribution - but it seems random to me whether this is there or not).

So her latest poisonous article IS in her 'all by this author' page (as presumably metadata pulls this in), but ISN'T in the search results because the Mail's search engine can't find her name on the article.

PS All of which I know because I looked into it a bit more after this

PPS Can you make your write-comments box deeper?!?

Yes, I'm sure you are right. One of the significant differences between 'site search' and 'Google search' is that Google will have no doubt that page is relevant for the phrase 'Jan Moir' because there will be lots of link anchor text mentioning her pointing to it, and the HTML scrape from Googlebot will include the byline. The Mail's database set-up, however, may not add the name of the author into the search index as a field in itself. We have a similar thing on The Guardian site, where it is possible to 'lose' articles for really obvious keyword phrases if they actually only appear in a couple of special fields in the CMS and not the body text.
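
To make that concrete, here is a minimal sketch in Python of how an article can go missing from a site search when the author name lives only in a metadata field. This is not how the Mail's or The Guardian's search actually works. The field names (headline, body, author), the placeholder headlines and both helper functions are assumptions invented for the illustration.

```python
from collections import defaultdict


def build_index(articles, fields):
    """Build a toy inverted index using only the named fields."""
    index = defaultdict(set)
    for article_id, article in articles.items():
        for field in fields:
            for word in article.get(field, "").lower().split():
                # Strip simple punctuation so "MOIR:" is indexed as "moir".
                index[word.strip(".,:;!?'\"")].add(article_id)
    return index


def search(index, query):
    """Return the ids of articles that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    return set.intersection(*(index.get(word, set()) for word in words))


# Hypothetical articles: the second has lost its "JAN MOIR:" headline prefix,
# so the name survives only in the author metadata field.
articles = {
    1: {
        "headline": "JAN MOIR: An example column headline",
        "body": "Body text that never names the columnist.",
        "author": "Jan Moir",
    },
    2: {
        "headline": "A revised headline without the name prefix",
        "body": "Body text that never names the columnist.",
        "author": "Jan Moir",
    },
}

# Index built from visible text only, versus one that also includes the author.
text_only = build_index(articles, ["headline", "body"])
with_author = build_index(articles, ["headline", "body", "author"])

print(search(text_only, "Jan Moir"))
print(search(with_author, "Jan Moir"))
```

With the text-only index, article 2 drops out of a search for 'Jan Moir' even though it is still stored and still attributed to her. Adding the author field to the index brings it back, which matches the way the 'All by this author' link can find the column while a keyword search cannot.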

