More thoughts on Google's sitelinks algorithm
I was writing yesterday about Google's choice of sitelinks for a domain name, and I was speculating, based on the evidence of the links they list for currybetdotnet, that there may be some hand-editing involved. What got me started on this train of thought was an article by Ann Smarty on Search Engine Journal. She suggested six factors that make up the 'sitelinks algorithm'.
- Surfers oriented
- Domain-authority oriented
- Internal-architecture oriented
- On-page SEO oriented
- Brand-strength oriented
- Competition oriented
One of the things that intrigued me about her theories was how well they mapped to my own experience with currybetdotnet. Google sometimes displays a selection of sitelinks under the listing for this blog, and I'm interested in why they appear and how the links are chosen, because they are certainly not the ones I would choose if given the option.
The seven articles Google uses as sitelinks for currybetdotnet are:
- Daily Sport brand hi-jacked by Russian RSS squatters
- Top 50 BBC Podcasts in Google Reader
- Reckless Records RIP - Part 1: An End Has A Start
- Doctor Who and the Vanishing Plaques
- Between a Northern Rock and a hard place
- Bloglines newspaper RSS subscriptions
- Merely trick photography
Of the theories that Ann proposes, I think a couple apply to currybetdotnet, but my findings on another couple are quite perplexing. Today I wanted to go through them all in turn.
Domain-authority oriented
This is one of Ann's suggested factors that I think currybetdotnet scores well on, and I think it contributes to the fact that the site has sitelinks at all. Obviously there are some domains on the web that you instinctively know are authorities, like Amazon or IMDB, but Google needs to measure the whole Internet for signs of whether a site is a good addition to their index or not. My site probably gives out the following 'trust' and 'quality' signals:
- Site has been consistently updated with new content for 6 years
- Site continually attracts new organic backlinks
- Site has gathered backlinks from other reputable domains (like bbc.co.uk, searchengineland.com, telegraph.co.uk, guardian.co.uk etc)
Competition oriented
Ann Smarty suggests that sitelinks appear when a site is a good match for a search query and there is little competition for that query. Testing currybetdotnet seems to back this part of the theory up.
Sitelinks appear for this site if you search for very specific queries related to the domain, like "currybet" and "currybetdotnet". However, if you make a broader query which currybetdotnet ranks for, but where there is obvious competition, like "blog bbc media search", the sitelinks disappear.
Similarly, an exact search for "Martin Belam" throws up currybetdotnet with the sitelinks intact. A broader one-word search for "martin" lists currybetdotnet on the second page of the SERPs (behind Martin Stabe, I note!) but without the sitelinks listed.
Brand-strength oriented
Whether my blog name has become a 'personal brand' is a debate for another day. For me, the significance of this factor in Ann's list is that from Google's point of view, the only site on the web where the phrases 'currybet' or 'currybetdotnet' appear a lot is this one. And virtually every other reference to 'currybet' on the web includes a link back to this domain. Algorithmically speaking, that might be a strong 'brand' indicator to Google.
Internal-architecture oriented
To find out how Google sees the architecture of currybetdotnet, I downloaded two files from Google's Webmaster tools - a list of internal links and a list of external links. I wanted to see if there was any correlation between these numbers and Google's choice of sitelinks.
For internal structure, the most popular pages were those that appeared in the variable slots in the right-hand navigation at the time Google last deep-crawled the site. These are things that had recently been published, recently commented on, or featured in my popular articles and categories selections. This bore little or no correlation to the 'sitelinks' Google has chosen.
I performed the same exercise with the list of external links, and I found none of the sitelinks to be in the top 100 pages with the most external links pointing at currybet. I can't see that, for this site at least, Google's sitelinks choice is based on linking patterns.
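The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the exact process I followed: the function names `rank_pages` and `sitelinks_in_top_n`, the page paths and the link counts are all hypothetical, standing in for whatever a Webmaster tools export actually contains.

```python
from typing import Dict, List


def rank_pages(link_counts: Dict[str, int]) -> List[str]:
    """Order pages from most to fewest links; ties break alphabetically."""
    return sorted(link_counts, key=lambda page: (-link_counts[page], page))


def sitelinks_in_top_n(link_counts: Dict[str, int],
                       sitelinks: List[str],
                       n: int = 100) -> List[str]:
    """Return the sitelinks that are among the n most-linked pages."""
    top_pages = set(rank_pages(link_counts)[:n])
    return [page for page in sitelinks if page in top_pages]


# Hypothetical link counts, standing in for a Webmaster tools export.
example_counts = {
    "/about": 40,
    "/popular-article": 25,
    "/recent-post": 12,
    "/old-post": 1,
}
# Hypothetical sitelinks to check against the ranking.
example_sitelinks = ["/old-post", "/popular-article"]

# Which of the sitelinks are among the two most-linked pages?
print(sitelinks_in_top_n(example_counts, example_sitelinks, n=2))
```

An empty result for a given `n` would mean that none of the sitelinks are among the most-linked pages, which is what I found when looking at the top 100 pages by external links.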
On-page SEO oriented
This theory doesn't really apply to currybetdotnet. My pages are all coded in a search engine friendly way, and because I am using Movable Type as my CMS, all the anchor text is consistent. However, none of the pages on the site is optimised for a specific keyword - the on-page factors are simply based on the title of each article.
Surfers oriented
On this count I don't think the sitelinks Google has chosen for me serve my audience very well at all. Looking at my traffic figures, despite these prominent links from Google I don't see these articles performing especially well in terms of page impressions. The best performing page of the seven isn't in the top fifty pages viewed so far this year.
Sitelinks factors in numbers
This table sums up the numbers around some of the factors I've looked at for these pages.
| Article | Publishing date | Comments | Internal links | External links | Page views (year-to-date) |
|---|---|---|---|---|---|
| Daily Sport brand hi-jacked | 4/11/2007 | 0 | 3 | 0 | 42 |
| Top 50 BBC Podcasts | 22/11/2007 | 1 | 7 | 0 | 640 |
| Dr Who and the Vanishing Plaques | 9/8/2007 | 0 | 5 | 4 | 138 |
| Between a Northern Rock | 17/9/2007 | 0 | 3 | 0 | 16 |
| Bloglines newspaper blog RSS subscriptions | 23/5/2007 | 0 | 8 | 14 | 127 |
| Merely trick photography | 7/9/2007 | 1 | 3 | 5 | 147 |
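For anyone who wants to play with these figures, here is a rough Python sketch that holds the rows from the table and orders them by a chosen column. The `Row` type and the `ordered_by` helper are names made up for this illustration, and the numbers are simply the ones from the table above.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    article: str
    internal_links: int
    external_links: int
    page_views: int


# Figures copied from the table above.
ROWS = [
    Row("Daily Sport brand hi-jacked", 3, 0, 42),
    Row("Top 50 BBC Podcasts", 7, 0, 640),
    Row("Dr Who and the Vanishing Plaques", 5, 4, 138),
    Row("Between a Northern Rock", 3, 0, 16),
    Row("Bloglines newspaper blog RSS subscriptions", 8, 14, 127),
    Row("Merely trick photography", 3, 5, 147),
]


def ordered_by(rows: List[Row], field: str) -> List[str]:
    """Return article titles sorted from highest to lowest on one column."""
    ranked = sorted(rows, key=lambda row: getattr(row, field), reverse=True)
    return [row.article for row in ranked]


# Compare the ordering by page views with the ordering by external links.
print(ordered_by(ROWS, "page_views"))
print(ordered_by(ROWS, "external_links"))
```

Ordering by page views puts the BBC podcasts article first, even though it has no external links at all, while ordering by external links puts the Bloglines article first. With so few rows this proves nothing, but it fits the impression that none of these numbers lines up neatly with the others.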
To block or not to block
Now, I don't think that this selection of sitelinks is particularly representative of the best content from currybetdotnet, or the most useful links for me as a consultant - I'd much rather the 'About' page and my 'CV' were listed. Google's Webmaster tools gives you the option of 'blocking' a page from appearing in the sitelinks list, and I've often wondered whether it would be worth my while blocking these links whilst waiting for Google to choose something better.