More thoughts on Google's sitelinks algorithm

by Martin Belam, 22 June 2008

I was writing yesterday about Google's choice of sitelinks for a domain name, and I was speculating, based on the evidence of the links they list for currybetdotnet, that there may be some hand-editing involved. What got me started on this train of thought was an article by Ann Smarty on Search Engine Journal. She suggested six factors that make up the 'sitelinks algorithm'.

  1. Surfers oriented
  2. Domain-authority oriented
  3. Internal-architecture oriented
  4. On-page SEO oriented
  5. Brand-strength oriented
  6. Competition oriented

One of the things that intrigued me about her theories was how well they mapped to my own experience with currybetdotnet. Google sometimes displays a selection of sitelinks under the listing for this blog, and I'm interested in why they appear and how the links are chosen, because they are certainly not the ones that I would choose if given the option.

The seven articles Google uses as sitelinks for currybetdotnet are:

Of the theories that Ann proposes, I think a couple apply to currybetdotnet, but my findings on another couple are quite perplexing. Today I wanted to go through them all in turn.

Domain-authority oriented

This is one of Ann's suggested factors that I think currybetdotnet scores well on, and one that contributes to the fact that the site has sitelinks at all. Obviously there are some domains on the web that you instinctively know are authorities, like Amazon or IMDB, but Google needs to measure the whole Internet for signs of whether a site is a good addition to their index or not. My site probably gives out the following 'trust' and 'quality' signals:

  • Site has been consistently updated with new content for 6 years
  • Site continually attracts new organic backlinks
  • Site has gathered backlinks from other reputable domains (like bbc.co.uk, searchengineland.com, telegraph.co.uk, guardian.co.uk etc)

Competition oriented

Ann Smarty suggests that sitelinks appear when a site is a good match for a search query and there is little competition for that query. Testing currybetdotnet seems to back this part of the theory up.

Sitelinks appear for this site if you search for very specific queries related to the domain, like "currybet" and "currybetdotnet". However, if you make a broader query which currybetdotnet ranks for, but where there is obvious competition, like "blog bbc media search", the sitelinks disappear.

Google search results with no sitelinks

Similarly, an exact search for "Martin Belam" throws up currybetdotnet with the sitelinks intact. A broader one-word search for "martin" lists currybetdotnet on the second page of the SERPs (behind Martin Stabe, I note!) but without the sitelinks listed.

Google SERPS for 'martin'

Brand-strength oriented

Whether my blog name has become a 'personal brand' is a debate for another day. For me, the significance of this factor in Ann's list is that from Google's point of view, the only site on the web where the phrases 'currybet' or 'currybetdotnet' appear a lot is this one. And virtually every other reference to 'currybet' on the web includes a link back to this domain. Algorithmically speaking, that might be a strong 'brand' indicator to Google.

Internal-architecture oriented

To find out how Google sees the architecture of currybetdotnet, I downloaded two files from Google's Webmaster tools - a list of internal links and a list of external links. I wanted to see if there was any correlation between these numbers and Google's choice of sitelinks.

For internal structure, the most popular pages were those that appeared in the variable slots in the right-hand navigation at the time Google last deep-crawled the site. These are things that had recently been published, recently commented on, or featured in my popular articles and categories selections. This bore little or no correlation to the 'sitelinks' Google has chosen.

I performed the same exercise with the list of external links, and I found none of the sitelinks to be in the top 100 pages with the most external links pointing at currybet. I can't see that, for this site at least, Google's sitelinks choice is based on linking patterns.
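For anyone wanting to repeat this check on their own site, here is a minimal sketch of the comparison in Python. It assumes the link data has been saved as a simple two-column CSV with `page` and `links` headings; that layout, the function names and the sample figures are all made up for illustration, and it is not the actual format of the Webmaster tools download.

```python
import csv
import io


def top_n_pages(csv_text, n):
    """Return the n page paths with the highest link counts, best first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = {row["page"]: int(row["links"]) for row in rows}
    ranked = sorted(counts, key=lambda page: counts[page], reverse=True)
    return ranked[:n]


def sitelinks_in_top_n(csv_text, sitelinks, n):
    """Return the sitelink pages that also appear among the n most-linked pages."""
    top = set(top_n_pages(csv_text, n))
    return [page for page in sitelinks if page in top]


# Hypothetical sample data: the paths and counts are invented,
# not taken from currybetdotnet's real figures.
SAMPLE_EXPORT = """page,links
/about/,120
/cv/,95
/articles/example-a/,40
/articles/example-b/,3
/articles/example-c/,1
"""

SAMPLE_SITELINKS = ["/articles/example-b/", "/articles/example-c/"]

if __name__ == "__main__":
    # An empty list means none of the sitelinks are among the most-linked pages.
    print(sitelinks_in_top_n(SAMPLE_EXPORT, SAMPLE_SITELINKS, 3))
```

An empty result from `sitelinks_in_top_n` corresponds to what I found here: the pages chosen as sitelinks do not overlap with the most-linked pages. The same function works for either the internal or the external list, provided each is in the assumed layout.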

On-page SEO oriented

This theory doesn't really apply to currybetdotnet. My pages are all coded in a search engine friendly way, and because I am using Movable Type as my CMS, all the anchor text is consistent. However, none of the pages on the site are optimised for a specific keyword - those factors are simply based on the title of the article.

Surfers oriented

On this count I don't think the sitelinks Google has chosen for me serve my audience very well at all. Looking at my traffic figures, despite these prominent links from Google I don't see these articles performing especially well in terms of page impressions. The best performing page of the seven isn't in the top fifty pages viewed so far this year.

Sitelinks factors in numbers

This table sums up the numbers around some of the factors I've looked at for these pages.

Article | Publishing date | Comments | Internal links | External links | Page views (year-to-date)
Daily Sport brand hi-jacked | 4/11/2007 | 03042
Top 50 BBC Podcasts | 22/11/2007 | 170640
Reckless Records | 13/8/2007 | 0611389
Dr Who and the Vanishing Plaques | 9/8/2007 | 054138
Between a Northern Rock | 17/9/2007 | 03016
Bloglines newspaper blog RSS subscriptions | 23/5/2007 | 0814127
Merely trick photograpy | 7/9/2007 | 135147

To block or not to block

Now, I don't think that this selection of sitelinks is particularly representative of the best content from currybetdotnet, or the most useful links for me as a consultant - I'd much rather the 'About' page and my 'CV' were listed. Google's Webmaster tools gives you the option of 'blocking' a page from appearing in the sitelinks list, and I've often wondered whether it would be worth my while blocking these links whilst waiting for Google to choose something better.

Google Webmaster tool to block sitelinks

4 Comments

I see you've overtaken me for "Martin" SERP position, probably thanks to this post!

In the long run, I think we'd better both watch our backs for Messrs Moore and Rosenbaum.

Very interesting article ... In fact my site is also showing sitelinks and I can confirm most - but not all - of the points. I'm speaking here about the keyword "Alopezie" and the site alopezie.de, which is a German site about hair loss.

Basically Google seems to have picked quite an interesting choice of subpages, which does not give a clear picture at first.

The first column has 4 links to 4 (out of 6) forums. It makes sense that way, as the minor ones are left out. The selection seems NOT to have been made by hand, as a quite strange abbreviation "Allg." is used (it stands for "general"), which is pretty misleading. So I guess they used visitor numbers to select this forum.

The second column of sitelinks is a crude mixture from other places: 3 links are related to content, and one goes to "help on the forum" (for what??). The 3 links to content fit pretty well. How have they been selected?? - I don't know ...

1. Surfers oriented:
Yes, the selection makes sense (with the limitation of the title and the strange link to help on Forums)

2. Domain-authority oriented
Yes, the domain is exactly the keyword

3. Internal-architecture oriented
Only partly. The forums are well visible on the site, the other 4 links are very hard to find

4. On-page SEO oriented
Clearly NO, 3 links are horribly bad (which supports my opinion that Google does not need SEO-friendly links ...)

5. Brand-strength oriented
Yes, this clearly fits, although I doubt that it played a role here.

6. Competition oriented
Yes, little competition here. In fact, places 1 and 2 are currently taken by the site.

I'm beginning to suspect that you don't ACTUALLY believe that they're hand picked, but that you're just saying this to get traffic!

Anyway, you can block individual sitelinks using the Google Webmaster Tools (I just discovered).

Hi Martin,

Interesting piece. I came across it while I was looking for information on the subject of sitelinks. Maybe you can answer my question: I was wondering how long it takes to get sitelinks back when you've lost them. Assuming that straight away after you've lost them you would "qualify" to have them again (for whatever reason), would you get them back straight away, or is there a certain waiting period? I ask this because my (also German) website 'toptarif autoversicherung' just lost the sitelinks for the word Autoversicherung, and I think I would qualify to have them again (I got some additional backlinks and made some on-site changes). I'm losing a lot of traffic because of this and, as you can imagine, I would like to get that back :)

Kind Regards,

Frits
