August 2003 [ims]

Link Text Works — Even When You Wish it Didn’t

by Leslie Rohde

Google reads our pages about the same way I read French: I can read the letters, but I have no idea what the words mean. Likewise Google will faithfully index all the visible content of our pages, even words that don’t really mean anything at all, like “click here”, “home”, “untitled”, and “contact us”. To Google, words are words, so your page might get found even for words you don’t care about. Today we’ll look at one of the primary ways that happens: link text.

A Google search for click here (no quotes) returns 10.4 million pages. By way of comparison, that is twice the number of pages Google ranks for viagra, so click here is a very competitive search. To get positioned for this search you would need to do a great job of optimization, and would still need a Page Rank (PR) of at least 8. Indeed, the top ranked pages read like a list from Fortune magazine: Adobe, Apple, Macromedia, Yahoo, Google, RealNetworks, Excite, Altavista, HP, Mapquest, Netscape, CNN, and MSN, and that is in just the first 30 results. None of the top 20 pages has a PR less than 9.

Supposing for a moment that our command of English were about as good as Google’s, how would we figure out what all the fuss is about click here? What does it cost? Where do you get it? What is it? You will have a very hard time answering these questions from the search results, because the words click here do not appear even once on any of the top 20 pages. When you look at Google’s cache of these pages you will see the message “These terms only appear in links pointing to this page”. It is not until you get to positions 23 and 24 in the results that you find the text click here appearing on the page, and those are just image alt text!

So what? Nobody wants to rank for click here anyway, right? True, but Google doesn’t know that, so the ranking algorithm employed for this silly search is the very same one used for real searches. If we can figure out how to rank for click here, then the same techniques can be employed to garner top placement for the most competitive searches on the web.
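To make the “terms only appear in links pointing to this page” effect concrete, here is a minimal sketch of a toy index in which the words of a link are credited to the page the link points to. This is an illustration only, not Google’s actual algorithm: the function names, the page data, and the example URLs are all made up, and for simplicity the toy ignores the fact that link text is also visible content on the linking page.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of URLs it is credited to.

    `pages` maps a URL to a dict holding the page's own "body" text and a
    list of outgoing "links" given as (target_url, link_text) pairs.
    (Hypothetical structure, chosen only for this illustration.)
    """
    index = defaultdict(set)
    for url, page in pages.items():
        # Words in the body count for the page itself.
        for word in page["body"].lower().split():
            index[word].add(url)
        # Words in link text count for the page the link points to.
        # Simplification: they are not also credited to the linking page.
        for target, text in page["links"]:
            for word in text.lower().split():
                index[word].add(target)
    return index

def search(index, query):
    """Return the URLs credited with every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Two made-up pages: the download page never says "click" or "here".
pages = {
    "reader.example/download": {"body": "Download the free reader", "links": []},
    "blog.example/post": {
        "body": "You need a reader to view this file",
        "links": [("reader.example/download", "click here")],
    },
}

index = build_index(pages)
hits = search(index, "click here")
print(hits)  # {'reader.example/download'}
```

In this toy model the download page is found for click here even though neither word appears in its own text, which is the situation the top 20 results are in.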

Reverse Engineering the Google Ranking Algorithm

We know that there are three primary factors in the Google ranking algorithm: page title, page PR, and incoming link text. There are an unknown number of lesser influences, but these are the factors that can be shown to make the most difference. We can use search results like click here to illustrate how Google does ranking and to approximate the importance of the different ranking factors. To do that, we need to isolate one factor from the others, which our silly search does for us.

This first table shows the PR, the link popularity, and the page title for the top 10 click here pages. These results alone illustrate certain valuable rules for ranking at Google.

Position  Page Rank  Link Count  Page Title
1         10         682,000     Adobe Reader – Download
2         10         85,400      Adobe Systems Incorporated
3         10         455,000     Google
4         10         32,500      Apple – QuickTime
5         10         66,000      Apple – QuickTime – Download
6         10         38,200      Macromedia
7         9          12,500      Macromedia – Downloads
8         9          1,100,000   Yahoo!
9         10         2,930       Microsoft Corporation
10        10         22,700      RealNetworks

The first thing to notice is that link popularity (the count of links pointing to a page) does not get you ranking, and does not necessarily even get you a bunch of PR either. Witness Yahoo! with fully one million links, yet beaten out in the PR game by the 2,930 links to Microsoft. Not all links are created equal.

The next most obvious feature in these results is the lack of the search term anywhere in either the title tag or the body text. This is the essence of “Google bombing”, where a page gets known for something that it really isn’t about. Are any of these pages “about” the subject click here? The pages themselves say no, but as we’ll see, Google is more likely to believe other pages instead.

Finally, notice that PR alone will not get you top placement, as even in the top 10 there are a couple of PR9 pages that beat PR10 pages. Since none of these pages contains the search term in either the title tag or the body text, our best guess is that link text is the factor causing these pages to rank as they do.
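The observations drawn from the first table can be checked mechanically. The short sketch below copies the ten rows from that table and tests two things: whether sorting by link count reproduces the ranking, and which higher-placed pages have a lower PR than pages below them. The variable and field names are my own, chosen for illustration.

```python
from collections import namedtuple

Row = namedtuple("Row", "position pagerank links title")

# Rows copied from the first table (top 10 results for click here).
top10 = [
    Row(1, 10, 682_000, "Adobe Reader – Download"),
    Row(2, 10, 85_400, "Adobe Systems Incorporated"),
    Row(3, 10, 455_000, "Google"),
    Row(4, 10, 32_500, "Apple – QuickTime"),
    Row(5, 10, 66_000, "Apple – QuickTime – Download"),
    Row(6, 10, 38_200, "Macromedia"),
    Row(7, 9, 12_500, "Macromedia – Downloads"),
    Row(8, 9, 1_100_000, "Yahoo!"),
    Row(9, 10, 2_930, "Microsoft Corporation"),
    Row(10, 10, 22_700, "RealNetworks"),
]

# 1. If link count decided ranking, this would come out as 1, 2, 3, ..., 10.
order_by_links = [
    r.position for r in sorted(top10, key=lambda r: r.links, reverse=True)
]

# 2. Pages with more links than another page, yet a lower PR.
more_links_lower_pr = [
    (a.title, b.title)
    for a in top10
    for b in top10
    if a.links > b.links and a.pagerank < b.pagerank
]

# 3. Pairs where the higher-placed page has the lower PR.
pr_inversions = [
    (a.position, b.position)
    for a in top10
    for b in top10
    if a.position < b.position and a.pagerank < b.pagerank
]

print(order_by_links)  # [8, 1, 3, 2, 5, 6, 4, 10, 7, 9]
print(pr_inversions)   # [(7, 9), (7, 10), (8, 9), (8, 10)]
```

Neither link count nor PR reproduces the order of the results, which is why the analysis turns to link text next.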

Consider the following table. Here we’ve delved deeper into the search results and added some Link Reputation data computed by OptiLink. In addition to the PR of several additional pages from the results, the table also shows how many times our search term appears in links pointing to the page. For example, of the 900 links OptiLink found that point to the Adobe download page, 93 included some or all of the search term. Of these 93 links, 54 had the complete search text click here, 5 had just the word click, and 34 had just the word here.

Position Page PageRank “click here” “click” “here” Links Analyzed
1 adobe.com/… 10 54 5 34 900
6 macromedia.com 10 13 32 1497
9 microsoft.com 9 6 0 0 1150
22 aol.com 8 3 3 1350
23 firstgov.gov 10 2 14 0 1160
70 nasa.gov 10 1 1 0 1400

With this table, we gain some valuable insight into how Link Reputation impacts the ranking of pages at Google. The positioning of these pages can be explained with nothing more than the link text numbers alone. Notice the page at position 22: with a PR of 8, its better link text beats (many) pages with a PR of 10. Notice too the gap, from position 9 to position 70, that link text creates despite the lower PR that microsoft.com has compared to nasa.gov.
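The kind of tally shown in the table could be produced along the following lines. This is only a rough sketch of the counting idea, not a description of how OptiLink works internally: the function names, the matching rules, and the sample link texts are assumptions made for illustration. The last lines rework the Adobe row from the table.

```python
import re
from collections import Counter

def classify(link_text, term):
    """Label one link text against a search term.

    Returns the full term if its words appear together and in order,
    the matched word(s) joined by "+" if only some appear (or all
    appear but not adjacent), and None if no word of the term appears.
    (Assumed matching rules, for illustration only.)
    """
    words = re.findall(r"[a-z0-9']+", link_text.lower())
    query = term.lower().split()
    n = len(query)
    # Full phrase: the query words occur as one contiguous run.
    if any(words[i:i + n] == query for i in range(len(words) - n + 1)):
        return " ".join(query)
    present = [w for w in query if w in words]
    return "+".join(present) if present else None

def tally(link_texts, term):
    """Count links per label, skipping links with no match at all."""
    labels = (classify(text, term) for text in link_texts)
    return Counter(label for label in labels if label is not None)

# Made-up link texts pointing at some page.
sample_links = [
    "Click here",
    "click here to download",
    "here",
    "Download Adobe Reader",
    "Click to install",
    "Get it here",
]
counts = tally(sample_links, "click here")
print(counts["click here"], counts["click"], counts["here"])  # 2 1 2

# The Adobe row from the table: 54 full matches, 5 "click", 34 "here".
matching = 54 + 5 + 34          # 93 links carry some or all of the term
share = matching / 900          # out of 900 links analyzed
print(matching, f"{share:.1%}")  # 93 10.3%
```

Seen this way, roughly one in ten of the analyzed links to the Adobe page carries some part of the search term, against a handful for the pages further down the results.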

But What About “Real” Searches?

The somewhat “contrived” search example above was used because the results pages were all very close together in Page Rank and lacked any on-page influence. This allowed us to isolate and demonstrate the effect of Link Reputation. In general, the kind of ranking analysis we did with click here will not be so easily done with more normal searches, because the search results will not show this clean isolation of factors. But that does not mean the rules we learned here aren’t valid. The search results at Google are all prepared using the same algorithm, so the rules we learn from one silly search can be applied to all of our real searches.

In previous articles, I addressed real ranking problems for terms like “hats” and “cell phones” and showed that Link Reputation was a predominant factor in positioning at Google. If Link Reputation didn’t work, I would never have released OptiLink. OptiLink was built specifically for the task of Link Reputation analysis for the sole and simple reason that it worked to explain ranking where all my previous experiments failed. After eighteen months and many dozens of analyses, link text is still easily shown to be a dominant force in search engine ranking. I have every reason to expect this to continue.

Control your linking, and you control a major component of your ranking. Don’t, and you might just get ranked for something really silly.

** Leslie Rohde is the developer of OptiLink and the author of the Dynamic Linking ebook, packaged with Michael Campbell’s Revenge of the Mini-net. A programmer since 1974 and a webmaster since 1999, Leslie is currently focused on providing webmasters with leading edge technology to advance their online businesses. Like most successful entrepreneurs, Leslie works only half time (12 hours a day) and vacations regularly at exotic destinations (anyplace but the office).

** Google and PageRank are (I’m told) trademarks of Google, Inc. Many of the domain names used in the examples above are also trademarks and service marks of companies not affiliated with the author or publisher.
