Google PageRank - Democracy or Corporate Muscle?

Google PageRank™ is what got Google where it is today.

Google has become the world’s favorite search engine, and on average it probably brings Websites over 50% of their new visitors (when you take into account visitors from Yahoo Web page searches that are also provided by Google). For many Websites, mine included, Google brings nearer to 90% of all new traffic.

Recently, Google PageRank has attracted some controversy. Now that the dust has settled a little, this article attempts to take a more rational look at PageRank and its strengths and weaknesses, and to consider where Google could go from here.

What Is PageRank?

Google make big claims for PageRank. They explain the concept of PageRank as follows:

PageRank relies on the uniquely democratic nature of the Web by using its vast link structure as an indicator of an individual page’s value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."

Important, high-quality sites receive a higher PageRank, which Google remembers each time it conducts a search...

The original algorithm for calculating PageRank was published by the founders of Google, Sergey Brin and Lawrence Page, in the paper "The Anatomy of a Large-Scale Hypertextual Web Search Engine". Although Google may well have refined the algorithm since then, we know from this paper that the PageRank of a Web page is a number calculated using a recursive algorithm in which the page receives a share of the PageRank of each page that links to it. The share that page A receives from page B depends on the number of outgoing links on Page B (as the number of links increases, the value of each link decreases).

In other words, PageRank is a mathematical calculation that takes into account only the number of pages and the number of links on those pages in the whole Web of hyperlinks that lead to the page in question. Content is not taken into account when PageRank is calculated. Content is taken into account when you actually perform a search for specific search terms.
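To make the calculation concrete, here is a minimal Python sketch of the idea described above: each page repeatedly passes a share of its PageRank to the pages it links to, split evenly across its outgoing links. The damping factor of 0.85 is the value suggested in the Brin and Page paper; everything else (the function name, the normalization so that ranks sum to 1, and the way pages with no outgoing links are handled) is an assumption made for illustration, not Google's actual implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to.

    Illustrative sketch only: ranks are normalized to sum to 1, and a page
    with no outgoing links spreads its rank evenly over all pages (an
    assumption; the paper does not prescribe this).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # More outgoing links means a smaller share per link.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical three-page Web: A links to B and C, B links to C, C links to A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

Notice that nothing in the function ever looks at what a page says. Page C comes out on top simply because it collects links from both A and B.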

Who Benefits?

So how do Google make the leap from this relatively simple concept, to claiming that "Important, high-quality sites receive a higher PageRank"? Well, as they say, they interpret a link from page A to page B as an indication of the importance and quality of page B. But of course, there are many other reasons why page A might link to page B:

  • The owner of page A wants to promote page B because it is part of his own Website
  • The owner of page A wants to promote page B because it is another Website that he owns
  • The owner of page B pays for an ad on page A
  • The owners exchange reciprocal links specifically to boost PageRank
  • The owner of page A is an affiliate of page B and receives commission on sales
  • Page A is a news story (good or bad) about page B’s Website

In most of these cases, the importance or quality of page B has little to do with its link being placed on page A. Worse still, in many cases it is simply commercial interest that drives the number of links to page B.

The result is that PageRank favors business, and particularly big business. A business that sells a product or service on its Website will naturally receive PageRank because of affiliate links, advertising and resources devoted to Web promotion. A Website that offers information or free services will find it much more difficult to attract incoming links, and therefore, to achieve a good PageRank. It does seem that corporate muscle is useful when it comes to winning PageRank.

But That’s Not All…

When you actually perform a search on Google, PageRank is only one of the factors that are taken into account in deciding which results are presented, and in what order. Google’s own explanation continues as follows:

Of course, important pages mean nothing to you if they don’t match your query. So, Google combines PageRank with sophisticated text-matching techniques to find pages that are both important and relevant to your search. Google goes far beyond the number of times a term appears on a page and examines all aspects of the page’s content (and the content of the pages linking to it) to determine if it’s a good match for your query.

What this means is that it’s a combination of content and PageRank that determines the sequence or the ranking of the search results that Google returns. The ranking of search results is very important, as most users won’t bother to look beyond the first 20 results or so. It’s important to the user, because if the search engine doesn’t return the most relevant results in the first 20, the user gives up on the search and loses faith in that search engine. It’s obviously important to a Website to be listed in the first 20 results for relevant search terms, otherwise that Website will receive very little traffic from the search engines.
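Google does not publish how it blends the two factors, so the following Python sketch is purely hypothetical. It assumes a toy relevance measure (the fraction of query words found on the page), a PageRank value already scaled to between 0 and 1, and a simple weighted sum controlled by an invented `pagerank_weight` parameter. It is only meant to show how a page with more PageRank can outrank an equally relevant page.

```python
def text_match(query, text):
    """Toy relevance: the fraction of query words that appear in the page text."""
    words = query.lower().split()
    page_words = set(text.lower().split())
    return sum(word in page_words for word in words) / len(words)


def rank_results(query, pages, pagerank_weight=0.5):
    """Order pages by a weighted sum of relevance and PageRank.

    `pages` maps a URL to a (text, pagerank) pair, with pagerank in [0, 1].
    The weighted sum is an assumption for illustration, not Google's formula.
    """
    scored = []
    for url, (text, pagerank) in pages.items():
        relevance = text_match(query, text)
        if relevance == 0:
            continue  # pages that do not match the query are never returned
        score = (1 - pagerank_weight) * relevance + pagerank_weight * pagerank
        scored.append((score, url))
    scored.sort(reverse=True)
    return [url for _, url in scored]


# Hypothetical pages: both stores match the query fully, one has far more PageRank.
pages = {
    "big.example": ("apparel store books music", 0.9),
    "small.example": ("apparel store for outdoor apparel", 0.2),
    "other.example": ("garden tools", 0.5),
}
print(rank_results("apparel store", pages))
```

With equal relevance, the ordering is decided entirely by PageRank, which is the effect the rest of this article is concerned with.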

For most searches, Google’s ranking algorithm works very well for the user, and indeed I personally use Google for almost all my searches. Google usually returns relevant results, and often returns what I would consider to be the most important Website first. It is the PageRank factor that ensures that when you search for "Amazon", it’s Amazon.com’s home page that is returned first (although I’m not sure why anyone would need a search engine to find Amazon!).

Unexpected Effects

However, PageRank is such an important factor in the ranking of search results that it can have some very significant effects.

Restricted Competition

One effect is illustrated by a search on Google for the words apparel store. Within two months (and maybe sooner) of Amazon opening its apparel store, this search returned Amazon’s apparel store first in the results. The reason that Amazon’s page is ranked first is of course that as well as being relevant, it has gained massive PageRank from being on Amazon’s Website. The Amazon apparel store may well be an important, high-quality site, but that is not the reason it has acquired its PageRank. It has gained its PageRank by being part of the huge Amazon.com site for books, CDs, etc., and all those affiliate links to the books section in particular.

Does this matter to the user? In the short term, probably not. The user has received a relevant set of search results, and may even be pleased to have found Amazon’s apparel store. In the longer term, however, it may matter more. Small companies and small Websites find it hard to gain PageRank, and therefore, top rankings in the search results, no matter how relevant their sites may be. This represents a barrier to new entrants in the market, which in the longer term restricts competition and damages consumer choice. With Google’s increasingly dominant position, that side-effect is something to be concerned about.

Decreased Relevancy

In some cases, the effect of PageRank does actually damage the relevancy of results. If you search for free Web page in Google’s standard search, the top ranking result is for digits.com, offering free page counters, and only 5 out of the first 10 results offer free Web pages. The others offer free search engine submission, free translation, and free font downloads (from Microsoft.com). In the case of digits.com, it is the back links that are required on all sites using the digits.com page counter that have given their site a huge PageRank, bringing it to the top of the results.

The search results for this search can be improved if you use an exact phrase search on Google’s advanced search page or put the search phrase in quotes, but I suspect only a tiny proportion of users ever use these options. In an exact phrase search for "free Web page", digits.com drops to number 4 in the results. However, this number 4 ranking is still a little surprising, given that the phrase "free Web page" does not appear on the page at all, and appears only in links pointing to that page. This illustrates the importance of link text in search engine optimization for Google.
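How can a page match a phrase that never appears on it? A minimal Python sketch, assuming a simple index in which the words of a link's text are credited to the page the link points to, shows the effect. The data structure, function names and example domains here are all hypothetical; Google's real index is certainly more elaborate.

```python
from collections import defaultdict


def build_index(pages):
    """Build a word -> set-of-URLs index.

    `pages` maps a URL to {"text": str, "links": [(target_url, link_text), ...]}.
    Words in link text are credited to the link's target page, not to the
    page that contains the link (the assumption this sketch illustrates).
    """
    index = defaultdict(set)
    for url, page in pages.items():
        for word in page["text"].lower().split():
            index[word].add(url)
        for target, link_text in page["links"]:
            for word in link_text.lower().split():
                index[word].add(target)
    return index


def search(index, query):
    """Return the URLs credited with every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical data: the counter site never uses the word "free" itself,
# but a site using its counter links back with that word in the link text.
pages = {
    "counter.example": {"text": "page counters and statistics", "links": []},
    "user.example": {
        "text": "my holiday photos",
        "links": [("counter.example", "free web page counter")],
    },
}
index = build_index(pages)
print(search(index, "free web page"))
```

The counter site is returned for "free web page" purely on the strength of the words other sites use when linking to it.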

Suppose you are looking for office space in New York. You might search for New York Office. In this case the top ranking page, whether you use an exact phrase search or not, is the page "New York Governor George E. Pataki", which again does not contain the exact search phrase on the page, but only in links pointing to the page. The page does however have a Google Toolbar PageRank of 9 to account for its position. In fact if you use an exact phrase search for "New York Office", I don’t think any of the top 20 pages contain the exact search phrase other than in links pointing to them!

Of course if you try hard enough, you can get all sorts of odd search results! How about a search on Google for Biggest Garden On Earth! Guess what Google returns first? Yes — the Amazon.com home page! Why? Because the page title is "Amazon.com–Earth’s Biggest Selection" containing two of the keywords, another keyword "Garden" is on the page, and of course it has massive PageRank.

What this shows is that if you have a Website with a Google Toolbar PageRank of 9 or 10, like Amazon, Microsoft, Adobe, etc., then you are virtually guaranteed a Google top ranking for the keywords of your choice on a new Web page, if you put those keywords in the title of your page and link to it from the rest of your site. The content of the page would not matter at all. It does make you wonder why Amazon bothers to use Google AdWords so much!

These examples are of course the exceptions. As I’ve said, Google is the search engine many people prefer to use, and for the vast majority of searches, Google returns relevant search results that satisfy the user. And this is exactly what Google needs to do to continue to win market share.

Google’s To-Do List

Fortunately for most Webmasters there are plenty of search terms where you are not competing with the likes of Amazon and Microsoft. With some careful attention to page titles, page content and link text, it is possible to achieve reasonable rankings within the search results. For instance, one of my Websites has the top ranking for its two most relevant traffic-generating search phrases, and another has a number 3 ranking for my preferred search phrase. Yes, you do need to make sure you obtain links from pages with reasonable PageRank, but it is not usually necessary to go to the extremes of search engine optimization. In fact you need to be careful not to go beyond the bounds of what Google consider to be ethical search engine optimization techniques, otherwise you will receive the dreaded PageRank Zero penalty!

However, with Google’s increasingly dominant position, the search giant will come in for more and more criticism if its search results are seen to work in favor of big business and against free market competition.

Google are of course working hard all the time to improve their algorithms and it will be interesting to see whether these sorts of concerns are taken on board and addressed.

In the short term, Google may need to consider these points:

  • Increase the weighting of proximity of keywords, which would increase the rankings of exact phrase matches, even if an exact phrase search was not specified.
  • Increase the weighting of keywords in visible text on the page in order to reduce the number of times pages are included in the results only with keywords in links pointing to the page.
  • Consider capping the weighting of PageRank at some value so that pages with a very high PageRank are less overwhelming. Alternatively, vary the scale so that as you move up the PageRank ladder, the weighting increases less than proportionally. The scale is probably already logarithmic, but it doesn’t seem to have the desired effect.
  • Default searches to look for both the singular and plural forms of search words. This is a controversial suggestion, as some searches work better this way, while for others it’s a detrimental step. However, I believe more searches will be more successful if this approach is taken. It could perhaps be introduced as a selectable option in the advanced search.
  • Reduce the weighting of keywords in the page title. This is one part of the Web page that users hardly look at, and is therefore easy to manipulate. This suggestion will therefore be unpopular with Webmasters!
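The capping and scaling suggestion in the list above can be sketched in a few lines of Python. This assumes PageRank is expressed on the 0 to 10 Toolbar scale; the cap of 6 and the choice of a logarithm as the flattening function are arbitrary values picked for illustration, not anything Google has described.

```python
import math


def capped_weight(pagerank, cap=6.0):
    """Weighting that stops growing once PageRank reaches `cap` (hypothetical value)."""
    return min(pagerank, cap)


def flattened_weight(pagerank):
    """Weighting that keeps growing, but less than proportionally."""
    return math.log1p(pagerank)  # log(1 + pagerank), so a PageRank of 0 gives 0


# Compare the advantage of a PageRank 10 page over a PageRank 5 page.
for name, weight in (
    ("linear", float),
    ("capped", capped_weight),
    ("flattened", flattened_weight),
):
    advantage = weight(10) / weight(5)
    print(name, round(advantage, 2))
```

Under a plain linear weighting the PageRank 10 page has twice the pull of the PageRank 5 page. Either alternative shrinks that advantage, leaving more room for the content factors to decide the order.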

Now, for the big one!

What is really needed is content- or topic-sensitive PageRank. In other words, PageRank should be calculated for each search term used, so that PageRank is only accumulated from links from relevant pages all the way back through the whole Web of links. The problem is that the content factors of the search ranking algorithm are only evaluated at the time of the search, and to calculate PageRank at search time would be impossibly slow, especially as it is a recursive algorithm.

However, there have been research papers published on proposals for calculating content-sensitive or topic-sensitive PageRank at crawl time. One such paper is "Topic-Sensitive PageRank" by Taher H. Haveliwala (be prepared for some mathematics if you read that paper!). Haveliwala proposes that for each Web page, a separate PageRank is calculated for each relevant topic represented by the categories of the Open Directory Project. Because the number of topics is limited to the Open Directory categories, and because most Web pages will not have content relevant to many of those topics, the amount of computing power required is not prohibitive.
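As a rough illustration of the topic-sensitive idea, the Python sketch below biases the "random jump" part of the calculation so that it only ever lands on pages known to belong to one topic. PageRank then accumulates only along chains of links that start from those pages. This is a simplified reading of the approach, not Haveliwala's actual procedure: the function name, the example pages and the handling of pages with no outgoing links are all assumptions.

```python
def topic_pagerank(links, topic_pages, damping=0.85, iterations=50):
    """Toy topic-biased PageRank.

    `links` maps each page to the pages it links to. `topic_pages` is the
    set of pages known to be about the topic (for example, pages listed in
    one directory category). Random jumps land only on those pages.
    """
    topic_pages = set(topic_pages)
    if not topic_pages:
        raise ValueError("topic_pages must not be empty")
    pages = set(links) | {t for ts in links.values() for t in ts} | topic_pages
    jump = {p: (1.0 / len(topic_pages) if p in topic_pages else 0.0) for p in pages}
    rank = dict(jump)

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * jump[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links hands its rank back to the topic pages.
                for p in topic_pages:
                    new_rank[p] += damping * rank[page] * jump[p]
        rank = new_rank
    return rank


# Hypothetical Web: a bookshop with plenty of its own links also links to a rose guide.
links = {
    "garden_hub": ["rose_guide"],
    "rose_guide": ["garden_hub"],
    "bookshop": ["bestsellers", "rose_guide"],
    "bestsellers": ["bookshop"],
}
gardening = topic_pagerank(links, {"garden_hub"})
print({page: round(value, 3) for page, value in gardening.items()})
```

In this example the bookshop ends up with no gardening PageRank at all, however many links it has elsewhere, because no chain of links from a gardening page leads to it. One such set of ranks would be calculated per topic at crawl time and the appropriate set chosen at search time.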

Another paper is "The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank" by Matthew Richardson and Pedro Domingos, who propose pre-calculating separate PageRanks for all search terms. Their experiments suggest that even for millions of search words, the computing power and storage are (only!) between 100 and 200 times that needed for calculation of a single PageRank.

The problem for Google is that the last thing they want to do is to increase the time it currently takes for the crawl and the update. At the moment their efforts are spent finding ways to update the index more frequently so that their search results reflect what’s in the Web today, and not what was there last month.

Google is going to find it hard to balance all the demands and pressures it faces, but I’m pretty sure they are better equipped to succeed than most. Time will tell…

Category: marketing Time: 2003-01-16 Views: 1