How Google Models How We Value Content

Alex Barnett, whose blog I am enjoying more and more lately, asks a seemingly trivial but very important question about a recent post.

In that post, A Social Revolution by Modeling Human Behavior, I said that Google models the way that we (humans) value content. Alex questions this by asking whether I meant instead that Google “models the way we look for things” and, if not, asking me to explain what I mean.

Here is what I mean.

Google is a great search engine because it does what we want it to do: it returns relevant results for most queries. It does this by applying a complex algorithm to the web pages it indexes, taking into consideration several pieces of information, including (but certainly not limited to) the content of a page (its title, keywords, and linking structure), the domain that a page is on, how often the page is updated, and who is linking to it.

The last item, who is linking to it, is the soul of Google’s algorithm, known as PageRank. What the founders of Google (Sergey Brin and Larry Page) realized was that the links between pages could tell us a lot about how we value content on the web, not just about what content is related to other content.

Put plainly, they realized that we tend to link to pages that we value. This observation, so simple, so obvious, is the key to the search kingdom.

Now, to believe this observation we need to believe a couple of other small, but important, things. We need to believe that people are generally good and generally efficient. By good I mean that most of what we link to we link to truthfully, without trying to deceive people. By efficient I mean that most of what we link to we link to for a purpose, without trying to waste anybody’s time. If we don’t believe these things, then we won’t believe the algorithm actually works. But since the results are usually very good, we can believe these things easily. People are good and efficient (for the most part).

In a more interesting light, we can see this observation in terms of what someone says vs. what someone does. If we look at the predominant way that search engines worked before Google, we see that they mostly relied upon individual documents to describe themselves (what they say). Like Google, they would take into consideration the keywords on the page, but unfortunately there was no way to rank one page over another except by counting the number of keywords and simply putting faith in what the document says about itself.

This is why <meta> tag keywords failed so miserably. Deceitful document creators could game the system by simply adding more keywords to their meta tags. The noise quickly drowned out the signal.
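To make the problem concrete, here is a minimal sketch of ranking by self-declared keywords. It is an illustration only, not how any particular pre-Google engine actually worked; the page names, keyword lists, and the functions `keyword_score` and `rank_by_keywords` are all hypothetical.

```python
def keyword_score(meta_keywords, query):
    """Count how often the query term appears in a page's self-declared keywords."""
    return sum(1 for kw in meta_keywords if kw == query)


def rank_by_keywords(pages, query):
    """Order page names by keyword count, highest first."""
    return sorted(
        pages,
        key=lambda name: keyword_score(pages[name], query),
        reverse=True,
    )


# Hypothetical pages: one describes itself honestly, one stuffs its keywords.
pages = {
    "honest-review": ["camera", "review", "photography"],
    "spam-page": ["camera"] * 50,
}

ranking = rank_by_keywords(pages, "camera")
print(ranking)  # ['spam-page', 'honest-review']
```

The stuffed page wins simply by repeating itself, because the only evidence consulted is what the page says about itself.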

PageRank, on the other hand, ignored <meta> tag keywords. It couldn’t be gamed as easily, because it took into consideration the actions of other parties: it granted rank to a particular page based on which other pages were linking to it (what they actually do). Now, deceitful document creators couldn’t game themselves to the top of the pile, because other pages had to vouch for them first.

It is this vouching, this inter-site authority-building, that PageRank takes advantage of. It models how we value things based simply on who we pay attention to with our links, and who pays attention to us with theirs.
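The vouching idea can be sketched in a few lines. What follows is a simplified version of the published PageRank formulation, computed by repeated iteration with the commonly cited damping factor of 0.85. It is not Google’s production ranking, which weighs many other signals. The link graph, page names, and the function `pagerank` are assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page name to the list of pages it links to.
    Returns a dict mapping page name to its rank (ranks sum to 1).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link graph: nobody links to "spam-page".
links = {
    "blog-a": ["honest-review"],
    "blog-b": ["honest-review", "blog-a"],
    "honest-review": ["blog-a", "blog-b"],
    "spam-page": ["honest-review"],
}

ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this sketch the spam page ends up at the bottom no matter what it says about itself, because no other page vouches for it. The only rank it keeps is the baseline share every page receives.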

Published: October 20th, 2005
