Google's best-known product is its search engine, which brings the most relevant pages to the top of the results. It works on the principle of 'page ranking'. Let us look at that idea.
Let us assume the internet has only 6 web pages. The author of page 1 thinks that pages 2, 4, 5 and 6 have good content and links to them. The author of page 2 likes only pages 3 and 4, so he links his page only to them. All the links in this 6-page web are shown in the link diagram. The task is to find the most valuable page for a particular search query. For example, if every other page linked to one page, that page would clearly be the most valued. But we cannot reach such an easy conclusion from the diagram, so we need a more systematic strategy.
The page rank: let us construct a "link matrix".
Page 1 has four outgoing links, to pages 2, 4, 5 and 6. Hence, in the first column (page 1), we place the link value 1/4 in the second, fourth, fifth and sixth rows. Page 2 has two outgoing links, so in the second column we place 1/2 in the third row and the fourth row. The process continues for the remaining pages, and every other entry is 0. Reading the matrix vertically (down a column) gives a page's outgoing links. Reading it horizontally (along a row) gives its incoming links. For example, page 5 has 2 outgoing links but 4 incoming links.
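The column-filling rule described above can be sketched in Python. This is only a small illustration: the helper name link_column is made up for this example, it assumes every page has at least one outgoing link, and it uses only the links of pages 1 and 2, which are the ones stated in the text.

```python
from fractions import Fraction


def link_column(targets, n_pages):
    """Return the link-matrix column for one page.

    targets: 1-based numbers of the pages this page links to.
    Each linked page gets 1/(number of outgoing links); all others get 0.
    Assumes targets is not empty.
    """
    share = Fraction(1, len(targets))
    return [share if page in targets else Fraction(0)
            for page in range(1, n_pages + 1)]


# Links stated in the text: page 1 -> 2, 4, 5, 6 and page 2 -> 3, 4.
column_page_1 = link_column([2, 4, 5, 6], 6)
column_page_2 = link_column([3, 4], 6)

print([str(x) for x in column_page_1])  # ['0', '1/4', '0', '1/4', '1/4', '1/4']
print([str(x) for x in column_page_2])  # ['0', '0', '1/2', '1/2', '0', '0']
```

Each column adds up to 1, because a page shares its "vote" equally among the pages it links to.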
The ranking rests on one formula:
Link matrix × eigenvector = eigenvalue × eigenvector.
Here the eigenvalue is 1. (A matrix whose columns each add up to 1 always has 1 as an eigenvalue.)
In this case, the eigenvector is the one that gives popularity ranks to the pages based on their links: the more incoming links, the higher the popularity.
Using the formula, we get the eigenvector:
Next we have to normalize the vector. Its values add up to 20. Dividing each value by 20, we get:
The values of this vector give the 'page rank': the higher the value, the higher the page rank. For example, page 5 has a high page rank of 0.4. Hence the Google search engine will bring page 5 to the top for the relevant search query.
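One common way to find the eigenvector with eigenvalue 1 is to multiply a starting vector by the link matrix again and again until it settles (this is called power iteration). The sketch below assumes that method and a made-up 3-page web: page 1 links to pages 2 and 3, page 2 links to page 3, and page 3 links to pages 1 and 2. It is not the 6-page example, whose full list of links appears only in the diagram, and the function names are hypothetical.

```python
def multiply(matrix, vector):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def page_rank(matrix, iterations=100):
    """Approximate the eigenvector for eigenvalue 1 by power iteration."""
    n = len(matrix)
    vector = [1.0 / n] * n              # start with equal rank for every page
    for _ in range(iterations):
        vector = multiply(matrix, vector)
    total = sum(vector)
    return [v / total for v in vector]  # normalize so the ranks add up to 1


# Hypothetical 3-page web (an assumption for illustration only).
# Column j holds the outgoing links of page j + 1.
link_matrix = [
    [0.0, 0.0, 0.5],   # page 1 is linked from page 3
    [0.5, 0.0, 0.5],   # page 2 is linked from pages 1 and 3
    [0.5, 1.0, 0.0],   # page 3 is linked from pages 1 and 2
]

ranks = page_rank(link_matrix)
best_page = ranks.index(max(ranks)) + 1

print([round(r, 3) for r in ranks])  # [0.222, 0.333, 0.444]
print(best_page)                     # 3
```

In this toy web, page 3 comes out on top: it is the only page that receives the whole "vote" of another page (page 2), in addition to half of page 1's.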
The beauty of page rank is that it regards pages with many incoming links as more important, and it gives more weight to the outgoing links of important pages. The page rank principle is the basis of Google's search engine. But remember, the real internet has billions of pages. Yet Google brings the results in a fraction of a second, because it uses clever and complicated mathematics.
The real Google search algorithm is complex and secret. Anyway, it successfully finds the proverbial needle in the web's haystack for us.
When we use Google every day, we are using sophisticated mathematics.