jglw22

A Bit About Pagerank... Enjoy


The name PageRank, given to a relevancy score used by Google, is often assumed to refer to Web pages. In fact the name comes from its inventor, Larry Page, who developed the scoring system together with Sergey Brin as a research project at Stanford University in 1996. PageRank is now renowned for bringing unprecedented clarity to search on the Web and is closely associated with Google's monumental success.

The PageRank algorithm recursively calculates a popularity score for a given web page, based on the inbound links that direct users to that page. Each inbound link counts as a vote, weighted by the PageRank of the source page (hence the recursive nature of the computation) and divided by the number of outbound links on that source. The publicly visible scores are generally believed to be scaled logarithmically and mapped to a value between 0 and 10. How many decimal places Google stores for a PageRank result is unknown.

In the main Web search algorithm, PageRank is simply one factor in the overall score used to order the results. A more transparent use of PageRank is found in the Google Directory. There, once a category matching a given search query has been found, the web pages in that category are brought into the runtime index and PageRank is used to order them.

Here is the formal definition of the PageRank formula: PR = Whatever anyone claiming to be an SEO expert tells you it is before banning you from his forum for disagreeing. The commonly published form, which the symbol definitions below match, is:

$$PR(p_i) = \frac{1 - d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)}$$

In the formula, PR(p_i) are the PageRanks to be calculated, M(p_i) is the set of pages that link to p_i, L(p_j) is the total number of outbound links on page p_j, N is the size of the corpus and d is a damping factor, which is usually set to 0.85.
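To make the recursive definition concrete, here is a minimal Python sketch of the commonly published form of the calculation: each page gets (1 - d)/N plus d times the sum, over every page that links to it, of that page's PageRank divided by its number of outbound links. This is an illustration only, not Google's implementation. The function name `pagerank`, the `toy_web` graph and the fixed iteration count are all made up for the example, and the sketch assumes every page has at least one outbound link.

```python
def pagerank(links, d=0.85, iterations=50):
    """Approximate PageRank by repeated substitution.

    links: dict mapping each page to the list of pages it links to.
    Assumes every page appears as a key and has at least one outbound link.
    """
    pages = list(links)
    n = len(pages)

    # Build M: for each page, the pages that link to it.
    inbound = {page: [] for page in pages}
    for source, targets in links.items():
        for target in targets:
            inbound[target].append(source)

    # Start with an even distribution, then refine it.
    pr = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        pr = {
            page: (1 - d) / n
            + d * sum(pr[src] / len(links[src]) for src in inbound[page])
            for page in pages
        }
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph C ends up ranked highest because it collects votes from both A and B, while B only receives half of A's vote. The raw scores sum to 1; the 0 to 10 scaling mentioned above would be a separate step on top.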
PageRank is a particularly elegant method because good values can be approximated in only a few iterations.

If PageRank is viewed as a Markov chain, in which the next state depends only on the current state and not on earlier ones, then PageRank gives the probability of arriving at a given page after starting at a random page on the Web and clicking randomly on links for a long period of time. As well as being a useful indicator of relevance, this is difficult for a webmaster to manipulate without discrediting their own site, because Google is believed to penalize what are known as link farms: Web sites that scheme to artificially inflate PageRank for their customers.

Google further safeguarded the integrity of the algorithm in 2005 by introducing the 'nofollow' value for the 'rel' attribute of anchor tags. Since the advent of Web 2.0, in which the Web is described as a collection of interlinked applications rather than documents, user-generated content has really come into its own. The majority of this content is found in forums and web logs, in which users can post links to other sites. This made the Web incredibly prone to what is known as 'link spamming', where automated bots spider their way through the Web looking for forums and posting many links to their owners' Web pages. Webmasters can now use the 'nofollow' value to determine selectively whether links on their pages are to be counted by Google.
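As a rough illustration of the 'nofollow' idea, the sketch below uses Python's standard `html.parser` module to split the links on a page into those a crawler might count and those marked rel="nofollow". How Google actually treats such links internally is not public, so the class name `FollowedLinkParser`, the sample markup and the example URLs are all assumptions made for the example.

```python
from html.parser import HTMLParser


class FollowedLinkParser(HTMLParser):
    """Collects anchor hrefs, separating rel="nofollow" links from the rest."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel can hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


# Hypothetical forum page: one editorial link and one user-posted link.
sample_html = """
<p>Editorial link: <a href="https://example.com/article">article</a></p>
<p>User comment: <a href="https://example.org/spam" rel="nofollow">cheap stuff</a></p>
"""

parser = FollowedLinkParser()
parser.feed(sample_html)
print("counted:", parser.followed)
print("ignored:", parser.nofollow)
```

A forum owner would typically have the forum software add rel="nofollow" to every link posted by users, so that spam links of this kind earn nothing for the spammer.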
