Gene Golub : June 3, 2004
Adaptative methods for updating/downdating page ranks.
Gene Golub
Thursday June 3, 10:00 a.m. at CERFACS
Abstract
Google's PageRank algorithm for web search involves computing the principal eigenvector of a stochastic matrix describing the hyperlink structure of the World Wide Web. Since the web is a highly dynamic structure, with new pages constantly being added or removed, the update problem is an important problem in web search. We address the problem of efficiently recomputing the principal eigenvector of the web hyperlink matrix after small perturbations in the link structure of the web.
We present an algorithm that is motivated by the empirical observation that the convergence patterns of web pages in the PageRank algorithm have a nonuniform distribution. Specifically, many pages converge to their true PageRank quickly, while relatively few pages take a much longer time to converge. This algorithm, called Adaptive PageRank, avoids redundant computations associated with the PageRanks of pages that have already converged.
We show empirically that Adaptive PageRank speeds up the computation of PageRank even in the standard case, and that it is much more effective in the context of updates.



