
The optimal value of Pagerank's damping factor

Péter Bruck,1 István Réthy,1 Jan Tobochnik,2 and Péter Érdi2


1. ProcessExpert Ltd, Budapest, Hungary
2. Kalamazoo College, Kalamazoo, MI, USA
Pagerank quickly became one of the most popular ranking algorithms. Page and Brin
empirically set the crucial parameter of the algorithm, the damping factor (d), to 0.85.
This value suited the WWW well but proved unsuitable for scientific citations, where the
best results were achieved with d = 0.5. To explain this difference, we selected simple
model graphs to study the correlation between the structure of the network and the most
appropriate value of d. As d increases, the relative ranking of the nodes changes. At a
given value of d the good hits suddenly disappear from the top of the list; this is called
"rank reversal". The Pagerank equation enables us to calculate the location of the rank
reversal for our model graphs: it happens at d = 1 - 1/s, where s is the number of inlinks
pointing toward the central node. Below this value of d the central node receives the
highest Pagerank, i.e. Pagerank is a proper measure of centrality. The higher the number
of inlinks, the higher the upper limit of Pagerank's applicability: if s is 2, Pagerank
can be used in the range 0 to 0.5; if s is 7, the permissible range of d for the model
graphs is 0 to about 0.85 (1 - 1/7 is approximately 0.857).
For large networks no theoretical method is available to establish the upper limit of
Pagerank's applicability. In the absence of field-proven analogies (such as the WWW or
scientific citations), only a "trial and error" approach can be used to find the location
of the rank reversal, which is the upper limit of Pagerank's applicability.
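The closed-form threshold for the model graphs and the trial-and-error scan for other networks can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the uniform handling of dangling nodes, and the toy graph are assumptions introduced here, and the toy graph is not one of the model graphs from the study.

```python
def rank_reversal_threshold(s):
    """Upper limit d = 1 - 1/s stated in the abstract for the model graphs."""
    return 1.0 - 1.0 / s


def pagerank(out_links, d, iterations=200):
    """Power iteration for the standard damped Pagerank.

    out_links maps every node to the list of nodes it links to.
    Dangling nodes spread their score uniformly over all nodes
    (a common convention, assumed here; the abstract does not specify one).
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - d) / n + d * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += d * rank[v] / len(out_links[v])
        rank = new_rank
    return rank


def top_node_by_damping(out_links, d_values):
    """Trial-and-error scan: record the top-ranked node for each value of d."""
    result = []
    for d in d_values:
        rank = pagerank(out_links, d)
        result.append((d, max(rank, key=rank.get)))
    return result


# Hypothetical toy graph: two leaves point to "c", which exchanges links with "e".
toy = {"a": ["c"], "b": ["c"], "c": ["e"], "e": ["c"]}

print(rank_reversal_threshold(2))            # 0.5
print(round(rank_reversal_threshold(7), 3))  # 0.857
for d, top in top_node_by_damping(toy, [0.1, 0.3, 0.5, 0.7, 0.9]):
    print(d, top)
```

On a real network, a change in the top-ranked node between two neighboring values of d would mark the interval to inspect more closely.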
This problem could be avoided if an alternative, more refined method of stochasticity
adjustment is used. Let us introduce a new node (O) which has an inlink from and an outlink
to every other node. This way the manual selection of the damping factor can be
avoided: when a node with n outlinks receives a new outlink, the share of the node
passed toward each neighbor is scaled by the factor n/(n+1). If n is 1, the share is
halved (a "damping" by a factor of 2); if the node has 100 neighbors, the damping is only
about 1%. In other words, the damping of the nodes is adjusted automatically, and in the
critical parts of the graph the changes in the information flow are relatively small.
Otherwise the computation exactly follows the original paper of Page and Brin; therefore
we call this solution Pagerank 2.0.
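One possible reading of this construction is sketched below. It is an illustration under stated assumptions, not the authors' implementation: the function names are invented here, no extra teleportation term is used, the node O is dropped from the final ranking and the remaining scores are renormalized, and the toy graph is hypothetical. The plain iteration also assumes the original graph has at least one edge, so that the augmented walk is aperiodic.

```python
def share_scaling(n):
    """Factor applied to each neighbor's share when a node with n outlinks
    gains one more: (1/(n+1)) / (1/n) = n/(n+1)."""
    return n / (n + 1)


def pagerank_with_hub(out_links, iterations=500, hub="O"):
    """Undamped random walk on the graph augmented with the node O.

    out_links maps every node to the list of nodes it links to.
    O links to every node and every node links to O, so no node is
    dangling and no damping factor has to be chosen.
    """
    if hub in out_links:
        raise ValueError("hub label collides with an existing node")
    nodes = sorted(out_links)
    augmented = {v: list(out_links[v]) + [hub] for v in nodes}
    augmented[hub] = list(nodes)
    all_nodes = nodes + [hub]
    rank = {v: 1.0 / len(all_nodes) for v in all_nodes}
    for _ in range(iterations):
        new_rank = {v: 0.0 for v in all_nodes}
        for v in all_nodes:
            share = rank[v] / len(augmented[v])
            for w in augmented[v]:
                new_rank[w] += share
        rank = new_rank
    # Assumption: report only the original nodes, renormalized to sum to 1.
    total = sum(rank[v] for v in nodes)
    return {v: rank[v] / total for v in nodes}


# Hypothetical toy graph: two leaves point to "c", which exchanges links with "e".
toy = {"a": ["c"], "b": ["c"], "c": ["e"], "e": ["c"]}

print(share_scaling(1))    # 0.5
print(share_scaling(100))
print(pagerank_with_hub(toy))
```

Because every node keeps a share of exactly 1/(n+1) for O, sparsely linked nodes are damped strongly and richly linked nodes only slightly, which is the automatic adjustment described above.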
In the presentation we compare the ranking results and the stability of the proposed method
with those of Pagerank on the US patent network, which has 4 million nodes and 44 million edges.
