Webgraph
The webgraph describes the directed links between pages of the World Wide Web. A graph, in general, consists of several vertices, some pairs connected by edges. In a directed graph, edges are directed lines or arcs. The webgraph is a directed graph, whose vertices correspond to the pages of the WWW, and a directed edge connects page X to page Y if there exists a hyperlink on page X, referring to page Y.
Properties
- The degree distribution of the webgraph strongly differs from the degree distribution of the classical random graph model, the Erdős–Rényi model: in the Erdős–Rényi model, there are very few large degree nodes, relative to the webgraph's degree distribution. The precise distribution is unclear, however: it is relatively well described by a lognormal distribution, as well as the Barabási–Albert model for power laws.
- The webgraph is an example of a scale-free network.
Applications
The webgraph is used for:
- computing the PageRank of the world wide web's pages;
- computing the personalized PageRank;
- detecting webpages of similar topics, through graph-theoretical properties only, like co-citation;
- and identifying hubs and authorities in the web for HITS algorithm.