All similarity measures so far are local: they count the number of common neighbors, or number of paths of length two, between two nodes. However, two nodes might be similar in a more general way. First, the fact that two nodes are connected by an edge (a path on unitary length) is certainly an indication of similarity. Moreover, similarity can also be deduced by longer paths of length larger than two. For instance, paths of length 3 between nodes $i$ and $j$ count the number of neighbors of $i$ and $j$ that are connected by an edge; paths of length 4 count the number of neighbors of $i$ and $j$ that have a common neighbor; paths of length 5 count the number of neighbors of $i$ and $j$ that have neighbors that are connected by an edge. Two nodes may have few or none neighbors in common, but they still may be similar in an indirect, global way. Paths between nodes, of any length, are hints of similarity; nevertheless, the shorter the path, the stronger the contribution to similarity.
The recursive thesis of global similarity is the following:
Two nodes are similar if one has a neighbor that is similar to the other.
Mathematically, the recursive thesis of global similarity can be written as follows: $$ \sigma_{i,j} = \alpha \sum_k A_{i,k} \sigma_{k,j} + \delta_{i,j} $$ or in matrix form as: $$ \sigma = \alpha A \sigma + I $$ Evaluating the expression by iterating starting from $\sigma^{0} = 0$ we get: $$ \begin{array}{lcl} \sigma^{(1)} & = & I \\ \sigma^{(2)} & = & \alpha A + I \\ \sigma^{(3)} & = & \alpha^2 A^2 + \alpha A + I \\ & \ldots & \end{array} $$ Hence, $\sigma^{(k)}$ counts the contribution of paths of length less than $k$ and the contribution of a path of length $k$ is weighted by the attenuation factor $\alpha^k$, as soon as $\alpha < 1$. Notice that the contribution of the identity matrix $I$ is that each node is similar to itself; this is necessary to propagare similarity through network using the network topology represented by the adjacency matrix $A$. If follows that the similarity matrix $\sigma$ is the following: $$ \sigma = \sum_{k = 0}^{\infty} (\alpha A)^k = (I - \alpha A)^{-1} $$ This solution is reminiscent of Katz centrality; indeed, Katz centrality of node $i$ is exactly the sum of similarities of $i$ with all other nodes. Hence, a node is central in Katz view if it is similar to many other vertices. As with Katz centrality, the parameter $\alpha$ must be chosen less then $1 / \rho(A)$, with $\rho(A)$ the spectral radius of $A$.
As with local similarity, it is useful to normalize similarity with respect to its expected value if nodes were to choose its neighbors at random. In such a case, the expected number of paths between two nodes $i$ and $j$ is proportional to the product of their degrees $k_i k_j$, hence a normalized version of global similarity is: $$ \sigma_{i,j} = \frac{1}{k_i k_j} (I - \alpha A)^{-1}_{i,j} $$ or in matrix form as: $$ \sigma = D^{-1} (I - \alpha A)^{-1} D^{-1} $$ High values are given to pair of nodes that, independently of their degrees, are similar, while low values correspond to pairs of nodes that are dissimilar.