Spectral maximization

This method maximizes the modularity of a partition by exploiting the spectral properties of the modularity matrix; it is implemented in igraph in the function leading.eigenvector.community. Recall that modularity is defined as: $$ Q = \frac{1}{2m} \sum_{i,j} \left(A_{i,j} - \frac{k_i k_j}{2m}\right) \delta(c_i, c_j) = \frac{1}{2m} \sum_{i,j} B_{i,j} \delta(c_i, c_j) $$ where $B_{i,j} = A_{i,j} - \frac{k_i k_j}{2m}$ defines the modularity matrix $B$, $A$ is the adjacency matrix, $k_i$ is the degree of node $i$, $m$ is the number of edges, and $\delta(c_i, c_j) = 1$ if nodes $i$ and $j$ belong to the same community and $0$ otherwise.

Notice that the modularity matrix $B$ is symmetric and that all of its rows and columns sum to $0$. Indeed: $$ \sum_j B_{i,j} = \sum_j \left(A_{i,j} - \frac{k_i k_j}{2m}\right) = k_i - \frac{k_i}{2m} \sum_j k_j = k_i - \frac{k_i}{2m}\, 2m = 0 $$
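As a quick sanity check, the construction of $B$ and its zero row and column sums can be verified numerically. The graph below (two triangles joined by a single edge) is an assumed toy example, not one taken from the text:

```python
import numpy as np

# Assumed toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1

k = A.sum(axis=1)                      # degree vector k_i
m = A.sum() / 2                        # number of edges
B = A - np.outer(k, k) / (2 * m)       # modularity matrix

assert np.allclose(B, B.T)             # B is symmetric
assert np.allclose(B.sum(axis=1), 0)   # every row sums to zero
assert np.allclose(B.sum(axis=0), 0)   # every column sums to zero
```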

Let us consider the division of a network into just two parts, namely group 1 and group 2. The more general case can be treated with repeated bisection, as described for the Kernighan-Lin maximization. We represent the assignment of node $i$ to a group with the variable $s_i$, such that $s_i = 1$ if $i$ belongs to group 1 and $s_i = -1$ if $i$ belongs to group 2. It follows that: $$\delta(c_i, c_j) = \frac{1}{2} (s_i s_j + 1)$$ Substituting this expression in the modularity formula we have: $$ Q = \frac{1}{4m} \sum_{i,j} B_{i,j} (s_i s_j + 1) = \frac{1}{4m} \sum_{i,j} B_{i,j} s_i s_j $$ where we have used the fact that $\sum_{i,j} B_{i,j} = 0$. In matrix form we have: $$ Q = \frac{1}{4m} s^{T} B s $$ We wish to find the division of a given network, that is, the value of $s$, that maximizes the modularity $Q$. The elements of $s$ are constrained to the integer values $1$ and $-1$, so the vector always points to one of the corners of an $n$-dimensional hypercube.
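To see the equivalence of the two forms of $Q$ concretely, here is a small sketch; the two-triangle toy graph and the chosen split are illustrative assumptions:

```python
import numpy as np

# Assumed toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
k = A.sum(axis=1)
m = A.sum() / 2
B = A - np.outer(k, k) / (2 * m)

s = np.array([1, 1, 1, -1, -1, -1])    # put each triangle in its own group

# Matrix form Q = s^T B s / 4m ...
Q_matrix = s @ B @ s / (4 * m)
# ... agrees with the original delta-function definition of modularity.
Q_delta = sum(B[i, j] for i in range(n) for j in range(n)
              if s[i] == s[j]) / (2 * m)
```

For this split both expressions evaluate to $5/14 \approx 0.357$.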

Unfortunately, this optimization problem is a hard one, but it can be tackled approximately by a relaxation method. We relax the constraint that $s$ must point to a corner of the hypercube and allow it to point in any direction, while keeping its length the same: its elements can take any real values, subject only to the constraint that $$s^{T} s = \sum_i s_{i}^{2} = n$$

We maximize the modularity by differentiating, imposing the constraint with a single Lagrange multiplier $\beta$: $$ \frac{\partial}{\partial s_i} \left[ \sum_{j,k} B_{j,k} s_j s_k + \beta \left(n - \sum_j s_{j}^{2}\right) \right] = 0 $$ That is: $$ \sum_{j} B_{i,j} s_j + \sum_{j} B_{j,i} s_j - 2 \beta s_i = 2 \sum_{j} B_{i,j} s_j - 2 \beta s_i = 0 $$ where the middle step uses the symmetry of $B$. This gives: $$ \sum_{j} B_{i,j} s_j = \beta s_i $$ or, in matrix notation: $$ B s = \beta s $$ Hence the solution $s$ is an eigenvector of the modularity matrix, and the corresponding modularity is: $$ Q = \frac{1}{4m} s^{T} B s = \frac{1}{4m} \beta\, s^{T} s = \frac{n}{4m} \beta $$ which leads us to the conclusion that for maximum modularity we have to choose $s$ as the eigenvector $u$ corresponding to the largest (most positive) eigenvalue of the modularity matrix.
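A short numerical sketch of this conclusion, again on an assumed toy graph: the leading eigenvector of $B$, rescaled so that its squared length is $n$, satisfies the eigenvalue relation and attains the relaxed modularity $n\beta/4m$:

```python
import numpy as np

# Assumed toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
k = A.sum(axis=1)
m = A.sum() / 2
B = A - np.outer(k, k) / (2 * m)

eigvals, eigvecs = np.linalg.eigh(B)   # eigh: ascending order, B symmetric
beta = eigvals[-1]                     # largest (most positive) eigenvalue
u = eigvecs[:, -1]                     # its eigenvector (unit length)
s = u * np.sqrt(n)                     # rescale so that s^T s = n

Q_relaxed = s @ B @ s / (4 * m)        # equals n * beta / 4m
```

As expected, the relaxed optimum is an upper bound: here $Q_{\text{relaxed}} \approx 0.371$, slightly above the best integer split's $5/14 \approx 0.357$.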

In fact we typically cannot choose $s = u$, since the elements of $s$ are subject to the constraint $s_i \in \{+1, -1\}$. Instead we do the best we can and choose $s$ as close to $u$ as possible: $s_i = +1$ if $u_i > 0$ and $s_i = -1$ if $u_i < 0$ (either value of $s_i$ is equally good if $u_i = 0$).

And so we are led to the following very simple algorithm. We calculate the eigenvector of the modularity matrix corresponding to the largest (most positive) eigenvalue and then assign vertices to communities according to the signs of the vector elements: vertices with positive elements go in one group and vertices with negative elements in the other. If all elements of the eigenvector have the same sign, the best bisection is to leave all vertices in a single group, which means the network has no underlying community structure and is indivisible.
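The algorithm above can be sketched in a few lines; the toy graph, the function name `spectral_bisection`, and the indivisibility check are illustrative assumptions, not igraph's actual implementation:

```python
import numpy as np

def spectral_bisection(A):
    """Split a graph in two by the signs of the leading eigenvector of B."""
    k = A.sum(axis=1)
    m = A.sum() / 2
    B = A - np.outer(k, k) / (2 * m)     # modularity matrix
    eigvals, eigvecs = np.linalg.eigh(B) # ascending eigenvalue order
    u = eigvecs[:, -1]                   # leading eigenvector
    s = np.where(u >= 0, 1, -1)          # round to the nearest hypercube corner
    if np.all(s == s[0]):                # all one sign: network is indivisible
        return None
    return s

# Assumed toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1

s = spectral_bisection(A)   # the two triangles land in opposite groups
```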

One potential problem with the algorithm is that the matrix $B$ is not sparse; indeed, it usually has all elements non-zero. Finding the leading eigenvector of a matrix takes time $O(mn)$, which is equivalent to $O(n^3)$ for a dense matrix, as opposed to $O(n^2)$ for a sparse one. In fact, however, by exploiting special properties of the modularity matrix it is still possible to find the eigenvector in time $O(n^2)$ on a sparse network: the product $Bv = Av - \frac{k (k^{T} v)}{2m}$, where $k$ is the degree vector, can be evaluated in $O(m + n)$ time per iteration without ever constructing $B$ explicitly. Overall, this means that the spectral method is about as fast as, but not significantly faster than, the vertex-moving algorithm of Kernighan-Lin. There is, however, merit to having both algorithms. Given that all practical modularity maximizing algorithms are merely heuristics - clever perhaps, but by no means guaranteed to perform well in all cases - having more than one fast algorithm in our toolkit is always a good thing.
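The sparse trick can be sketched as follows: the product $Bv$ is assembled edge by edge in $O(m+n)$ time, and a shifted power iteration recovers the leading eigenvector without ever building $B$. The shift bound is a crude assumption, chosen only to make the most positive eigenvalue the dominant one; the toy graph is again illustrative:

```python
import numpy as np

# Assumed toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
adj = {i: [] for i in range(n)}          # sparse adjacency lists
for i, j in edges:
    adj[i].append(j)
    adj[j].append(i)
k = np.array([len(adj[i]) for i in range(n)], dtype=float)
m = k.sum() / 2

def B_times(v):
    """Compute B v = A v - k (k . v) / 2m in O(m + n) time."""
    Av = np.array([sum(v[j] for j in adj[i]) for i in range(n)])
    return Av - k * (k @ v) / (2 * m)

# Shifted power iteration: the shift is an assumed rough bound on the
# magnitude of B's spectrum, so all shifted eigenvalues are positive and
# the most positive eigenvalue of B becomes dominant.
shift = k.max() + (k @ k) / (2 * m)
rng = np.random.default_rng(0)
v = rng.standard_normal(n)
for _ in range(500):
    w = B_times(v) + shift * v
    v = w / np.linalg.norm(w)

beta = v @ B_times(v)   # Rayleigh quotient: the leading eigenvalue of B
```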