In the mathematical discipline of graph theory, the expander walk sampling theorem intuitively states that sampling vertices in an expander graph by doing a relatively short random walk can simulate sampling the vertices independently from a uniform distribution. The earliest version of this theorem is due to Ajtai, Komlós & Szemerédi (1987), and the more general version is typically attributed to Gillman (1998).
Statement
Let $G=(V,E)$ be an $n$-vertex expander graph with positively weighted edges, and let $A\subset V$. Let $P$ denote the stochastic matrix of the graph, and let $\lambda_{2}$ be the second largest eigenvalue of $P$. Let $y_{0},y_{1},\ldots,y_{k-1}$ denote the vertices encountered in a $(k-1)$-step random walk on $G$ starting at vertex $y_{0}$, and let

$$\pi(A):=\lim_{k\rightarrow\infty}\frac{1}{k}\sum_{i=0}^{k-1}\mathbf{1}_{A}(y_{i}),$$

where $\mathbf{1}_{A}(y)=1$ if $y\in A$ and $0$ otherwise. (It is well known that for almost all trajectories $y_{0},y_{1},\ldots$ this average converges to the limiting value $\pi(A)$ as $k\rightarrow\infty$.)
The theorem states that for a weighted graph $G=(V,E)$ and a random walk $y_{0},y_{1},\ldots,y_{k-1}$ where $y_{0}$ is chosen by an initial distribution $\mathbf{q}$, we have the following bound for all $\gamma>0$:

$$\Pr\left[\left|\frac{1}{k}\sum_{i=0}^{k-1}\mathbf{1}_{A}(y_{i})-\pi(A)\right|\geq\gamma\right]\leq Ce^{-\frac{1}{20}\gamma^{2}(1-\lambda_{2})k},$$

where $C$ is a constant depending on $\mathbf{q}$ and $G$.
The theorem gives a bound on the rate of convergence to $\pi(A)$ with respect to the length of the random walk, hence giving a more efficient method to estimate $\pi(A)$ than sampling the vertices of $G$ independently.
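As a numerical sanity check, the following sketch (not from the source; the graph and all parameters are hypothetical, chosen ad hoc) simulates walks on a small 8-regular circulant graph and compares the empirical deviation probability with the theorem's exponential factor. On graphs this small the bound holds but is loose, since the constant $\frac{1}{20}$ in the exponent is far from tight.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example graph: an 8-regular circulant on n = 64 vertices.
n, offsets = 64, [1, -1, 3, -3, 9, -9, 27, -27]
P = np.zeros((n, n))
for v in range(n):
    for s in offsets:
        P[v, (v + s) % n] += 1 / len(offsets)   # uniform random-walk matrix

lam2 = np.sort(np.linalg.eigvalsh(P))[-2]       # second largest eigenvalue (P is symmetric)

A = set(range(n // 4))                          # target set with pi(A) = 1/4
piA, gamma, k, trials = 0.25, 0.25, 400, 500

deviations = 0
for _ in range(trials):
    y, hits = rng.integers(n), 0                # y0 ~ uniform initial distribution q
    for _ in range(k):
        hits += y in A
        y = (y + rng.choice(offsets)) % n       # one step of the random walk
    deviations += abs(hits / k - piA) >= gamma

print(f"lambda_2:                        {lam2:.3f}")
print(f"empirical deviation probability: {deviations / trials:.4f}")
print(f"theorem's exponential factor:    {np.exp(-gamma**2 * (1 - lam2) * k / 20):.4f}")
```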
Proof
In order to prove the theorem, we provide a few definitions followed by three lemmas.
Let $w_{xy}$ be the weight of the edge $xy$ and let $w_{x}=\sum_{y:xy\in E}w_{xy}$. Denote by $\pi(x):=w_{x}/\sum_{y\in V}w_{y}$ the stationary distribution of the walk, so that $\pi(A)=\sum_{x\in A}\pi(x)$. Let $\frac{\mathbf{q}}{\sqrt{\pi}}$ be the vector with entries $\frac{\mathbf{q}(x)}{\sqrt{\pi(x)}}$, and let $N_{\pi,\mathbf{q}}=\left\|\frac{\mathbf{q}}{\sqrt{\pi}}\right\|_{2}$.
Let $D=\operatorname{diag}(1/w_{i})$ and $M=(w_{ij})$. Let $P(r)=PE_{r}$, where $P=DM$ is the stochastic matrix, $E_{r}=\operatorname{diag}(e^{r\mathbf{1}_{A}})$ and $r\geq 0$. Then:

$$P=\sqrt{D}\,S\,\sqrt{D^{-1}}\qquad\text{and}\qquad P(r)=\sqrt{DE_{r}^{-1}}\,S(r)\,\sqrt{(DE_{r}^{-1})^{-1}},$$

where $S:=\sqrt{D}M\sqrt{D}$ and $S(r):=\sqrt{DE_{r}}\,M\,\sqrt{DE_{r}}$. As $S$ and $S(r)$ are symmetric, they have real eigenvalues. Therefore, as the eigenvalues of $S(r)$ and $P(r)$ are equal, the eigenvalues of $P(r)$ are real. Let $\lambda(r)$ and $\lambda_{2}(r)$ be the first and second largest eigenvalues of $P(r)$, respectively.
For convenience of notation, let $t_{k}=\frac{1}{k}\sum_{i=0}^{k-1}\mathbf{1}_{A}(y_{i})$, $\epsilon=1-\lambda_{2}$, $\epsilon_{r}=\lambda(r)-\lambda_{2}(r)$, and let $\mathbf{1}$ be the all-ones vector.
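The decomposition above can be spot-checked numerically. Below is a minimal sketch, assuming an arbitrary symmetric weight matrix on six vertices (hypothetical data, not from the source); it verifies that $P=\sqrt{D}\,S\,\sqrt{D^{-1}}$ and that $P(r)$ and $S(r)$ have the same, real, spectrum.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary symmetric positive edge weights on 6 vertices (self-loops allowed).
n = 6
M = rng.uniform(0.1, 1.0, (n, n))
M = (M + M.T) / 2                        # M = (w_ij), symmetric weights
w = M.sum(axis=1)                        # w_x = sum_y w_xy
D = np.diag(1 / w)                       # D = diag(1/w_i)
P = D @ M                                # stochastic matrix P

r = 0.3
in_A = np.array([1, 1, 0, 0, 0, 0])      # indicator vector of a set A
E_r = np.diag(np.exp(r * in_A))          # E_r = diag(e^{r 1_A})

S = np.sqrt(D) @ M @ np.sqrt(D)                  # S = sqrt(D) M sqrt(D), symmetric
S_r = np.sqrt(D @ E_r) @ M @ np.sqrt(D @ E_r)    # S(r), symmetric

# P = sqrt(D) S sqrt(D^{-1})
assert np.allclose(P, np.sqrt(D) @ S @ np.sqrt(np.diag(w)))
# spectrum of P(r) = P E_r equals the (real) spectrum of S(r)
assert np.allclose(np.sort(np.linalg.eigvals(P @ E_r).real),
                   np.sort(np.linalg.eigvalsh(S_r)))
```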
Lemma 1
$$\Pr\left[t_{k}-\pi(A)\geq\gamma\right]\leq e^{-rk(\pi(A)+\gamma)+k\log\lambda(r)}\,(\mathbf{q}P(r)^{k}\mathbf{1})/\lambda(r)^{k}$$
Proof:
By Markov's inequality,

$$\Pr\left[t_{k}\geq\pi(A)+\gamma\right]=\Pr\left[e^{rkt_{k}}\geq e^{rk(\pi(A)+\gamma)}\right]\leq e^{-rk(\pi(A)+\gamma)}E_{\mathbf{q}}e^{rkt_{k}},$$

where $E_{\mathbf{q}}e^{rkt_{k}}$ is the expectation of $e^{rkt_{k}}$ with $y_{0}$ chosen according to the probability distribution $\mathbf{q}$. This expectation can be computed by summing over all possible trajectories $y_{0},y_{1},\ldots,y_{k}$ (indexing the $k$ sampled vertices of the walk as $y_{1},\ldots,y_{k}$):

$$E_{\mathbf{q}}e^{rkt_{k}}=\sum_{y_{0},\ldots,y_{k}}\mathbf{q}(y_{0})\prod_{i=1}^{k}P_{y_{i-1}y_{i}}e^{r\mathbf{1}_{A}(y_{i})}=\mathbf{q}P(r)^{k}\mathbf{1}.$$

Combining the two results proves the lemma.
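The trajectory-sum identity can be verified by brute force on a tiny example. A sketch with a hypothetical 3-vertex weighted graph (all numbers are illustrative):

```python
import itertools
import numpy as np

# Tiny hypothetical 3-vertex weighted graph, A = {0}, k = 4 steps.
M = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])
P = M / M.sum(axis=1, keepdims=True)   # stochastic matrix
q = np.array([0.5, 0.3, 0.2])          # initial distribution
r, k = 0.7, 4
in_A = np.array([1, 0, 0])
E_r = np.diag(np.exp(r * in_A))

# Left side: sum over all trajectories y_0, ..., y_k of
#   q(y_0) * prod_{i=1}^{k} P[y_{i-1}, y_i] * e^{r 1_A(y_i)}
lhs = 0.0
for traj in itertools.product(range(3), repeat=k + 1):
    term = q[traj[0]]
    for i in range(1, k + 1):
        term *= P[traj[i - 1], traj[i]] * np.exp(r * in_A[traj[i]])
    lhs += term

# Right side: q P(r)^k 1 with P(r) = P E_r.
rhs = q @ np.linalg.matrix_power(P @ E_r, k) @ np.ones(3)

assert np.isclose(lhs, rhs)
```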
Lemma 2
For $0\leq r\leq 1$,

$$\mathbf{q}P(r)^{k}\mathbf{1}\leq(1+r)\,N_{\pi,\mathbf{q}}\,\lambda(r)^{k}$$
Proof:
As the eigenvalues of $P(r)$ and $S(r)$ are equal,

$$\mathbf{q}P(r)^{k}\mathbf{1}=\mathbf{q}\sqrt{DE_{r}^{-1}}\,S(r)^{k}\,\sqrt{(DE_{r}^{-1})^{-1}}\,\mathbf{1}\leq\left\|\mathbf{q}\sqrt{DE_{r}^{-1}}\right\|_{2}\lambda(r)^{k}\left\|\sqrt{(DE_{r}^{-1})^{-1}}\,\mathbf{1}\right\|_{2}\leq e^{r/2}N_{\pi,\mathbf{q}}\,\lambda(r)^{k}\leq(1+r)N_{\pi,\mathbf{q}}\,\lambda(r)^{k}.$$
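Lemma 2 can likewise be spot-checked numerically; the sketch below (hypothetical random weights and an arbitrary initial distribution, not from the source) evaluates both sides of the inequality for a few values of $r$ and $k$:

```python
import numpy as np

rng = np.random.default_rng(2)

n = 8
M = rng.uniform(0.1, 1.0, (n, n))
M = (M + M.T) / 2                      # symmetric edge weights
w = M.sum(axis=1)
P = M / w[:, None]                     # stochastic matrix P = DM
pi = w / w.sum()                       # stationary distribution
q = rng.dirichlet(np.ones(n))          # arbitrary initial distribution
N = np.linalg.norm(q / np.sqrt(pi))    # N_{pi,q} = ||q / sqrt(pi)||_2

in_A = (np.arange(n) < 3).astype(float)
for r in (0.1, 0.5, 1.0):
    E_r = np.diag(np.exp(r * in_A))
    DEr = np.diag(1 / w) @ E_r
    lam_r = np.linalg.eigvalsh(np.sqrt(DEr) @ M @ np.sqrt(DEr))[-1]  # lambda(r)
    for k in (5, 20, 50):
        lhs = q @ np.linalg.matrix_power(P @ E_r, k) @ np.ones(n)
        assert lhs <= (1 + r) * N * lam_r ** k   # Lemma 2
```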
Lemma 3
If $r$ is a real number such that $0\leq r\leq\epsilon/10$, then

$$\log\lambda(r)\leq r\pi(A)+\frac{5r^{2}}{\epsilon}$$
Proof summary:
We Taylor expand $m(r):=\log\lambda(r)$ about the point $r=0$ to get:

$$m(r)=r\,m'(0)+\frac{r^{2}}{2}m''(r_{0})\qquad\text{for some }r_{0}\in[0,r],$$

where $m'(0)$ and $m''(r_{0})$ are the first and second derivatives of $m$ at $0$ and at $r_{0}$, and the constant term vanishes because $\lambda(0)=1$. We show that $m'(0)=\lambda'(0)=\pi(A)$. We then prove that (i) the spectral gap $\epsilon_{r}$ of $P(r)$ satisfies $\epsilon_{r}\geq\epsilon/2$ by matrix manipulation, and then prove that (ii) $|m''(r_{0})|\leq 10/\epsilon$ using (i) and Cauchy's estimate from complex analysis. The results combine to show that

$$m(r)=\log\lambda(r)\leq r\pi(A)+\frac{5r^{2}}{\epsilon}.$$
- A line-by-line proof can be found in Gillman (1998).[1]
Proof of theorem
Combining Lemma 2 and Lemma 3 with Lemma 1, we get that

$$\Pr[t_{k}-\pi(A)\geq\gamma]\leq(1+r)N_{\pi,\mathbf{q}}e^{-k(r\gamma-5r^{2}/\epsilon)}.$$

Interpreting the exponent on the right-hand side of the inequality as a quadratic in $r$ and minimising the expression over $r$, we see that

$$\Pr[t_{k}-\pi(A)\geq\gamma]\leq(1+\gamma\epsilon/10)N_{\pi,\mathbf{q}}e^{-k\gamma^{2}\epsilon/20}.$$

A similar bound

$$\Pr[t_{k}-\pi(A)\leq-\gamma]\leq(1+\gamma\epsilon/10)N_{\pi,\mathbf{q}}e^{-k\gamma^{2}\epsilon/20}$$

holds, hence setting $C=2(1+\gamma\epsilon/10)N_{\pi,\mathbf{q}}$ and recalling that $\epsilon=1-\lambda_{2}$ gives the desired result.
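Explicitly, the minimisation is a one-line computation: the exponent $f(r)=r\gamma-5r^{2}/\epsilon$ is maximised where $f'(r)=\gamma-10r/\epsilon=0$, i.e. at $r=\gamma\epsilon/10$, and

$$f\left(\frac{\gamma\epsilon}{10}\right)=\frac{\gamma^{2}\epsilon}{10}-\frac{5}{\epsilon}\cdot\frac{\gamma^{2}\epsilon^{2}}{100}=\frac{\gamma^{2}\epsilon}{20},\qquad 1+r=1+\frac{\gamma\epsilon}{10}.$$

Note that $r=\gamma\epsilon/10\leq\epsilon/10$ whenever $\gamma\leq 1$, so Lemma 3 applies.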
Uses
This theorem is useful in randomness reduction and the study of derandomization, showing that random walks on expanders are good pseudorandom generators against various classes of test functions.[2] Sampling from an expander walk is an example of a randomness-efficient sampler. Note that the number of random bits used in sampling $k$ independent vertices of an $n$-vertex graph is $k\log n$, whereas if we sample from an infinite family of constant-degree expanders this costs only $\log n+O(k)$. Such families exist and are efficiently constructible, e.g. the Ramanujan graphs of Lubotzky-Phillips-Sarnak.
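As a rough illustration of this bit count (a sketch assuming a $d$-regular expander family; the parameters are hypothetical):

```python
import math

def independent_sample_bits(n: int, k: int) -> int:
    """Bits needed to draw k independent uniform vertices of an n-vertex graph."""
    return k * math.ceil(math.log2(n))

def expander_walk_bits(n: int, k: int, d: int) -> int:
    """Bits for a k-vertex expander-walk sample on a d-regular graph:
    log n bits for the start vertex, then log d bits per step."""
    return math.ceil(math.log2(n)) + (k - 1) * math.ceil(math.log2(d))

n, k, d = 2**20, 100, 8
print(independent_sample_bits(n, k))   # 2000 bits: k log n
print(expander_walk_bits(n, k, d))     # 317 bits: log n + O(k)
```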
References