Convergent subsequences and diagonalization

Quick description

A topological space is sequentially compact if every sequence has a convergent subsequence. One form of the Bolzano-Weierstrass theorem states that a closed bounded subset of $\mathbb{R}^n$ is sequentially compact. More generally, compact metric spaces are sequentially compact. These facts have many applications. Also, some useful diagonalization techniques can be interpreted as saying that certain topological spaces are sequentially compact – as a result, one often hears the phrase "by compactness" when no topology has been specifically mentioned.

A common application of compactness is to use the compactness and contradiction method to derive finitary quantitative results from infinitary qualitative ones.

Prerequisites

Basic notions of real analysis up to and including the definition of compactness.

The article on how to use the Bolzano-Weierstrass theorem contains further examples of the use of convergent subsequences.

Example 1

Let us define a binary word to be a finite sequence of $0$ s and $1$ s. Given two words $W_1$ and $W_2,$ say that $W_1$ is an initial segment of $W_2$ if $W_1$ has length $m$ , $W_2$ has length $n\geq m$ , and the first $m$ terms in the sequence $W_2$ give you $W_1.$ For example, $001101011$ is an initial segment of $00110101110010110.$

Suppose now that you are given an infinite set $\mathcal{W}$ of words. Is it always possible to find an infinite sequence of $0$ s and $1$ s such that every initial segment of that infinite sequence is an initial segment of a word in $\mathcal{W}$ ? The answer is yes, and the proof is typical of a certain kind of compactness/diagonalization argument.

To prove it, we argue as follows. In our set $\mathcal{W}$ there must either be infinitely many words that start with a $0$ or infinitely many words that start with a $1.$ If there are infinitely many words that start with a $0,$ then there must be infinitely many words that start $00$ or else infinitely many words that start $01.$ Similarly, if there are infinitely many words that start with a $1,$ then there must be infinitely many words that start $10$ or infinitely many words that start $11.$ We can continue in this way, building up inductively a $01$ -sequence $\epsilon_1,\epsilon_2,\dots$ such that for every $n$ there are infinitely many words in $\mathcal{W}$ that begin $\epsilon_1,\dots,\epsilon_n.$ This sequence has the desired property.

General discussion

If you are familiar with the proof of the Bolzano-Weierstrass theorem, then you will recognise that the above argument is essentially the same.

Example 2

Let $T$ be an infinite tree such that every vertex has finite degree. Let $x$ be a vertex of $T.$ Must there be an infinite path that starts at $x$ ?

Again, the answer is yes, and this is a fact known as König's lemma. The proof is very similar to the proof in the first example. Let us write $x_0=x.$ Then we can partition the other vertices of $T$ according to which neighbour of $x$ you have to go through to get to them. So there must be a neighbour $x_1$ of $x$ such that $x$ is connected to infinitely many vertices of $T$ via a path that has to go through $x_1.$ Let $T_1$ be the set of vertices that can be reached from $x$ only by going through $x_1.$ Then we can find $x_2$ such that there are infinitely many vertices in $T_1$ that can be reached from $x_1$ only by going through $x_2,$ and so on. This gives us an inductive construction of an infinite path in $T.$

General discussion

This second example is a generalization of the first, which can be reinterpreted as a statement about binary trees. (To form a tree out of $\mathcal{W},$ take an infinite binary tree and label its vertices as finite $01$ -sequences in the obvious way. Then take all vertices that correspond to initial segments of words in $\mathcal{W}.$ )

Example 3

Here is a typical argument that would often be summed up by experts with the phrase "by compactness". One way of stating Van der Waerden's theorem is to say that for any two positive integers $k$ and $r,$ if you colour the positive integers with $r$ colours then there will always be an arithmetic progression of length $k$ consisting of numbers of just one colour. (Such a progression is called monochromatic.) Another way of stating the theorem is "more finite": for every $k$ and every $r$ there exists $N$ such that if you colour the integers $1,2,\dots,N$ with $r$ colours then there will always be a monochromatic arithmetic progression of length $k.$

It is easy to see that the second formulation implies the first. But what is interesting is that the first implies the second – by compactness. What does this mean? To answer this question, let us first see the proof and then what it has to do with compactness.

First proof. Suppose that the second formulation is false. Then for every $N$ we can colour the integers from $1$ to $N$ with $r$ colours in such a way that there is no monochromatic arithmetic progression of length $k.$ Let us be slightly formal, and call this colouring $c_N,$ regarding $c_N$ as a function from $\{1,2,\dots,N\}$ to $\{1,2,\dots,r\}.$ Thus, $\{1,2,\dots,r\}$ is our set of "colours" and $c_N$ assigns colours to the integers from $1$ to $N.$

We now want to put together the colourings $c_N$ to create a new colouring $c,$ defined on the whole of $\mathbb{N},$ in such a way that there is no monochromatic arithmetic progression of length $k.$ This will contradict the first statement of van der Waerden's theorem and thereby give us the (contrapositive of the) deduction we wanted.

To do this, we first find infinitely many $N$ such that $c_N(1)$ is the same. We set $X_1$ to be the set of all these $N,$ and we define $c(1)$ to be the value taken by $c_N(1)$ whenever $N\in X_1.$

Next, we find an infinite subset $X_2\subset X_1$ such that $c_N(2)$ is the same for every $N\in X_2$ and call the common value $c(2).$ And we continue like this to define our colouring $c.$

We must check that there is no monochromatic arithmetic progression of length $k.$ But suppose that $P$ is an arithmetic progression of length $k$ and that $c(m)=j$ for every $m\in P.$ Let $M$ be the maximum element of $P.$ Then for every $n\leq M$ and every $N\in X_M,$ our construction above gives us that $c_N(n)=c(n).$ Therefore, there must be some $N$ such that $c_N(m)=j$ for every $m\in P,$ which contradicts our choice of $c_N$ as a colouring with no monochromatic progressions of length $k.$

Second proof. While we are at it, it is worth seeing that the deduction is also a consequence of König's lemma. Indeed, the finite colourings of $\{1,2,\dots,N\}$ can be regarded as vertices of a tree, where the neighbours of a colouring $\kappa$ of $\{1,2,\dots,N\}$ are all colourings $\lambda$ of $\{1,2,\dots,N+1\}$ that agree with $\kappa$ up to $N,$ as well as the restriction of $\kappa$ to $\{1,2,\dots,N-1\}.$ Now form a subtree by taking the colourings $c_N$ from the previous proof, as well as all their restrictions (which also do not contain any monochromatic arithmetic progressions. Then find an infinite path of such colourings, which naturally defines a colouring of $\mathbb{N}$ with no monochromatic arithmetic progression of length $k.$

General discussion

Relationship with compactness

So where does compactness come into it? Well, a colouring $\mathbb{N}\rightarrow\{1,2,\dots,r\}$ can be thought of as an element of the infinite product $\{1,2,\dots,r\}^{\mathbb{N}},$ and by Tychonoff's theorem this product is compact in the product topology. Recall that a basic open neighbourhood of a function $\mathbb{N}\rightarrow\{1,2,\dots,r\}$ will be a set of the form $\kappa(1)=c(1),\dots,\kappa(n)=c(n)\}$ for some $n.$ That is two colourings are thought of as "close" if they agree on the first $n$ integers for some large $n.$ In fact, it is not just compact but sequentially compact, which is essentially what we were showing in the first proof above. We say "essentially" here because that proof constructs a limit colouring $c$ from a collection of colourings $c_N$ of finite sets, but we could if we wanted have defined $c_N$ in some arbitrary way for integers greater than $N$ (e.g. by defining $c_N(n)$ to be 1 whenever $n>N$ ) and the proof would have been the same.

Why do we call this a compactness argument rather than a sequential-compactness argument? Simply because the product topology on $\{1,2,\dots,r\}^{\mathbb{N}}$ is metrizable (this is a nice exercise) and compactness and sequential compactness are equivalent for metric spaces.

Relationship with ultrafilters

Arguments such as the ones we have just seen can often be proved with the help of ultrafilters. See the article on how to use ultrafilters for the necessary background to this remark. In the proof above, we could instead have defined a colouring $c$ as follows. Let $\mathcal{U}$ be a non-principal ultrafilter. Then for each $n$ there is a unique $j$ such that $c_N(n)=j\}$ belongs to $\mathcal{U}.$ We define $c(n)$ to be this $j.$ Then if $c(m)=j$ for every $m$ in some finite arithmetic progression $P,$ the set of $N$ such that $c_N(m)=j$ for every $m\in P$ belongs to $\mathcal{U}$ and is therefore non-empty, which gives us the contradiction we had above in a slicker way.

Essentially what the ultrafilter does here is decide for us how to define $c(n)$ when we have a choice. (However, note that although the other proof appeared to use the axiom of countable choice, it was in fact easy to make the infinitely many choices needed in an explicit way, such as choosing the smallest possible $j$ at each stage and making each $X_N$ as large as possible.) It tends to be easy to translate back and forth between ultrafilter arguments of this basic kind and diagonalization arguments. (However, it becomes less routine when one uses ultrafilters with special properties such as being idempotent.)

Lack of quantitative bounds

An interesting feature of compactness arguments is that they give no information about bounds. The next two examples give two rather dramatic illustrations of this. The first is an easy compactness argument that proves that a certain function exists, but the function is known to grow so fast that it cannot be proved to exist in Peano arithmetic. The second is another easy compactness argument that proves that a function exists, but finding any sort of bound for the function is an open problem.

Example 4

Let $\mathbb{N}^{(k)}$ stand for the set of all sets of $k$ integers. Let us colour $\mathbb{N}^{(k)}$ with $r$ colours. Then the infinite Ramsey theorem implies that there is an infinite set $X\subset\mathbb{N}$ such that every subset of $X$ of size $k$ has the same colour. From this we can extract, for any $t$ , a finite set $Z\subset X$ with the property that the cardinality of $Z$ is at least as large as the minimal element of $Z$ , and also at least as large as $t,$ such that every subset of $Z$ of size $k$ has the same colour.

This may seem a rather uninteresting observation, but we can now use a standard compactness argument to prove that for every $k,$ $r$ and $t$ there must exist $N$ such that for any $r$ -colouring of the subsets of $\{1,2,\dots,N\}$ of size $k$ there is a subset $Z\subset\{1,2,\dots,N\}$ with $|Z|\geq t$ and $|Z|\geq\min Z,$ such that every subset of $Z$ of size $k$ has the same colour. The proof is just like the first proof in Example 3: one takes a sequence of purported counterexamples and lets them converge pointwise to a limiting colouring that contradicts the infinitary result that we deduced from Ramsey's theorem.

The interest in this result is that the dependence of $N$ on $r,k$ and $t$ is known to be so vast (because there is a clever way of constructing colourings that give rise to enormous lower bounds) that the resulting function cannot be described in Peano arithmetic. This means that the theorem itself cannot be proved in Peano arithmetic: in a certain sense we were forced to use the axiom of infinity to prove it. This fact, first shown by Paris and Harrington, was philosophically significant because it showed that there were unprovable statements that were far less artificial than the statement cooked up by Gödel in his incompleteness theorem.

Example 5

Let $G$ be a finite graph with maximum degree $d,$ and let $r$ be an integer. The $r$ -neighbourhood of a vertex $x$ is the set of all points that can be reached from $x$ by a path of length at most $r.$ If you pick a random vertex of $G,$ then you can look at its $r$ -neighbourhood as a graph, considered up to isomorphism. Let's call the $r$ -neighbourhood distribution of $G$ the probability distribution associated with these $r$ -neighbourhoods. In other words, given a graph $H$ of radius $r$ about some point $z,$ one is asking what the probability is that a random vertex $x$ of $G$ has an $r$ -neighbourhood that is isomorphic to $H$ with an isomorphism that sends $x$ to $z.$

Given any $c>0$ a routine compactness argument shows that there is a finite set of graphs $G_1,...,G_N$ of maximum degree $d$ such that the $r$ -neighbourhood distribution of any finite graph $G$ differs by at most $c$ (pointwise, say, though since there are only finitely many different possible $r$ -neighbourhoods you can take any measure of closeness you like) from the $r$ -neighbourhood distribution of at least one of the $G_i.$ The proof goes as follows: the set of possible $r$ -neighbourhood distributions lives in a compact set, by Tychonoff's theorem (since it belongs to $[0,1]$ raised to the power the set of all possible $r$ -neighbourhoods). Therefore, it is totally bounded. This means that it can be covered by a finite number of open balls of any given radius, which follows instantly from the definition of compactness. And that's all there is to it. But this proof gives no clue about the sizes of the graphs $G_i.$ And in fact, no quantitative bound is known, which is rather extraordinary. Even a huge estimate would be interesting.

General discussion

The last example above was not quite the same as the others: it was a direct application of compactness rather than of sequential compactness. However, it was similar enough in spirit to be contained in this article. (If, however, people can think of more compactness arguments that use total boundedness in this way, then it might be better to create a new sub-article of How to use compactness and transfer this example. The author learned of this open problem from Lászlo Lovász (which is discussed in Section 5.5 of this survey article).