How to use ultrafilters

Quick description

An ultrafilter on a set $X$ is a collection $\mathcal{U}$ of subsets of $X$ with the following properties: (i) $\emptyset\notin\mathcal{U}$ and $X\in\mathcal{U}$ ; (ii) $\mathcal{U}$ is closed under finite intersections; (iii) if $A\in\mathcal{U}$ and $A\subset B$ then $B\in\mathcal{U}$ ; (iv) for every $A\subset X$ , either $A\in\mathcal{U}$ or $X\setminus A\in\mathcal{U}$ . A trivial example of an ultrafilter is the collection of all sets containing some fixed element $x$ of $X.$ Such ultrafilters are called principal. It is not trivial that there are any non-principal ultrafilters, but this can be proved using Zorn's lemma. Here we explain various ways of using non-principal ultrafilters in analysis and infinitary combinatorics.

This article is incomplete. This article needs to include an application of minimal idempotents.

Prerequisites

Some familiarity with basic Ramsey theory; knowledge of the results from a first analysis course.

General discussion

A filter on a set $X$ is a collection of sets $\mathcal{F}$ with the following three properties: (i) $\emptyset\notin\mathcal{F}$ and $X \in \mathcal{F}$ ; (ii) $\mathcal{F}$ is closed under finite intersections; (iii) if $A\in\mathcal{F}$ and $A\subset B$ then $B\in\mathcal{F}$ . For example, if $X=\mathbb{N}$ then the collection of all sets $Y$ such that $X\setminus Y$ is finite forms a filter (known as the cofinite filter). A filter is called an ultrafilter if in addition it has the property (iv) that for any partition of $X$ into two sets $A$ and $B$ , either $A$ belongs to $X$ or $B$ belongs to $X$ .

Notice that an ultrafilter is a maximal filter, since properties (i) and (ii) imply that $A$ and $B$ cannot both belong to $X$ if they form a partition of $X$ . The converse is true as well: every maximal filter is an ultrafilter. To see this, suppose that $\mathcal{F}$ is a maximal filter, that $A$ and $B$ partition $X$ into two sets, and that neither $A$ nor $B$ belongs to $\mathcal{F}$ . We shall obtain a contradiction by showing that $\mathcal{F}$ is not maximal.

Let us attempt to add $A$ to $\mathcal{F}$ . If we do so, then we must also add $A\cap Y$ for every set $Y\in\mathcal{F}$ , and we must add all supersets of these sets. If all the sets $A\cap Y$ are non-empty, then the result will be a filter, since if $A\cap Y\subset U$ and $A\cap Z\subset V$ , then $A\cap (Y\cap Z)\subset U\cap V$ . From this we see that we can extend $\mathcal{F}$ unless there is some set $Y\in\mathcal{F}$ with $A\cap Y=\emptyset$ . Similarly, we can extend $\mathcal{F}$ unless there is some set $W\in\mathcal{F}$ with $B\cap W=\emptyset$ . But then, since $A$ and $B$ are a partition of $X$ , $Y\cap W=\emptyset$ , which is a contradiction.

A standard application of Zorn's lemma proves that every filter is contained in a maximal filter. Since no extension of the cofinite filter can contain any finite sets, it follows that non-principal ultrafilters exist.

The following facts about ultrafilters, which follow easily from the basic definition, will be used in this article. Let $\mathcal{U}$ be an ultrafilter on a set $X$ . If $X$ is partitioned into finitely many sets $A_1,\dots,A_n$ , then precisely one $A_i$ belongs to $\mathcal{U}$ . If $F$ and $G$ do not belong to $\mathcal{U}$ then neither does $F\cup G$ . If any finite set belongs to $\mathcal{U}$ then $\mathcal{U}$ is a principal ultrafilter.

Example 1: generalized limits

We can think of the process of taking limits of sequences as a linear functional defined on the convergent sequences. That is, given a convergent sequence $a=(a_1,a_2,\dots)$ we define $L(a)$ to be its limit, and we have the two properties $L(a+b)=L(a)+L(b)$ and $L(ta)=tL(a)$ for any two convergent sequences $a$ and $b$ and any scalar $t$ .

Can we generalize $L$ by finding a linear functional $\phi$ that is defined on all bounded sequences and not just all convergent ones? In order for it to count as a generalization, we would like $\phi$ to be linear, and we would like $\phi(a)$ to equal $L(a)$ whenever $a$ is a convergent sequence.

If $\mathcal{U}$ is a non-principal ultrafilter, and $(a_1,a_2,\dots)$ is a sequence that takes values in $[-1,1]$ , then we can define a limit along $\mathcal{U}$ as follows. Let $\mathcal{J}$ be the collection of all subintervals $J$ of $[-1,1]$ such that $a_n\in J\}$ belongs to $\mathcal{U}$ . Then the ultrafilter properties of $\mathcal{U}$ imply that $\mathcal{J}$ has all the ultrafilter properties but restricted to intervals. That is, $\mathcal{J}$ does not contain the empty set, is closed under finite intersections, has the property that if $J\in\mathcal{J}$ and $J'$ is an interval that contains $J$ then $J'\in\mathcal{J}$ , and if $[-1,1]$ is partitioned into finitely many intervals then at least one of these intervals belongs to $\mathcal{J}$ .

From this it follows that $\mathcal{J}$ is something like a "principal interval-ultrafilter". More precisely, it contains all open intervals that contain some particular point $a$ . To see this, for each $n\in\mathbb{N}$ partition $[-1,1]$ into finitely many subintervals of length at most $1/n$ . Then one of these subintervals belongs to $\mathcal{J}$ . So for every $n$ we have an interval $J_n$ of length at most $1/n$ that belongs to $\mathcal{J}$ . Now let $I_n=J_1\cap\dots\cap J_n$ . Since $\mathcal{J}$ is closed under intersection, $I_n$ belongs to $\mathcal{J}$ . Let $\{a\}$ be the intersection of the closures of the $I_n$ (which are non-empty and nested). If $U$ is any open interval containing $a$ , then $U$ contains some $I_n$ , so belongs to $\mathcal{U}$ .

Thus, we have found a number $a$ with the following property: for every $\epsilon>0$ , the set $|a_n-a|<\epsilon\}$ belongs to $\mathcal{U}$ . Moreover, it is easy to see that this $a$ is unique. We write it as $\lim_{\mathcal{U}}a_n$ . It is easy to check that $\lim_{\mathcal{U}}$ is linear. For example, if $a=\lim_{\mathcal{U}}a_n$ and $b=\lim_{\mathcal{U}}b_n$ , then $|a_n-a|<\epsilon/2\}$ and $|b_n-b|<\epsilon/2\}$ both belong to $\mathcal{U}$ , and from this and the intersection property of $\mathcal{U}$ it follows that $|(a_n+b_n)-(a+b)|<\epsilon\}$ belongs to $\mathcal{U}$ . Since this is true for every $\epsilon>0$ , $a+b=\lim_{\mathcal{U}}(a_n+b_n)$ . A similar but easier argument deals with scalar multiplication.

To see even more clearly how this ties in with the usual notion of a limit, note that $a_n$ converges to $a$ if and only if for every $\epsilon>0$ the set $|a_n-a|<\epsilon\}$ belongs to the cofinite filter.

General discussion

One can rewrite sentences involving " $\forall$ " or " $\exists$ " in terms of set systems. Let $X$ be a set and let $\mathcal{A}$ be the set system consisting of just the set $X$ . That is, $\mathcal{A}=\{X\}$ . Let $\mathcal{E}$ be the set system consisting of all non-empty subsets of $X$ . Then the statement $\forall x\in X\ P(x)$ can be rewritten $P(x)\}\in\mathcal{A}$ and the statement $\exists x\in X\ P(x)$ can be rewritten $P(x)\}\in\mathcal{E}$ .

This is of course a silly thing to do, but that is precisely the point I want to make here: it is often better to think of a set system as a quantifier. In particular, if $\mathcal{U}$ is an ultrafilter then one often finds oneself writing sentences of the form $P(x)\}\in\mathcal{U}$ , as we have already seen. But it can be much easier to deal with these sentences if one instead writes $\mathcal{U} x\in X\ P(x)$ . One can read this as "For $\mathcal{U}$ -almost every $x\in X$ , $P(X)$ ." As we shall see, this simple notational change can make some statements much easier to understand.

Before we continue, let us translate the ultrafilter axioms into quantifier language.

If $\mathcal{U} x\in X\ P(x)$ then $\exists x\in X\ P(x)$ .
If $\mathcal{U} x\in X\ P(x)$ and $\mathcal{U} x\in X\ Q(x)$ then $\mathcal{U} x\in X\ P(x)\wedge Q(x)$ .
If $P(x)\Rightarrow Q(x)$ then $\mathcal{U} x\in X\ P(x)$ implies $\mathcal{U} x\in X\ Q(x)$ .
If $\mathcal{U} x\in X\ P(x)\vee Q(x)$ then either $\mathcal{U} x\in X\ P(x)$ or $\mathcal{U} x\in X\ Q(x)$ .

Actually, the fourth property is not quite a direct translation of the axiom we gave, but of the equivalent property that if $A\cup B$ belongs to $\mathcal{U}$ then either $A$ belongs to $\mathcal{U}$ or $B$ belongs to $\mathcal{U}$ . The equivalence is an easy exercise.

To the above properties we can add the following consequences, valid for any property $P$ .

If $\forall x\in X\ P(x)$ then $\mathcal{U} x\in X\ P(x)$ .
Exactly one of $\mathcal{U} x\in X\ P(x)$ and $\mathcal{U} x\in X\ \neg P(x)$ is true.

Example 2: the infinite Ramsey theorem

The infinite Ramsey theorem that we shall prove here is the statement that if all pairs of natural numbers are coloured either red or blue, then there is an infinite set $X$ such that all pairs of elements of $X$ have the same colour.

The usual proof goes like this. Build a sequence of integers $n_1<n_2<n_3<\dots$ and at the same time a sequence of infinite sets $X_1\supset X_2\supset X_3\supset$ with the following property: for every $k$ , every pair $\{n_k,m\}$ such that $m\in X_k$ has the same colour. To see that this is possible, suppose we have constructed the sequences up to $k$ . Let $n_{k+1}$ be the smallest element of $X_k$ (which we have defined in such a way that every element of $X_k$ is bigger than $n_k$ ). Then either there are infinitely many elements $m$ of $X_k$ such that $\{n_{k+1} ,m\}$ is red or there are infinitely many elements $m$ of $X_k$ such that $\{n_{k+1},m\}$ is blue. Therefore, we can find an infinite subset $X_{k+1}$ of $X_k$ such that for every $m$ in $X_{k+1}$ the pair $\{n_{k+1},m\}$ has the same colour. Let us call $n_k$ red if all the pairs $\{n_k,m\}$ with $m\in X_k$ are red, and blue if they are all blue. Then we can find infinitely many $n_k$ all of the same colour. Since $n_r\in X_k$ for every $r>k$ , it follows that all pairs $\{n_k,n_r\}$ have the same colour. The theorem is proved.

Note that in the above proof we often used sentences of the form "for infinitely many $x$ , $P(x)$ ". The negation of such a statement is "for finitely many $x$ , $P(x)$ ", which is equivalent to "the set of $x$ such that $\neg P(x)$ belongs to the cofinite filter". We shall now discuss a proof where the cofinite filter is replaced by an ultrafilter. In the above proof, we regarded a set as large if it was infinite. Now we shall regard a set as large if it belongs to some given non-principal ultrafilter $\mathcal{U}$ . This turns out to make the proof neater, because now it is not possible for a set and its complement both to be large. Here are the details. We shall make constant use of the quantifier properties just listed.

Let us write $R(n,m)$ for the statement that the pair $\{n,m\}$ is red, and $B(n,m)$ for the statement that the pair $\{n,m\}$ is blue. Then $\forall n\ \forall m\ R(n,m)\vee B(n,m)$ , which implies that $\mathcal{U} n\ \mathcal{U} m\ R(n,m)\vee B(n,m)$ , which implies that $\mathcal{U} n (\mathcal{U} m\ R(n,m))\vee(\mathcal{U} m\ B(n,m))$ , which implies that either $\mathcal{U} n\ \mathcal{U} m\ R(n,m)$ or $\mathcal{U} n\ \mathcal{U} m\ B(n,m)$ . Without loss of generality it is the first that holds.

Now we construct a sequence $n_1<n_2<\dots$ inductively, ensuring at each stage that $R(n_j,n_k)$ for every $j<k$ and that $\mathcal{U} m\ \forall j\leq k\ R(x_j,m)$ . If we have got as far as the $k$ th stage, then we know that (i) $\mathcal{U} n\ \forall j\leq k\ R(n_j,n)$ ; (ii) $\mathcal{U} n\ \mathcal{U} m\ R(n,m)$ . Therefore we know that for $\mathcal{U}$ -almost every $n$ we have both that $R(n_j,n)$ for every $j\leq k$ and that $\mathcal{U} m\ R(n,m)$ . From this it follows that there exists $n$ such that $R(n_j,n)$ for every $j\leq k$ and that $\mathcal{U} m\ R(n,m)$ . Let $n_k=n$ . Since we also know that $\mathcal{U} m\ \forall j\leq k\ R(n_j,m)$ , we find that $\mathcal{U} m\ \forall j\leq k+1\ R(n_j,m)$ , and the induction continues. The resulting sequence satisfies $R(n_j,n_k)$ for every $j<k$ .

General discussion

The main difference between the proof using ultrafilters and the normal proof is that the proof using ultrafilters decides in advance which colour to go for, and thereby avoids the final use of the pigeonhole principle. This on its own would not be enough of a motivation for coming to grips with ultrafilters, but as we shall see there are other results that have beautifully simple proofs that use ultrafilters.

Central to many of these applications is a notion of ultrafilter addition. If $\mathcal{U}$ and $\mathcal{V}$ are ultrafilters defined on $\mathbb{N}$ , then we define $\mathcal{U}+\mathcal{V}$ to be $n+m\in X\}\in\mathcal{V}\}\in\mathcal{U}\}$ . Do you find that hard to take in? If so, then you will appreciate the value of the quantifier notation: a set $X$ belongs to $\mathcal{U}+\mathcal{V}$ if and only if $\mathcal{U}n\ \mathcal{V} m\ n+m\in X$ .

A basic fact about ultrafilter addition is that it is associative. The proof without quantifiers is horrendous to look at, but basically trivial. The quantifier notation makes the triviality clear: a set $X$ belongs to $(\mathcal{U}+\mathcal{V})+\mathcal{W}$ if and only if $(\mathcal{U}+\mathcal{V})n\ \mathcal{W} m\ n+m\in X$ , which is true if and only if $\mathcal{U} r\ \mathcal{V} s\ \mathcal{W} m\ r+s+m\in X$ , which is true if and only if $\mathcal{U} r\ (\mathcal{V}+\mathcal{W}) t\ r+t\in X$ , which is true if and only if $X\in\mathcal{U}+(\mathcal{V}+\mathcal{W})$ .

And now, a non-trivial but very useful fact is that there exist non-principal idempotent ultrafilters: that is, ultrafilters $\mathcal{U}$ such that $\mathcal{U}+\mathcal{U}=\mathcal{U}$ . (There is a natural topology on the space of all ultrafilters on $\mathbb{N}$ , which makes that space compact. It can be shown that every compact semigroup such that addition is left-continuous contains an idempotent. The proof is a beautiful application of Zorn's lemma: one can prove that a minimal closed non-empty subsemigroup must consist of a single element, which is an idempotent ultrafilter.)

Note that the property of being idempotent has the following quantifier reformulation.

$\mathcal{U} n\ P(n)$ if and only if $\mathcal{U} n\ \mathcal{U} m\ P(n+m)$ .

Indeed, note also that the definition of ultrafilter addition can be reformulated as follows.

$(\mathcal{U}+\mathcal{V})n\ P(n)$ means that $\mathcal{U} s\ \mathcal{V} t\ P(s+t)$ .

Example 3: Hindman's theorem

Hindman's theorem is a celebrated result in Ramsey theory, which asserts the following. Suppose that $\mathbb{N}$ is coloured with finitely many colours. Then it is possible to find $n_1<n_2<n_3<\dots$ such that all numbers of the form $\sum_{i\in A}n_i$ , where $A$ is a non-empty finite set, have the same colour.

To prove this using an idempotent ultrafilter $\mathcal{U}$ , we proceed in a fairly similar way to the way we proved Ramsey's theorem. In particular, we once again decide in advance what colour to go for. Since every integer has one of a finite collection of colours, there must be some colour $C$ such that $\mathcal{U}$ -almost every $n$ has colour $C$ . Let us write $C(n$ for the statement that $n$ has colour $C$ . So we know that $\mathcal{U} n\ C(n)$ . Now we shall try to build a sequence $n_1<n_2<\dots$ inductively in such a way that all finite sums have the colour $C$ .

What should we hope for at stage $k$ ? By then we will have chosen $n_1<n_2<\dots$ . Clearly we will want $\sum_{i\in A}n_i$ to have colour $C$ for every $A\subset\{1,2,\dots,k\}$ . But we need more than this, because we want to have a chance of continuing the induction. For that we want there to be plenty of $n$ such that $n+\sum_{i\in A}n_i$ has colour $C$ for every $A\subset\{1,2,\dots,k\}$ . But what does "plenty of $n$ " mean? In an ultrafilter proof it can mean only one thing: plenty of $n$ have property $P$ means that $\mathcal{U} n\ P(n)$ .

Without any real thought we have arrived at the correct inductive hypothesis: that $\sum_{i\in A}n_i$ has colour $C$ for every $A\subset\{1,2,\dots,k\}$ , and that for $\mathcal{U}$ -almost every $n$ , $n+\sum_{i\in A}n_i$ has colour $C$ for every $A\subset\{1,2,\dots,k\}$ . Let us introduce a condensed notation for this: we shall write $\langle n_1,\dots,n_k\rangle$ for the set of all sums $\sum_{i\in A}n_i$ with $A\subset\{1,2,\dots,k\}.$ Then our inductive hypothesis is $\mathcal{U}n\ \langle n_1,\dots,n_k,n\rangle\subset C$ .

If this is the case, then $\mathcal{U} n\ \mathcal{U} m\ \langle n_1,\dots,n_k,n+m\rangle\subset C$ , and of course it is also the case that $\mathcal{U} m\ \langle n_1,\dots,n_k,m\rangle\subset C$ . But then it follows from the intersection property that $\mathcal{U} n\ \mathcal{U} m\ \langle n_1,\dots,n_k,n\rangle\cup\langle n_1,\dots,n_k,n+m\rangle\cup\langle n_1,\dots,n_k,m\rangle\subset C$ . But this union is equal to $\langle n_1\dots,n_k,n,m\rangle$ , so we have shown that $\mathcal{U} n\ \mathcal{U} m\ \langle n_1,\dots,n_k,n,m\rangle\subset C$ . From this it follows that there exists $n_{k+1}$ such that $\mathcal{U} m\ \langle n_1,\dots,n_k,n_{k+1},m\rangle\subset C$ . This has got us to the next stage of the induction, so the proof is finished.

Example 4: polynomial recurrence of rotations

Here is a cute application of the existence of idempotent ultrafilters due to Vitaly Bergelson. Let $\alpha$ be a real number. Our aim is to prove that for every $\epsilon>0$ there exists a positive integer $n$ such that $\alpha n^2$ is within $\epsilon$ of an integer. To do that, let us first solve an easier problem. We take the compact set $\R/\Z$ and map $\N$ to it by sending $n$ to $b_n=\beta n+\Z$ , which for convenience we shall denote by $\beta n$ . Then we calculate $l=\lim_{\mathcal{U}}b_n$ . Since $\mathcal{U}$ is idempotent, this is equal to $\lim_{\mathcal{U}+\mathcal{U}}b_n$ , which can easily be checked to be the same as $\lim_{\mathcal{U}}\lim_{\mathcal{U}}b_{m+n}=\lim_{\mathcal{U}}\lim_{\mathcal{U}}(\beta m+\beta n)=l+l$ . So $l=l+l$ , from which it follows that $l=0$ .

Now let $a_n=\alpha n^2+\Z$ and let $t=\lim_{\mathcal{U}}a_n$ . Since $\mathcal{U}$ is idempotent, this is equal to $\lim_{\mathcal{U}+\mathcal{U}}a_n$ , which equals $\lim_{\mathcal{U}}\lim_{\mathcal{U}}a_{m+n}=\lim_{\mathcal{U}}\lim_{\mathcal{U}}(\alpha(m^2+2mn+n^2)$ , where the first limit is as $m$ varies and the second is as $n$ varies. Taking the inner limit first (as we must), and using the previous result with $\beta=\alpha m$ , we find that this equals $\lim_{\mathcal{U}}(\alpha m^2+0+t)$ , which in turn equals $t+t$ . So $t=t+t$ and therefore $t=0$ .

As one might expect, this proof can be generalized. Indeed, it can be used to prove the same result for $\alpha p(n)$ , where $p$ is any polynomial that vanishes at $0$ .

General discussion

Clearly the definition of ultrafilter addition works for any $X$ with a binary operation, and it will be associative if the binary operation on $X$ is associative. Let $\mathcal{U}$ and $\mathcal{V}$ be idempotent ultrafilters with respect to some associative binary operation on a set $X$ . We say that $\mathcal{U}<\mathcal{V}$ if $\mathcal{U}+\mathcal{V}=\mathcal{V}+\mathcal{U}=\mathcal{U}$ . It is easy to check that this is a partial order, and it can be shown that for every idempotent $\mathcal{V}$ there is a minimal idempotent $\mathcal{U}$ such that $\mathcal{U}\leq\mathcal{V}$ . Minimal idempotents turn out to be useful for proving more complicated results with a similar flavour to Hindman's theorem.