How to use the continuum hypothesis

Quick description

Cantor's continuum hypothesis is perhaps the most famous example of a mathematical statement that turned out to be independent of the Zermelo-Fraenkel axioms. What is less well known is that the continuum hypothesis is a useful tool for solving certain sorts of problems in analysis. It is typically used in conjunction with transfinite induction. This article explains why it comes up.

Example 1

Given two infinite sets $X$ and $Y$ of positive integers, let us say that $X$ is sparser than $Y$ if the ratio of the size of $X\cap\{1,2,\dots,n\}$ to the size of $Y\cap\{1,2,\dots,n\}$ tends to zero as $n$ tends to infinity. The relation "is sparser than" is a partial order on the set of all infinite subsets of $\mathbb{N}$ . Is there a totally ordered collection of sets (under this ordering) such that every infinite subset of $\mathbb{N}$ contains one of them as a subset?

Let us see how to solve this question rather easily with the help of transfinite induction and the continuum hypothesis. These two techniques take us as far as the first line of the proof, after which a small amount of work is necessary. But here is the first line that gets us started. By the well-ordering principle we can find a one-to-one correspondence between the set of all infinite subsets of $\mathbb{N}$ and the first ordinal of cardinality $2^{\aleph_0}$ . By the continuum hypothesis, this is also the first uncountable ordinal. Therefore,

Let us well-order the infinite subsets of $\mathbb{N}$ in such a way that each one has countably many predecessors.

Once we have written this, we are well on the way to a just-do-it proof. Let us build a collection of sets $\mathcal{X}$ that is totally ordered (and in fact well-ordered) by the sparseness relation, and let us take care of each infinite subset of $\mathbb{N}$ in turn. For each countable ordinal $\alpha$ let us write $Z_\alpha$ for the infinite subset of $\mathbb{N}$ that we have associated with $\alpha$ . Now let $\alpha$ be a countable ordinal and suppose that for every $\beta<\alpha$ we have chosen a subset $X_\beta\subset Z_\beta$ , and suppose that we have done this in such a way that whenever $\beta<\gamma$ the set $X_\gamma$ is sparser than the set $X_\beta$ . We would now like to find a subset $X_\alpha$ of $Z_\alpha$ that is sparser than every single $X_\beta$ with $\beta<\alpha$ .

Because there are only countably many sets $X_\beta$ with $\beta<\alpha$ , it is straightforward to do this by means of a diagonal argument. Here is one way of carrying it out. For each $X_\beta$ , let $f_\beta$ be the function from $\mathbb{N}$ to $\mathbb{N}$ that takes $n$ to the $n$ th element of $X_\beta$ . Now enumerate these functions as $g_1,g_2,g_3,\dots$ , define a diagonal function $h$ by taking $h(n)$ to be the smallest element of $Z_\alpha$ that is larger than both $h(n-1)$ and $\max\{g_1(n^2),\dots,g_n(n^2)\}$ . Then let $X_\alpha=h(\mathbb{N})$ . Then if $X_\alpha\cap\{1,2,\dots,N\}=n$ , we know that $Y\cap\{1,2,\dots,n\}\geq n^2$ for each $Y$ that corresponds to one of the functions $g_1,\dots,g_n$ . This proves that $X_\alpha$ is sparser than every $X_\beta$ with $\beta<\alpha$ . Therefore, we can construct a collection of infinite sets $X_\alpha$ , one for each countable ordinal $\alpha$ , such that $X_\alpha$ is sparser than $X_\beta$ whenever $\beta<\alpha$ , and $X_\alpha\subset Z_\alpha$ for every $\alpha$ . This completes the proof.

General discussion

What are the features of this problem that made the continuum hypothesis an appropriate tool for solving it? To answer this, let us imagine what would have happened if we had not used the continuum hypothesis. Once we had decided to solve the problem in a just-do-it manner, we would have well-ordered the subsets of $\mathbb{N}$ , and tried, as above, to find a subset $X_\alpha$ for each $Z_\alpha$ , with these subsets getting sparser and sparser all the time. But if the number of predecessors of some $Z_\alpha$ had been uncountable, then we would not have been able to apply a diagonal argument.

So the feature of the problem that caused us to want to use the continuum hypothesis was that we were engaging in a transfinite construction, and in order to be able to continue we used some result that depended on countable sets being small. In this case, it was the result that for any countable collection of sets there is a set that is sparser than all of them, but it could have been another assertion about countability, such as that a countable union of sets of measure zero has measure zero.

There is sometimes an alternative to using the continuum hypothesis, which is to use Martin's axiom. Very roughly speaking, Martin's axiom tells you that the kinds of statements one likes to use about countable sets, such as that a countable union of sets of measure zero has measure zero, continue to be true if you replace "countable" by any cardinality that is strictly less than $2^{\aleph_0}$ . For example, it would give you a strengthening of the Baire category theorem that said that an intersection of fewer than $2^{\aleph_0}$ dense open sets is non-empty. If the continuum hypothesis is true, then this is of course not a strengthening, but it is consistent that Martin's axiom is true and the continuum hypothesis false.

Example 2

A special case of Fubini's theorem is the statement that if $f$ is a bounded measurable function defined on the unit square $[0,1]^2$ , then $\int_0^1\int_0^1 f(x,y)\,dy\,dx=\int_0^1\int_0^1 f(x,y)\,dx\,dx$ . That is, to integrate $f$ , one can first integrate over $y$ and then over $x$ , or first over $x$ and then over $y$ , and the result will be the same in both cases.

One might ask whether it is necessary for $f$ to be measurable. Perhaps it is enough if $f(x,y)$ is a measurable function of $y$ for fixed $x$ and vice versa, and that the functions $F(x)=\int_0^1f(x,y)\,dy$ and $G(y)=\int_0^1f(x,y)\,dx$ are measurable as well. That would be enough to ensure that both the double integrals $\int_0^1\int_0^1 f(x,y)\,dy\,dx$ and $\int_0^1\int_0^1 f(x,y)\,dx\,dx$ exist.

If you try to prove that these two double integrals are equal under the weaker hypothesis, you will find that you do not manage. So it then makes sense to look for a counterexample.

An initial difficulty one faces is that the properties that this counterexample is expected to have are somewhat vague: we want a function $f$ of two variables such that certain functions derived from it are measurable. But this measurability is a rather weak property, so it doesn't give us much of a steer towards any particular construction. This is just the kind of situation where it is a good idea to try to prove a stronger result. For example, instead of asking for $f$ to be bounded, we could ask for it to take just the values $0$ and $1$ . And instead of asking for the two double integrals to be different, we could ask for something much more extreme: that $\int_0^1f(x,y)\, dy=0$ for every $x$ and $\int_0^1f(x,y)\,dx=1$ for every $y$ . (Note that if we can do this, then we will easily be able to calculate the two double integrals. Note also that we won't have to think too hard about making $f$ non-measurable: that will just drop out from the fact that Fubini's theorem doesn't apply.)

If $f$ takes just the values $0$ and $1$ , then we could let $f(x,y)=1\}$ . Then the condition we are aiming for is that for every $x$ the set $(x,y)\in A\}$ has measure 0 and for every $y$ the set $(x,y)\in A\}$ has measure 1.

It is at this point that we might think of using transfinite induction. Let us well-order the interval $[0,1]$ and try to ensure the above conditions one number at a time.

Suppose then that each real number in $[0,1]$ has been indexed by some ordinal $\alpha$ and write $t_\alpha$ for the real number indexed by $\alpha$ . Suppose that for each $\beta<\alpha$ we have decided what the cross-sections $(t_\beta,y)\in A\}$ and $(x,t_\beta)\in A\}$ are going to be. And suppose that we have done this in such a way that $(t_\beta,y)\in A\}$ always has measure 0 and $(x,t_\beta)\in A\}$ always has measure 1. Now we would like to choose the cross-sections $(t_\alpha,y)\in A\}$ and $(x,t_\alpha)\in A\}$ with the same property.

We do not have complete freedom to do this, since we have already decided for each $\beta<\alpha$ whether $(t_\alpha,t_\beta)$ belongs to $A$ and whether $(t_\beta,t_\alpha)$ belongs to $A$ . And this could be a problem: what if the set $\beta<\alpha\}$ is not a set of measure zero?

But we can solve this problem by ensuring that initial segments of the well-ordering of $[0,1]$ do have measure zero. How do we do this? We use the continuum hypothesis. Once again, we make the first line of our proof the following:

Well-order the closed interval $[0,1]$ in such a way that each element has countably many predecessors.

If we do this, then we find that for each $\alpha$ we have made only countably many decisions about which points $(t_\alpha,y)$ and $(x,t_\alpha)$ belong or do not belong to $A$ . Therefore, we can decide about the rest of the points as follows: for every $y$ such that the value of $f(t_\alpha,y)$ is not yet assigned, set it to be $0$ , and for every $x$ such that the value of $f(x,t_\alpha)$ is not yet assigned, set it to be $1$ . This guarantees that $f(t_\alpha,y)=0$ for all but countably many $y$ , an $f(x,t_\alpha)=1$ for all but countably many $x$ . Therefore, for every $\alpha$ we have $\int_0^1f(t_\alpha,y)\,dy=0$ and $\int_0^1f(x,t_\alpha)\,dx=1$ .

General discussion

If we look at the function $f$ that we have just defined, then we see that it has the following simple description. If $x\leq y$ in the well-ordering we placed on $[0,1]$ then $f(x,y)=0$ and if $x>y$ in this well-ordering then $f(x,y)=1$ . We could have omitted most of the above discussion and simply defined $f$ in this way in the first place, pointing out that for each $x$ , $f(x,y)=0$ for all but at most countably many $y$ , and for each $y$ , $f(x,y)=0$ for all but at most countably many $x$ . The point of the discussion was to show how this clever example can arise as a result of a natural thought process.

However, it is also worth remembering the example, since it gives us a second quite common way of using the continuum hypothesis. To clarify this, let us summarize the two ways we have discussed.

Method 1. In a proof by transfinite induction on a set $X$ of cardinality $2^{\aleph_0}$ , start with the line, "Let $<$ be a well-ordering of $X$ such that every element has countably many predecessors," or equivalently, "Take a bijection between $X$ and the first uncountable ordinal $\omega_1$ , and let $x_\alpha$ denote the element of $X$ that corresponds to the (countable) ordinal $\alpha$ ."

Method 2. Let $<$ be a well-ordering of the reals, or some subset of the reals such as $[0,1]$ , such that each element has countably many predecessors, and use this well-ordering as the basis for some other construction.

It is also worth noting that the function $f$ defined in Example 2 is very similar in spirit to the double sequence $(a_{mn})$ , where $a_{mn}=0$ if $m<n$ and $1$ otherwise. This sequence has the property that $\lim_na_{mn}=0$ for every $m$ and $\lim_ma_{mn}=1$ for every $n$ . It is discussed as Example 5 in the article on just-do-it proofs.