Think about the converse

Quick description

When one is trying to find a proof of a mathematical statement, it can be surprisingly helpful to think about the converse of that statement as well. The reason is that an understanding of the converse can give important information about what a proof of the original statement would have to be like, thereby speeding up the search for it.

This article is incomplete: it needs several more examples and some general discussion.

Example 1

This may seem a slightly artificial example, but it came up recently in a research problem, and thinking about the converse was an essential step in finding a solution.

Suppose that you have a norm $\|.\|$ on $\mathbb{R}^n$ and you would like to prove that $\|w\|\leq 1$ for every $w$ belonging to some subset $W\subset\mathbb{R}^n$. Suppose also that you want to do this by estimating the norms of at most $N$ points. (This situation occurred because the norm in question was randomly defined, and it was not possible to ask for too many events to occur simultaneously – at least if one wanted to avoid understanding the very subtle dependencies between those events.) The obvious method is to choose some subset $Z$ of $W$ consisting of at most $N$ points, and to run an argument with the following general structure:

every element of $W$ can be approximated (in a suitable sense) by an element of $Z$;

if two elements of $\mathbb{R}^n$ are close (in that same sense) then their norms are close;

the norm of every point in $Z$ is smaller than $1-\epsilon$.
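
To see how the three steps combine, here is a sketch under one simplifying assumption that the text itself does not make: closeness is measured in the norm $\|.\|$ itself, so that every $w\in W$ has some $z\in Z$ with $\|w-z\|\leq\epsilon$. The triangle inequality then gives the desired bound:

```latex
\[
  \|w\| \;\leq\; \|z\| + \|w-z\| \;\leq\; (1-\epsilon) + \epsilon \;=\; 1 .
\]
```

With a different notion of approximation, the second step would have to supply whatever replaces the estimate $\bigl|\,\|w\|-\|z\|\,\bigr|\leq\|w-z\|$ used here.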

Now let us think about whether this scheme of proof is necessary. That is, suppose that $Z$ has the property that whenever every point in $Z$ has small norm, so does every point in $W$. Does it follow that every point in $W$ can be approximated, in some suitable sense, by a point in $Z$?

The answer is an emphatic no: one soon realizes that if the norm of every point in $Z$ is at most 1, say, then by the triangle inequality the norm of every point in the convex hull of $Z$ is also at most 1. Moreover, since $\|-z\|=\|z\|$, the same is true of the convex hull of $Z\cup(-Z)$. Armed with that observation, we can go back to the original problem with a potentially much more flexible method of proof:

every element of $W$ belongs to the convex hull of $Z\cup(-Z)$;

every element of $Z$ has norm at most 1.
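
The observation behind this scheme can be checked numerically. The following is a minimal sketch, not part of the original argument: the norm `norm`, the points in `Z`, and the helper `random_point_in_symmetric_hull` are all hypothetical choices made for illustration. It samples random points of the convex hull of $Z\cup(-Z)$ in $\mathbb{R}^2$ and compares their norms with the largest norm found on $Z$.

```python
import random


def norm(x):
    # A hypothetical norm on R^2, chosen only for illustration:
    # the maximum of the norm |x1| + |x2|/2 and the seminorm |x2|.
    return max(abs(x[0]) + 0.5 * abs(x[1]), abs(x[1]))


# A small set Z on which the norm is at most 1 (again an arbitrary choice).
Z = [(0.9, 0.1), (-0.3, 0.8), (0.5, -0.7)]


def random_point_in_symmetric_hull(points, rng):
    # Draw nonnegative weights summing to 1 and attach a random sign to each,
    # which gives a point of the convex hull of Z together with -Z.
    raw = [rng.random() for _ in points]
    total = sum(raw)
    coeffs = [rng.choice((-1.0, 1.0)) * r / total for r in raw]
    return tuple(sum(c * p[k] for c, p in zip(coeffs, points)) for k in range(2))


rng = random.Random(0)  # fixed seed so the run is reproducible
bound_on_Z = max(norm(z) for z in Z)
worst_in_hull = max(
    norm(random_point_in_symmetric_hull(Z, rng)) for _ in range(10000)
)
print(bound_on_Z, worst_in_hull)
```

Sampling proves nothing, of course; the point is only that no sampled combination exceeds the bound on $Z$, as the triangle inequality and the symmetry $\|-z\|=\|z\|$ predict.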

However, if we are sensible, we should learn our lesson and again investigate the converse. Suppose that $W$ does not lie inside the convex hull of $Z\cup(-Z)$. Is it still possible that the norm restricted to $Z$ could control the norm restricted to $W$?

The answer turns out to be no. If some point $w\in W$ lies outside the convex hull of $Z\cup(-Z)$, then the Hahn-Banach separation theorem implies that there is a linear functional $\phi$ such that $\phi(w)>1$, but $|\phi(z)|\leq 1$ for every $z\in Z$. Thus, the seminorm $x\mapsto|\phi(x)|$ is at most 1 everywhere on $Z$ but greater than 1 somewhere on $W$. And provided $W$ is bounded we can easily convert that into a norm with the same property, for example by taking the maximum of $|\phi(x)|$ and a sufficiently small multiple of the Euclidean norm of $x$.
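
A concrete instance may make the separation argument easier to picture. The sketch below is an illustration under assumptions of our own: the set `Z`, the point `w`, the functional `phi` (found by inspection for this example, not by a general algorithm) and the construction `new_norm` are all hypothetical. Here $Z$ consists of the two standard basis vectors of $\mathbb{R}^2$, so the convex hull of $Z\cup(-Z)$ is the unit ball of the $\ell_1$ norm, and $w=(0.8,0.8)$ lies outside it.

```python
import math

Z = [(1.0, 0.0), (0.0, 1.0)]  # conv(Z and -Z) is the l1 unit ball in R^2
w = (0.8, 0.8)                # |0.8| + |0.8| = 1.6 > 1, so w lies outside it


def phi(x):
    # A separating linear functional for this particular example.
    return x[0] + x[1]


def euclid(x):
    return math.hypot(x[0], x[1])


# Scale the Euclidean norm so that it is at most 1 on every point of Z.
delta = 1.0 / max(euclid(z) for z in Z)


def new_norm(x):
    # The maximum of a seminorm and a norm is a norm.
    return max(abs(phi(x)), delta * euclid(x))


print([new_norm(z) for z in Z], new_norm(w))
```

The resulting norm is at most 1 on $Z$ but larger than 1 at $w$, so no argument that looks only at the values of the norm on $Z$ could bound it on $W$.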

What we learn from this is the following. Suppose we want to find a set $Z$ and deduce that every vector in $W$ has small norm from the fact that every vector in $Z$ does. Then, unless we know and can use further information about the norm $\|.\|$ (which in our problem we could not), we are forced to use the second method above. Thus, we can stop wasting time searching for alternative approaches.