
Partial summation

Quick description

Suppose that you are trying to bound the size of a sum of a product of two functions, one reasonably smooth and the other reasonably oscillating, and suppose that you believe that the oscillations give rise to cancellation that causes the sum to be small. Then a discrete analogue of the "use integration by parts to exploit cancellation" trick may well give rise to an efficient way of proving that your belief is correct.


Prerequisites

Basic real analysis. Complex numbers.

Note: This article is incomplete. It would be very nice to have more examples, both for evaluating sums and for estimating them.

Example 1: Abel summation

Let z be a complex number, of modulus 1 but not equal to 1, and let a_0,a_1,a_2,\dots be a decreasing sequence of positive real numbers tending to zero. Then the sum \sum_{n=0}^\infty a_nz^n converges.

How might we prove this? Why do we even believe it?

One can give a geometrical answer to the second question. If you plot the points a_0, a_0+a_1z, a_0+a_1z+a_2z^2, a_0+a_1z+a_2z^2+a_3z^3, and so on, drawing straight line segments between them, then you obtain a piecewise linear curve that seems to spiral inwards to a point. (The closer z is to 1, the bigger this "spiral" tends to be.)

How about a rigorous proof? Well, the observation on which the proof is based is that the set of numbers of the form 1+z+z^2+\dots+z^n is bounded. Indeed, the formula for summing a geometric progression tells us that

 1+z+z^2+\dots+z^n=\frac {1-z^{n+1}}{1-z},

which has modulus at most 2/|1-z|.

Since we know how to work out sums of the above form, it makes sense to try to use this information to investigate the sum \sum_{n=0}^\infty a_nz^n, which we do by breaking it up into sums of the form we like. We can set aside the convergence issues for now and just look at the sum \sum_{n=0}^N a_nz^n. In order to split this up into polynomials with constant coefficients, we begin by noting that the sequence (a_0,a_1,a_2,\dots,a_N) can be split up as

 (a_0-a_1)(1,0,0,\dots,0)+(a_1-a_2)(1,1,0,\dots,0)+\dots+(a_{N-1}-a_N)(1,1,\dots,1,0)+a_N(1,1,\dots,1).
The best motivation for this splitting comes from drawing a picture of the "graph" of the sequence (a_0,a_1,\dots,a_N), which we are chopping up horizontally.

Note: This article is in need of attention. If anyone knows how to produce a nice picture for this, it will really help.

This suggests the following way of rewriting the sum:

 \sum_{n=0}^Na_nz^n=\sum_{m=0}^{N-1}(a_m-a_{m+1})(1+z+\dots+z^m)+a_N(1+z+\dots+z^N).
To check that this rewriting is correct, we can use the old trick of interchanging the order of summation:

 \sum_{m=0}^{N-1}(a_m-a_{m+1})\sum_{n=0}^mz^n=\sum_{n=0}^{N-1}z^n\sum_{m=n}^{N-1}(a_m-a_{m+1})=\sum_{n=0}^{N-1}(a_n-a_N)z^n.
Adding \sum_{n=0}^Na_Nz^n to this gives us \sum_{n=0}^Na_nz^n, as we wanted.

Once we have this decomposition of the sum, the estimate above for sums of geometric progressions involving z tells us that \sum_{n=0}^m\lambda z^n has modulus at most 2\lambda/|1-z| for any positive \lambda. Therefore, we see instantly that the rewritten sum has modulus at most

\frac 2{|1-z|}((a_0-a_1)+(a_1-a_2)+\dots+(a_{N-1}-a_N)+a_N)=\frac {2a_0}{|1-z|}.

This calculation has already illustrated the main idea of partial summation. However, let us finish the proof. The same argument shows that the partial sum \sum_{n=N}^Ma_nz^n has modulus at most 2a_N/|1-z|. Since a_N is assumed to tend to zero, this proves that the sequence of partial sums S_N=\sum_{n=0}^Na_nz^n is Cauchy, and therefore it converges, which is what we wanted to prove.
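The Cauchy estimate above can be illustrated numerically. In the following sketch the illustrative choice a_n = 1/(n+1) (a decreasing positive sequence tending to zero) is used, together with an arbitrary z of modulus 1, and the tail \sum_{n=N}^M a_nz^n is checked against the bound 2a_N/|1-z|:

```python
import cmath

# Illustrate Example 1 with the sample sequence a_n = 1/(n+1) and a
# sample point z of modulus 1 (both choices are arbitrary).
z = cmath.exp(2j * cmath.pi / 7)
a = lambda n: 1.0 / (n + 1)

# Partial sums S_N = sum_{n=0}^N a_n z^n.
S = []
total = 0
for n in range(2000):
    total += a(n) * z ** n
    S.append(total)

# The proof gives |sum_{n=N}^M a_n z^n| <= 2 a_N / |1 - z|.
N, M = 100, 1999
tail = abs(S[M] - S[N - 1])
assert tail <= 2 * a(N) / abs(1 - z)
```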

General discussion

Analogy with integration by parts. At the beginning of this article, we claimed that partial summation was the discrete analogue of integration by parts. What is the connection?

To answer this question, let us write down a general principle (together with its proof) that encapsulates what we have done. (The notational conventions will be somewhat different, however.)

\sum_{n=1}^Na_nb_n=\sum_{n=1}^N\left(\sum_{m=n}^{N-1}(a_m-a_{m+1})+a_N\right)b_n =a_N\sum_{n=1}^Nb_n+\sum_{m=1}^{N-1}(a_m-a_{m+1})\sum_{n=1}^mb_n.

Now let us write B_m for the partial sum \sum_{n=1}^mb_n. Then this identity becomes

 \sum_{n=1}^Na_nb_n=a_NB_N+\sum_{m=1}^{N-1}(a_m-a_{m+1})B_m,

which is closely analogous to the formula

 \int_0^Na(x)b(x)\,dx=a(N)B(N)-\int_0^Na'(x)B(x)\,dx, where B(x)=\int_0^xb(t)\,dt,
We are simply replacing the derivative by the difference operator and integration by summation. (We are also restricting attention to functions B such that B_0=0, which is why the first terms on the right-hand side seem to be slightly different, but in principle one could generalize the discrete formula.)
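The discrete identity above is a finite algebraic statement, so it can be verified directly on random data. Here is a minimal sketch doing exactly that:

```python
import random

# Numerical check of the summation-by-parts identity
#   sum_{n=1}^N a_n b_n = a_N B_N + sum_{m=1}^{N-1} (a_m - a_{m+1}) B_m,
# where B_m = b_1 + ... + b_m, for random real sequences (illustrative).
random.seed(0)
N = 50
a = [random.random() for _ in range(N + 1)]  # entries a[1..N] are used
b = [random.random() for _ in range(N + 1)]

B = [0.0] * (N + 1)
for m in range(1, N + 1):
    B[m] = B[m - 1] + b[m]

lhs = sum(a[n] * b[n] for n in range(1, N + 1))
rhs = a[N] * B[N] + sum((a[m] - a[m + 1]) * B[m] for m in range(1, N))
assert abs(lhs - rhs) < 1e-9
```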

When is partial summation useful? One answer to this is rather like the answer in the case of integration by parts. If we have a sum of the form \sum_{n=1}^Na_nb_n that we find hard to evaluate or estimate, we may be able to simplify it by replacing the sequence (a_n) by a difference sequence and the sequence (b_n) by a sequence of partial sums.

If that seems a bit tautologous, then here is a slightly more restricted answer. One circumstance where partial summation is useful is when one can get good upper bounds for the partial sums b_1+\dots+b_m and also good upper bounds for the sum |a_1-a_2|+\dots+|a_{N-1}-a_N|+|a_N|. This often occurs if the b_n oscillate a lot and the a_n vary fairly smoothly.

Example 2: a simple statement that implies the Riemann hypothesis

A recent paper of Maier and Montgomery refers to a well-known result of Littlewood, that the Riemann hypothesis is equivalent to the statement that the partial sums of the Möbius function do not grow much faster than \sqrt{n}. They include the following sentence, having made the assumption that M(x)=\sum_{1\leq n\leq x}\mu(n)\ll x^{\frac 12+\epsilon}. (This means that for every \epsilon>0 there is a constant C_\epsilon such that we have the bound |M(x)|\leq C_\epsilon x^{\frac 12+\epsilon}.)

"The converse is trivial, since this estimate, by partial summation, implies that the series \sum_{n=1}^\infty\mu(n)n^{-s}=1/\zeta(s) converges for \sigma>1/2."

[H. Maier and H. L. Montgomery, The sum of the Möbius function, Bulletin LMS, Volume 41, 2009, p. 213.]

Since the series \sum_{n=1}^\infty\mu(n)n^{-s}=1/\zeta(s) cannot converge if \zeta(s)=0, this tells us that the Riemann hypothesis would follow if one could prove the above bound for partial sums of the Möbius function.

Here is the argument that Maier and Montgomery had in mind (which is indeed trivial once one is used to partial summation). We want to prove that the infinite sum \sum_{n=1}^\infty\mu(n)n^{-s} converges. This sum is made out of a function \mu that we think behaves rather randomly, and a function n^{-s} that decays in a nice smooth way. The "randomness" of \mu leads us to believe that its partial sums do not grow too fast (indeed, this is our hypothesis), so it makes obvious sense to use partial summation, "integrating" the \mu part and "differentiating" the n^{-s} part. This gives us the identity

\sum_{n=1}^N\mu(n)n^{-s}=N^{-s}\sum_{n=1}^N\mu(n) +\sum_{m=1}^{N-1}(m^{-s}-(m+1)^{-s})\sum_{n=1}^m\mu(n)=M(N)N^{-s} +\sum_{m=1}^{N-1}M(m)(m^{-s}-(m+1)^{-s}).

Suppose that the real part \sigma of s is greater than 1/2 and let \tau be a real number strictly between 1/2 and \sigma. Then our hypothesis tells us that |M(x)|\leq C_\tau x^{\tau} for every x, where C_\tau is a constant that depends on \tau only. Also, |m^{-s}-(m+1)^{-s}|\leq |s|m^{-(\sigma+1)}, by the mean value theorem. Roughly, the idea here is that the modulus of the derivative of the function x^{-s} never exceeds |s|m^{-(\sigma+1)} on the interval [m,m+1]. See how to use the mean value theorem for more details about this kind of argument. With these two estimates and the identity above, we find that

|\sum_{n=1}^N\mu(n)n^{-s}|\leq C_\tau N^{\tau-\sigma}+C_\tau|s|\sum_{m=1}^{N-1}m^\tau m^{-(\sigma+1)}.

The first term tends to zero as N tends to infinity, and the second term is bounded because \tau-\sigma-1<-1. Therefore, the infinite sum \sum_{n=1}^\infty\mu(n)n^{-s} converges.
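One can watch this behaviour experimentally. The sketch below sieves the Möbius function up to a sample limit and computes a partial sum of \sum\mu(n)n^{-s} for an illustrative real s with s>1/2; the observation that the partial sums stay bounded is purely empirical, since convergence in this range is only known conditionally, as explained above:

```python
# Sieve mu(n) for 1 <= n <= N, then compute a partial sum of
# sum mu(n) n^{-s} for a sample s > 1/2 (all choices illustrative).

def mobius_upto(N):
    """Return a list mu with mu[n] = Möbius function of n for 1 <= n <= N."""
    mu = [1] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for k in range(p, N + 1, p):
                if k > p:
                    is_prime[k] = False
                mu[k] = -mu[k]        # flip sign for each prime factor
            for k in range(p * p, N + 1, p * p):
                mu[k] = 0             # kill non-squarefree n
    return mu

N = 10000
mu = mobius_upto(N)

s = 0.75  # a real s with s > 1/2, chosen for illustration
partial = sum(mu[n] * n ** (-s) for n in range(1, N + 1))
# Empirically the partial sums remain bounded for this sample s.
assert abs(partial) < 10
```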

If you want to know why \sum_{n=1}^\infty\mu(n)n^{-s}=1/\zeta(s), then see the article on Möbius inversion.

General discussion

Note that the infinite sum \sum_{n=1}^\infty\mu(n)n^{-s} does not converge absolutely, since |\mu(n)|=1 far too often. So the convergence exploits cancellation in a crucial way, which is one of the signs that partial summation may be a useful tool.

Note also a difference between Example 1 and Example 2. In Example 1 we managed to replace the original sum of products with a sum of products in which one sequence was bounded and the other absolutely summable, whereas in Example 2 we obtained two new sequences where one tended to infinity, but the other tended to zero fast enough to compensate for that and give rise to a convergent sum.

