Tricki
a repository of mathematical know-how
Add article
Navigate
Tags
Search
Forums
Help
Top level
›
Different kinds of Tricki article
›
Front pages for different areas of mathematics
›
Combinatorics front page
›
Probabilistic combinatorics front page
View
Edit
Revisions
Useful heuristic principles for guessing probabilistic estimates
Title:
*
Area of mathematics:
*
A comma-separated list of areas of mathematics to which this article applies. Use ">" to tag in a subcategory. Example: Analysis > Harmonic analysis, Combinatorics
Keywords:
A comma-separated list of keywords associated with this article. Example: free group
Used in:
A comma-separated list of examples of where this technique is used. Example: Cauchy-Schwarz inequality
Parent articles:
Order
-1
0
1
-1
0
1
Body:
[QUICK DESCRIPTION] If you are hoping to use the probabilistic method as part of a long and complicated proof, you will probably want to begin by producing a plausible sketch of the argument before you go into the technical details. For this purpose it is very useful to be able to guess upper (and sometimes lower) bounds for the probabilities of various complicated events. This article discusses heuristic principles that can help one do this, and illustrates them with examples. [note article incomplete] This article needs more heuristic principles and more examples. [/note] === Principle 1: pretend your variables are independent === This is the single most useful method for guessing probabilistic bounds: if you have some variables that exhibit a reasonable degree of independence, then they will probably give estimates that are very similar to the ones that would apply if they actually were independent. Of course, one needs to be clear about what "reasonably independent" might mean, so let us look at a few examples. [EXAMPLE] Let $G$ be a random graph with $n$ vertices in which each edge is chosen independently with probability $\lambda n^{-1}$. Let $\tau$ be the number of triangles in $G$. Then $\tau$ is a random variable: how should we expect $\tau$ to be distributed? Given any triangle $T$ in the complete graph on the $n$ vertices of $G$, then $T$ will belong to $G$ with probability $\lambda^3n^{-3}$. The number of such triangles is $\binom n3$, which is about $n^3/6$, so the expectation of $\tau$ is about $\lambda^3/6$. Also, if $T_1,\dots,T_k$ are disjoint triangles, then the events "$T_i$ belongs to $G$" are independent; and a typical pair of triangles ''will'' be disjoint. It is clear then that we have a lot of independence about. What would happen if the events "triangle $T$ belongs to $G$" were ''all'' independent? Then $\tau$ would be a sum of $\binom n3$ Bernoulli random variables, each with probability $\lambda^3n^{-3}$. In other words, it would be counting the number of occurrences when you have lots of independent unlikely events. The probability distribution that's appropriate for this is the Poisson distribution, so we might guess that $\tau$ is distributed roughly like a Poisson random variable of mean $\lambda^3/6$. This is not a proof of course, and it turns out to be a hard problem to determine the distribution of $\tau$. Nevertheless, the exercise of pretending that the events are independent is a helpful one: it gives us some idea what to expect, and it also gives us a starting point if we want to prove it. (The starting point would be the thought that we could look very carefully at the proof that $\tau$ is approximately Poisson when the events are independent and try to relax the assumptions we make, allowing a small amount of dependence. And there are indeed important proofs like this in probabilistic combinatorics: see [[Janson's inequality]], for instance.) [EXAMPLE] A certain amount of care is needed even when one is just guessing. For instance, suppose we tried to use the same reasoning to count the number of copies of a graph $S$ that consists of five vertices $x_1,\dots,x_5$ with the first four all joined to each other and the fifth one joined just to the fourth (so $S$ has seven edges). And suppose that the edges of $G$ have been chosen with probability $p=\lambda n^{-5/7}$. The expected number $\sigma$ of copies of $S$ is about $p^7n^5/6=\lambda^7/6$ (the factor of $1/6$ is there because $S$ has a symmetry between three of its vertices). So do we expect the distribution of $\sigma$ to be approximately Poisson with mean $\lambda^7/6$? To see that we don't, note that the expected number of copies of $K_4$ in $G$ is $p^6\binom n4$, which is about $\lambda^4n^{-30/7}n^4/24=\lambda^4n^{-2/7}/24$. In other words, it is far less than 1 (for fixed $\lambda$ and large $n$). What this implies is that although the expected number of copies of $S$ is $\lambda^7/6$, the norm is to get no copies at all, and just occasionally you get a $K_4$ with a huge number of extra edges joined to its vertices forming copies of $S$. [EXAMPLE] Let us return to triangles, but this time let us take a much larger value of $p$. Indeed, let us take $p$ to be independent of $n$. The expected number of triangles is $p^3\binom n3$. How about the variance? Well, for any triple $xyz$ of vertices, let $T_{xyz}$ be the event that these vertices form a triangle in the graph. If two triples are disjoint or overlap in just one vertex, then the corresponding events are independent. Given two triples $xyz$ and $xyw$, the probability of $T_{xyz}$ and $T_{xyw}$ both occurring is $p^5$. The number of such pairs of triples is less than $n^4$, so this implies that the expectation of the square of the number of events $T_{xyz}$ that occur is at most $p^6\binom n3^2+p^5n^4+p^3n^3$. We see from this that the main contribution to the ''variance'' of the number of triangles comes from the term that is proportional to $p^5n^4$. What heuristic model could explain this? We clearly cannot pretend that all the events $T_{xyz}$ are independent, since then we would have a variance more like $p^3(1-p^3)\binom n3$. (This is the variance of the binomial distribution with parameters $\binom n3$ and $p^3$.) One way of making a better guess is to count first how many edges there are in the graph, and then count the number of triangles based on each edge, treating these numbers as independent. That gives us about $p\binom n2$ edges, and each edge is the base of a number of triangles that is distributed binomially with parameters $n-2$ (which we can approximate by $n$) and $p^2$. If we pretend that the numbers of triangles based on the different edges are independent, then we are looking at a sum $X_1+\dots+X_N$, where the $X_i$ are independent binomial random variables with parameters $n$ and $p^2$, and $N$ is a binomial with parameters $n^2$ and $p$. A standard fact about this sort of situation is that the variance is $\mathbb{E}N\mathrm{Var}X_1+\mathbb{E}X_1^2\mathrm{Var}N$, which gives us about $pn^2p^2(1-p^2)n+p^4n^2p(1-p)n^2$, which is much more like what we actually observe. This suggests that we may have found a good model for what is happening.
This is a stub
A stub is an article that is not sufficiently complete to be interesting.
Notifications
File attachments
Changes made to the attachments are not permanent until you save this post. The first "listed" file will be included in RSS feeds.
Attach new file:
Images are larger than
640x480
will be resized. The maximum upload size is
1 MB
. Only files with the following extensions may be uploaded:
jpg jpeg gif png svg
.
Revision information
Log message:
An explanation of the additions or updates being made to help other authors understand your motivations.
Search this site:
Recent articles
View a list of all articles.
Littlewood-Paley heuristic for derivative
Geometric view of Hölder's inequality
Diagonal arguments
Finding an interval for rational numbers with a high denominator
Try to prove a stronger result
Use self-similarity to get a limit from an inferior or superior limit.
Prove a consequence first
Active forum topics
Plenty of LaTeX errors
Tutorial
A different kind of article?
Countable but impredicative
Tricki Papers
more
Recent comments
I don't think this statement
choice of the field
Incorrect Image
Article classification
Higher dimensional analogues
more