Tuesday, August 23, 2005

Additive combinatorics (by Ryan O'Donnell)

For a while now I've taken a dilettantish interest in the subject of Additive Combinatorics. If A is a subset of the integers {1, ..., N}, let A + A denote the set {a + b : a, b in A}. Additive Combinatorics studies such questions as, "If A + A is small compared to |A|^2, what can be said about A?" and "What is the size of the largest A that contains no nontrivial length-3 arithmetic progression?"

My interest in this topic was piqued a little over a year ago by the confluence of three events. First, the Green-Tao theorem was announced: The primes contain arbitrarily long arithmetic progressions. Second, the Barak-Impagliazzo-Wigderson paper on extracting randomness from independent sources came out, which crucially used a 2003 result from additive combinatorics by Bourgain, Katz, and Tao. Third, I was reading some of the course notes and expositional papers on Ben Green's webpage (mainly because he's a masterfully clear writer) and I realized that many of the favourite techniques used in additive combinatorics are also favourite techniques in theoretical computer science -- the probabilistic method, graph theory, discrete Fourier analysis, finite fields and low-degree polynomials therein...

So what are the applications to theoretical computer science? There are already at least a few:

  • More recent work on multisource extractors; e.g., the paper of Dvir and Shpilka or the recent work of Jean Bourgain (mentioned here).
  • Szemer�di's Regularity Lemma, used nonstop in the area of property testing, was developed originally for "Szemer�di's Theorem" from additive combinatorics. Ben Green also has a version of the regularity lemma for boolean functions, which I used in a paper on hardness of approximation for "Grothendieck problems".
  • Tao and Vu's recent papers on the probability that a random boolean matrix has determinant zero.
  • Chandra, Furst, and Lipton, inventors of the "number on the forehead" model in communication complexity, gave surprisingly efficient communication upper bounds based on arithmetic progressions in sets.
  • The most-cited work in TCS history (maybe): Coppersmith and Winograd's n^{2.376} matrix multiplication algorithm. The title: Matrix multiplication via arithmetic progressions.
I like to think that many more applications and connections to TCS are on the way. Here are a few scattershot ideas... anyone want to chime in with others?
  • Property Testing: Since Szemer�di's Regularity Lemma is so useful for property testing, there ought to be some use in the recently discovered hypergraph versions.
  • Low-degree testing: given n-variate functions f over small fields, both communities like to look at quantities like E[f(a)f(b)f(c)f(a+b)f(a+c)f(b+c)f(a+b+c)] -- see this paper on low-degree testing by Alon, Kaufman, Krivelevich, Litsyn, and Ron but also the many works on the "Gowers norm" such as this one by Green and Tao.
  • Pl�nnecke's Theorem: The proof of this theorem, which is used constantly in additive combinatorics, is pure graph theory -- the main tool is the max-flow/min-cut theorem. It'd be great to see it written in TCS language.
  • Lattices: The topic is screaming out for a connection to lattice problems; see Chapter 3 of this book for some background.
For an introduction to additive combinatorics, one might look here.


  1. thanks very much for many useful links. i enjoy your posts.

  2. fantastic post. Informative and thought-provoking.

  3. Hi Ryan,

    Neat posts! One comment though: although widely cited, CW'90 is not likely to be most cited paper in TCS. A few counterexamples:
    - RSA paper: 2938 citations
    - Karp's NP-completeness: 1367 citations
    - Shor's quantum factoring paper: 707 citations
    - Bentley's kd-tree paper: 786 citations
    (name misspelled as "Bently")


    - CW'90: 535 citations

    All data according to Google Scholar.