## Tuesday, February 07, 2006

### Sauer's Lemma

The recent post Discovering the Discovered reminded me of one of my favorite combinatorial lemmas known as Sauer's Lemma.

Sauer's Lemma roughly states that if a collection of sets has VC dimension bounded by d then any set of n elements can only be split nd ways. More precisely

Fix a collections Φ of subsets of U such that for all x1,…,xk in U,
|{S∩{x1,…,xk} | S∈Φ}| < 2k
then for all x1,…,xn in U,
|{S∩{x1,…,xn} | S∈Φ}| ≤ O(nk-1)
This lemma has many important applications, most notably a famous result of Blumer, Ehrenfeucht, Haussler and Warmuth showing that if you don't care about computation costs then one can PAC learn a concept class iff the VC dimension of that class is bounded.

Why is Sauer's lemma connected to Discovering the Discovered? According to Till Tantau,

Vapnik and Chervonenkis appear to have been the first to discover it. They published it in 1968 in Russian and 1971 in English. Sauer, whose paper was published in 1972, claims that Erdös was the first to have conjectured the lemma. Subsequently, Sauer's Lemma has been rediscovered by Clarke, Owings, and Spriggs, and later again by Beigel.

1. It was also rediscovered by Selach, at least according to the book "The probablistic method". The proof is short and elegant, BTW.

2. Its also known as the primal shatter lemma.

I dislike the inductive proof given in all the books though. It seems there should be a more direct "charging" proof: sets smaller than k are ok, while the larger sets could be charged to the smaller sets not present.

Does anyone know of a more revealing proof, than the one by induction? I recall the excellent book "combinatorial geometry" (agarwal&pach) had a linear-algebra method proof of this as well.

3. I think the proper additional reference from the first comment is to Perles & Shelah. That is the first reference I saw to it.

4. A non inductive proof is contained in the book "Combinatorics", by Bollobas (along with several other proofs).

BTW, I am not completely sure that Vapnik and Chervonenkis proved exactly the same result. According to the book by Alon and Spencer "The Probabilistic Method", they proved a slightly weaker result. In the combinatorics literature, the result is generally credited to Sauer and Perles and Shelah.

5. The lemma can also be elegantly proven using shifting. See the proof of Lemma 2 of Haussler's 1995 paper Sphere packing numbers for subsets of the boolean n-cube with bounded Vapnik-Chervonenkis dimension.

6. The following work by Pollard has an unconventional proof of Sauer's lemma. It is highly informal. I am not sure it is correct.

www.stat.yale.edu/~pollard/Asymptopia/Combinatorics.pdf