Thursday, November 30, 2017

Kolmogorov Complexity and the Primes

Bill's post on how to derive the infinitude of the primes from Van der Waerden's theorem reminds me of a nice proof using Kolmogorov complexity.

A quick primer: Fix some universal programming language. Let C(x), the Kolmogorov complexity of x, be the length of the smallest program that outputs x. A simple counting argument shows that for every n there is an x of length n such that C(x) ≥ n: there are 2^n binary strings of length n but only 2^n - 1 programs of length less than n. We call such x "random".
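To make the counting concrete, here is a minimal sketch (Python; the function names are mine, and "programs" are idealized as arbitrary binary strings, which only overcounts the real programs):

    def strings_of_length(n):
        """All binary strings of length exactly n."""
        return [format(i, f'0{n}b') for i in range(2 ** n)] if n > 0 else ['']

    def programs_shorter_than(n):
        """All binary strings of length < n: every candidate shorter program."""
        progs = []
        for length in range(n):
            progs.extend(strings_of_length(length))
        return progs

    n = 10
    # 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1 programs of length < n cannot
    # describe all 2^n strings of length n, so some string of length n
    # has no program shorter than itself.
    assert len(strings_of_length(n)) == 2 ** n
    assert len(programs_shorter_than(n)) == 2 ** n - 1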

Suppose we had a finite list of primes p1…pk. Then any number m can be expressed as p1^e1 ··· pk^ek. Pick a large n and a random x of length n, and let m be the number x expresses in binary. We can compute m from e1,…,ek and a constant amount of other information, remembering that k is a constant. Each ei is at most log m, so we can describe all of them in O(log log m) bits and thus C(m) = O(log log m). But roughly C(m) = C(x) ≥ n = log m, which for large n far exceeds O(log log m), a contradiction.
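To see how large the gap is, here is a minimal sketch (Python; the five-prime list and the helper names are my own illustration of a "finite list of primes"): for any m that does factor over a fixed finite list, the exponent vector takes far fewer bits than m itself.

    import math
    import random

    PRIMES = [2, 3, 5, 7, 11]   # hypothetical complete list of primes

    def exponent_vector(m, primes):
        """Exponents (e1, ..., ek) of m over the given primes."""
        exps = []
        for p in primes:
            e = 0
            while m % p == 0:
                m //= p
                e += 1
            exps.append(e)
        assert m == 1, "m has a prime factor outside the list"
        return exps

    random.seed(0)
    exps = [random.randrange(1, 50) for _ in PRIMES]
    m = math.prod(p ** e for p, e in zip(PRIMES, exps))

    # Each exponent is at most log m, so the whole vector costs
    # O(k log log m) bits, while m itself needs about log m bits:
    # a random (incompressible) x could never arise as such an m.
    print(m.bit_length(), "bits for m")
    print(sum(e.bit_length() for e in exponent_vector(m, PRIMES)),
          "bits for the exponent vector")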

But we can do better. Again pick a large n and a random x of length n, and let m be the number x expresses in binary. Let pi be the largest prime that divides m, where pi is the ith prime. We can describe m by pi and m/pi, or equivalently by i and m/pi. So we have C(m) ≤ C(i, m/pi) ≤ C(i) + C(m/pi) + 2 log C(i) ≤ log i + log(m/pi) + 2 log log i + c. The 2 log C(i) term is needed to specify the separation between the program for i and the program for m/pi.

Since C(m) ≥ log m, we have
log m ≤ log i + log(m/pi) + 2 log log i + c
log m ≤ log i + log m - log pi + 2 log log i + c
log pi ≤ log i + 2 log log i + c
pi = O(i (log i)^2)
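Here is a rough numerical illustration of these quantities (Python, plain trial division; the helper name and the demo value of m are my own):

    import math

    def largest_prime_factor_and_index(m):
        """Largest prime factor p of m, and the index i such that p is the
        i-th prime. Plain trial division -- fine for small demo values."""
        factors, d, rest = [], 2, m
        while d * d <= rest:
            while rest % d == 0:
                factors.append(d)
                rest //= d
            d += 1
        if rest > 1:
            factors.append(rest)
        p = max(factors)
        # Index of p: count the primes up to p.
        i = sum(1 for q in range(2, p + 1)
                if all(q % r for r in range(2, int(q ** 0.5) + 1)))
        return p, i

    m = 123456789          # demo value; the proof applies this to a random m
    p, i = largest_prime_factor_and_index(m)
    lhs = math.log2(m)
    rhs = math.log2(i) + math.log2(m // p) + 2 * math.log2(math.log2(i))
    # For random m the left side must be within a constant of the right,
    # which is what forces p_i = O(i (log i)^2).
    print(f"log m = {lhs:.1f}   log i + log(m/pi) + 2 log log i = {rhs:.1f}")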

The prime number theorem gives pi approximately i log i, so simple Kolmogorov complexity gets us within a log factor of optimal.
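A quick numerical comparison (Python; a naive prime generator, adequate for a demo) of pi against the prime number theorem's i log i and the bound i (log i)^2 from the argument above:

    import math
    from itertools import count, islice

    def primes():
        """Naive trial-division prime generator -- fine for a small demo."""
        found = []
        for q in count(2):
            if all(q % p for p in found if p * p <= q):
                found.append(q)
                yield q

    ps = list(islice(primes(), 10000))
    for i in (100, 1000, 10000):
        p_i = ps[i - 1]
        print(f"i={i:6d}  p_i={p_i:7d}"
              f"  i log i={i * math.log(i):9.0f}"
              f"  i (log i)^2={i * math.log(i) ** 2:9.0f}")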

I wrote a short introduction to Kolmogorov complexity that includes this proof. I originally got the proof from the great Kolmogorov complexity text by Li and Vitányi, who credit Piotr Berman and John Tromp.

3 comments:

  1. It should be mentioned that the bound can be improved by a better encoding, like first writing down the length of C(i), so we get log i + log(m/pi) + log log i + 2 log log log i + c, and I think pi ≤ O(i log i (log log i)^2) follows. Of course this can be further improved.
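    A minimal sketch of these encodings (Python; the encoder names are mine): doubling every bit of a string w costs 2|w| + O(1) bits, prefixing w with its doubled-bit length costs |w| + 2 log|w| + O(1), and iterating once more, as the comment suggests, costs |w| + log|w| + 2 log log|w| + O(1).

        def doubled(w):
            """Self-delimiting code: double each bit, terminate with '01'."""
            return ''.join(b * 2 for b in w) + '01'

        def length_prefixed(w):
            """Doubled length of w, then w: |w| + 2 log|w| + O(1) bits."""
            return doubled(format(len(w), 'b')) + w

        def twice_prefixed(w):
            """The comment's improvement: prefix the length with *its*
            length first, for |w| + log|w| + 2 log log|w| + O(1) bits."""
            ell = format(len(w), 'b')
            return doubled(format(len(ell), 'b')) + ell + w

        w = '1' * 1000   # any 1000-bit string
        for enc in (doubled, length_prefixed, twice_prefixed):
            print(f"{enc.__name__:16s} {len(enc(w))} bits")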

  2. I like the first, simpler proof better, because Chebyshev's classical argument is almost as simple as the second proof and gets within a constant factor of the right density. The idea is to look at the prime factorization of N = (2n choose n) = (2n)!/(n!)^2.

    Consider any prime p and the exponent of p in m!. Among 1,…,m there are [m/p] terms divisible by p, [m/p^2] divisible by p^2, etc., so the exponent of p in m! is just
    [m/p] + [m/p^2] + [m/p^3] + ..., where [ ] is the floor function; the sum is effectively finite since the terms eventually become 0.

    Therefore the exponent of p dividing (2n choose n) is precisely
    [2n/p] + [2n/p^2] + [2n/p^3] + ...
    - 2[n/p] - 2[n/p^2] - 2[n/p^3] - ...
    = ([2n/p] - 2[n/p]) + ([2n/p^2] - 2[n/p^2]) + ... + ([2n/p^k] - 2[n/p^k])
    where p^k is the largest power of p at most 2n.
    Now for any number m, 2[n/m] > 2(n/m - 1) = 2n/m - 2, so [2n/m] - 2[n/m] ≤ 1: it is a nonnegative integer and is ≤ 2n/m - 2[n/m] < 2.
    Therefore the exponent of p in (2n choose n) is at most k, and hence p contributes a factor of at most p^k ≤ 2n to N = (2n choose n).

    Trivially N = (2n)!/(n!)^2 is at least 2^n. (I want to keep this elementary, so I am willing to lose the near factor of 2 in the exponent.) Each prime dividing N must be at most 2n, and by the above its contribution to the product N is at most 2n. Therefore the number of primes smaller than 2n must be at least log_{2n} N ≥ n/log_2(2n), which gives the right asymptotics up to a constant factor.
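    A quick check of the two facts this rests on (Python; the helper implements the formula for factorial exponents from the paragraph above, and the names are mine): every prime's contribution p^k to (2n choose n) is at most 2n, and the count of primes up to 2n beats n/log_2(2n).

        import math

        def exponent_in_binomial(p, n):
            """Exponent of p in (2n choose n): sum of [2n/p^j] - 2[n/p^j]."""
            e, pk = 0, p
            while pk <= 2 * n:
                e += (2 * n) // pk - 2 * (n // pk)
                pk *= p
            return e

        n = 500
        N = math.comb(2 * n, n)
        ps = [q for q in range(2, 2 * n + 1)
              if all(q % r for r in range(2, int(q ** 0.5) + 1))]

        # Each prime's contribution is at most 2n, and together they
        # reconstruct N exactly, so N <= (2n)^pi(2n).
        assert all(p ** exponent_in_binomial(p, n) <= 2 * n for p in ps)
        assert math.prod(p ** exponent_in_binomial(p, n) for p in ps) == N
        print(len(ps), ">=", n / math.log2(2 * n))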

  3. As said, this shows the inequality only for those pi that occur as the largest prime factor of some random number. The previous argument shows that this holds for infinitely many i, but to get it for all large i something else is needed (as Alexey Milovanov, if I remember correctly, pointed out).
