A quick primer: Fix some universal programming language. Let C(x), the Kolmogorov complexity of x, be the length of the smallest program that outputs x. A simple counting argument shows that for every n there is an x of length n such that C(x) ≥ n. We call such x "random".
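The counting argument can be sketched numerically (a minimal illustration, with hypothetical helper names; not part of the original argument):

```python
# Programs are binary strings, so there are at most
# 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1 programs of length < n,
# but there are 2^n strings of length n. By pigeonhole, some string
# x of length n has no program shorter than n, i.e. C(x) >= n.

def programs_shorter_than(n):
    # sum of 2^k for k in range(n)
    return 2**n - 1

def strings_of_length(n):
    return 2**n

for n in (1, 8, 64):
    # strictly fewer short programs than strings of length n
    assert programs_shorter_than(n) < strings_of_length(n)
```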

Suppose we had a finite list of primes p_1, …, p_k. Then any number m can be expressed as p_1^{e_1} ⋯ p_k^{e_k}. Pick n large, a random x of length n, and let m be the number x expresses in binary. We can compute m from e_1, …, e_k and a constant amount of other information, remembering that k is a constant. Each e_i is at most log m, so we can describe all of them in O(log log m) bits and thus C(m) = O(log log m). But roughly C(m) = C(x) ≥ n = log m, a contradiction.
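To see the exponent encoding concretely, here is a small sketch (the prime list and helper name are hypothetical, standing in for the assumed finite list):

```python
from math import log2

# Pretend PRIMES were an exhaustive list of all primes (the assumption
# being refuted). Any m built from them is fully described by its
# exponent vector, and each exponent e_i is at most log2(m), so for
# constant k the whole vector costs O(k * log log m) bits.
PRIMES = [2, 3, 5, 7, 11]

def exponent_vector(m, primes=PRIMES):
    exps = []
    for p in primes:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        exps.append(e)
    assert m == 1, "m had a prime factor outside the list"
    return exps

m = 2**10 * 3**4 * 7**2
exps = exponent_vector(m)
# every exponent fits comfortably under log2(m)
assert all(e <= log2(m) for e in exps)
```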

But we can do better. Again pick n large, a random x of length n, and let m be the number x expresses in binary. Let p_i be the largest prime that divides m, where p_i is the ith prime. We can describe m by p_i and m/p_i, or by i and m/p_i. So we have C(m) ≤ C(i, m/p_i) ≤ C(i) + C(m/p_i) + 2 log C(i) ≤ log i + log(m/p_i) + 2 log log i + c. The 2 log C(i) term is needed to specify the separation between the program for i and the program for m/p_i.
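The description of m by the pair (i, m/p_i) can be sketched as follows (naive factorization with hypothetical helper names, fine for small demo values):

```python
def largest_prime_factor_index(m):
    """Return (i, p_i, m // p_i), where p_i is the largest prime
    dividing m and i is its 1-based index among the primes."""
    n, largest = m, None
    d = 2
    while d * d <= n:          # trial division
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    if n > 1:
        largest = n            # leftover factor is the largest prime

    def is_prime(q):
        return q > 1 and all(q % r for r in range(2, int(q**0.5) + 1))

    i = sum(1 for q in range(2, largest + 1) if is_prime(q))
    return i, largest, m // largest

def nth_prime(i):
    # inverse direction: recover p_i from i, so (i, m/p_i) determines m
    count, q = 0, 1
    while count < i:
        q += 1
        if all(q % r for r in range(2, int(q**0.5) + 1)):
            count += 1
    return q

i, p_i, rest = largest_prime_factor_index(90)   # 90 = 2 * 3^2 * 5
assert nth_prime(i) * rest == 90                # m is recoverable
```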

Since C(m) ≥ log m, we have

log m ≤ log i + log(m/p_i) + 2 log log i + c
log m ≤ log i + log m − log p_i + 2 log log i + c
log p_i ≤ log i + 2 log log i + c
p_i = O(i (log i)^2)

The prime number theorem gives p_i approximately i log i, so this simple Kolmogorov complexity argument lands just a log factor off from optimal.
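An empirical spot check (not a proof) of how the bound compares to the prime number theorem's scale, using a standard sieve:

```python
from math import log

def primes_up_to(limit):
    # Sieve of Eratosthenes
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for q in range(2, int(limit**0.5) + 1):
        if sieve[q]:
            for mult in range(q * q, limit + 1, q):
                sieve[mult] = False
    return [q for q, ok in enumerate(sieve) if ok]

primes = primes_up_to(200000)
for i in (100, 1000, 10000):
    p_i = primes[i - 1]          # the ith prime, 1-indexed
    pnt = i * log(i)             # prime number theorem scale
    bound = i * log(i)**2        # scale of the Kolmogorov bound
    # at these sizes the bound holds with constant 1
    assert pnt < bound and p_i < bound
```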

I wrote a short introduction to Kolmogorov complexity with this proof. I originally got the proof from the great Kolmogorov complexity text by Li and Vitányi, who credit Piotr Berman and John Tromp.