Monday, February 29, 2016

It works in practice, but does it work in theory (Pollard's Factorization algorithm)

Throughout this post I ignore  polylog factors.

It is trivial to factor N in time N1/2.  Pollard's rho-algorithm (see my write up here or Wikipedia Entry) for factoring does bette expected time N1/4. Or does it?  It works well in practice but  has not been proven to work well in theory. (If I missed some paper that DID prove it works well in theory please leave a polite comment.)

Here we state a conjectures that, if true, will show that  Pollard's algorithm is in time (randomized)  N1/4.  Let p be a prime. Let c be in {2,...,p-1}.

Let fc(x)= x2 + c mod p.

x1 will be specified in the conjecture. xi is f(xi-1).

For at least half of the elements x1 in {2,...,p-1} and at least half of the elements c in {2,...,p-1} the sequence x1, x2,... will have a repeat within the first O(p1/2) items.

This is thought to be true since it is thought that the sequence is random-enough so that the birthday paradox will  work. Still... no proof.

When reading some algorithms papers the interest is in getting an algorithm that you can PROVE the run time of.  By contrast, papers on factoring and discrete log and other things that are used to break crypto systems the interest is more in getting something that actually works. I have to learn to stop thinking ``but they haven't proven that!'' and start thinking ``Oh, yes, that would work''. And to be fair, for Pollard's algorithms and others (e.g., quad sieve, number field sieve, which do better in practice than Pollard for large enough numbers) there are REASONS to think they will work well.

More generally, theoretical and applied work may need different mentalities.

1. Eric Bach did make some basic progress on analyzing Pollard's algorithm rigorously: http://www.sciencedirect.com/science/article/pii/089054019190001I.

2. For discrete logs I think it is no longer the case that one only searches for algorithms `that work' (which to my mind implies a rigorous algorithm in any case) and is not concerned with provability.

In particular, at the risk of being accused of shameless self-promotion, the following paper proposes a quasi-polynomial algorithm for the discrete log problem in fixed characteristic fields which is proven for infinitely many extension degrees, and subject to a single conjecture on field representations of a particular type, works for all extension degrees: http://arxiv.org/abs/1507.01495

The algorithm is also very practical, which goes against the usual `rigorous is slower' rule of thumb, so it is not the case that theoretical and applied work need pull in different directions.

3. yep, factoring is one of the great areas of applied CS, which is not always taken seriously by theoreticians. am a huge fan myself. see also this recent writeup, applied (T)CS

4. I disagree with your conclusion. While it's true that theoretical and applied work need different mentalities, in this case theoretical work needs to cast away the shackles of proof, that serve but to obstruct its view of the truth.