Tuesday, December 02, 2008

How Many Sides to Your Error?

I often get asked various questions about complexity but I now had the same question asked twenty years apart.
Are there natural problems in BPP not known to be in RP∪co-RP?
Informally the class of BPP are the problems efficiently computable on a probabilistic computer with a very small chance of error. For RP if the program says "Yes" it is always right. For co-RP, if the program says "No" it is always right. The questions asks if there are any natural problems that have probabilistic algorithms that seem to require error on both the Yes and No answers.

In the late 80's, David Johnson asked me (and several others) the question when he was putting together his Catalog of Complexity Classes for the Handbook. I didn't have any examples and his catalog ended up citing the 1984 work of Bach, Miller and Shallit offering up Perfect Numbers (and some variants) an an answer to the question. But Bach et. al made their argument based on the fact that the current primality tests put Primes in co-RP but not in RP. In 1987, Adleman and Huang (building on Goldwasser and Kilian) put Primes in RP too. This puts Perfect Numbers in RP killing that example a few years before the Handbook appeared.

Since Primes were put in P in 2002, we have so few problems even known to be in BPP but not in P with the best remaining example the co-RP problem of determining whether two algebraic circuits are identical. So we still don't have any good answers to the above question except for some contrived problems like given three algebraic circuits, are exactly two of them identical. Most complexity theorists believe we have strong enough pseudorandom generators to avoid the randomness in probabilistic algorithms altogether. So in the end the question is likely irrelevant. Still we complexity theorists also like to know what we can prove beyond what we just believe.

All this doesn't mean BPP is not the right class to capture probabilistic algorithms when you can trust your randomness and don't mind a negligible error on either side. Complexity classes are defined to capture the properties we want them to have, not to mold to our limited knowledge of current problems.


  1. What's wrong with a perfectly natural problem like that? Given the apparently central nature of the algebraic circuit problem (cf. the results of Impagliazzo-Kabanets) maybe we should also not be so dismissive of it as our "only" remaining problem.

    The real question should be whether BPP collapses to RP intersect coRP (and of course whether that further collapses to P). As in your example, once we have a natural problem in RP union coRP that is not in RP intersect coRP then we will get candidates for natural problems in BPP that are likely not in RP union coRP by taking the obvious problems in the Boolean closure.

  2. My definition of UN-NATURAL
    would be a problem defined
    JUST FOR THE PURPOSE OF BEING IN/OUT OF A CERTAIN CLASS. For example, the sets that are in DTIME(n^2)
    but not DTIME(n) via the
    time hier theorem.
    Hence the `given three circuits ...' problem is
    certainly natural.

    Note that its hard to really define `natural'

  3. What about (promise) decision problems that derive from approximation problems for which we only know randomized algorithms? For example, given a convex body described by a linear program, decide if its volume is less than 1 or greater than 2. Alternatively: given a purportedly "pseudorandom" generator, decide if its first output bit is either unbiased, or significantly biased.

    For my money, the approximation-based promise problems are slightly more natural than derivatives of RP\coRP problems like the two-out-of-three algebraic circuits example.

    Of course, I understand that we would like total problems rather than promise ones, but if naturalness is your criterion...


  4. In the process of doing property testing on data streams we have come across problems in which the natural algorithm has double sided error.

    That is, the algorithm is always correct on any instance that strongly has or doesn't have the property, while on values near the boundary it can give an arbitrary answer (for both yes and no instances).

    We call these algorithms "Windsor-Detroit" algorithms as they are probabilistic only on the border.

    This is in contrast to property testing where usually the algorithm is BPP outside the boundary region (or in some cases even in RP outside the boundary region).

    Alex Lopez-Ortiz

  5. The number field sieve factorization algorithm has two-sided error (at least when you ask something like: is 7 the last digit of the smallest prime factor?) So, how about factorization of numbers with (log n)^(5/2) digits? Is this natural?

    Assuming I've done my arithmetic right: this can be solved in polynomial time by the number field sieve, but not by any of the deterministic factoring algorithms, which are all somewhat slower.