Thursday, July 15, 2010

Sparse problems in NP thought to not be in P

(This post is similar to this old post. I am posting this anyway since when I first posted I made fundamental mistake. I fixed it the point I was trying to make get lost.)

About once a semester I get asked
What are the natural problems that are in NP, not known to be in P, and not known to be NPC? What is known about them?
And I give the standard answers:
  1. Graph Isomorphism. If its NPC then PH collapses so it is though to NOT be NPC. There is no real consensus about its status with respect to P.
  2. Factoring. If its NPC then NP=coNP so it is thought to NOT be NPC. Most people think that it is NOT in P since people really want to solve this one and have not been able to. This is not a rigorous argument. Factoring is in QP. If quantum computers are ever practical then we may need to rethink how we think about these things.
  3. Discrete Log. Actually, are there reasons to think this is NOT NPC?
The answer above are standard and I suspect most of my readers know them. However, there is a large source of problems that are in NP, likely not NPC, likely not in P, that people don't seem to talk about much: SPARSE PROBLEMS that are thought to be hard. The ones I know about are from Ramsey Theory. I give one example but there are many like it:

{(1n,1m,c) : the n×m grid can be c-colored and not have any mono squares}

  1. This is clearly in NP since the coloring itself is the witness.
  2. Since this is a sparse set it is not NPC unless P=NP. (Actually it can be coded as a tally set.)
  3. I want to say People think this is hard. You may ask Name Three of Them! Indeed, the number of people who work on Computational Ramsey Theory is fairly small. However, Ramsey Theory itself has many people working on it and nobody has anything close to a result indicating that this problem is in P. I personally think that it is hard.
  4. For a fixed c there is a finite number of grids G1,...,Gp such that the n×m grid is c-colorable iff none of G1,...,Gp can fit inside the n×m grid. Hence, for fixed c, the problem is in P. (So the problem is Fixed Parameter Tractable.)
  5. Everything said above holds if you replace squares with rectangles or other configurations in the above.
  1. Are there other sparse problems in NP that we think are NOT in P? (that is, ones that DO NOT come from Ramsey Theory).
  2. Should we be teaching these in our complexity classes as examples of possible intermediary problems? The proof that they are likely not NPC is easy. However, to argue that these problems are likely not in P is hard. Also, these problems may appear unnatural to some students. (They may appear unnatural to some of my readers.)
  3. It is easy to argue that Factoring is probably not in P since you can point to the fact that people REALLY want to solve it quickly and so far have not. This is not a rigorous argument but I do count it as evidence. Is there a similar argument for GI? Is there a similar argument for my Ramsey Problems?
  4. Is there a mathematical reason to think that Factoring, GI, or my Ramsey Problems are not in P?


  1. Let me ask a silly question: Mahaney Theorem says "if a sparse set is NP-complete, then P=NP". Is there anything known about "if a sparse set is not in P, then ???" ?

  2. How about the padded version of any natural NEXP-complete set? But maybe padded versions of natural languages no longer count as natural?

  3. @Lance: whoops. Just saw your comment.

  4. Hartmanis-Immerman-Sewelson (1985): There are no sparse sets in NP-P iff E=NE (reference)

  5. > Discrete Log. Actually, are there reasons to think this is NOT NPC?

    What is the decision version of your DLOG problem? Let's consider this version: Given a prime p, a generator g of (Z/pZ)^* and two numbers a and b in {1,...,p-1}, is the unique i such that a = g^i mod p smaller than b? An oracle to this problem lets you find discrete logs by binary search.

    This problem is clearly in NP and is also in co-NP because if the unique such i is not smaller than b then you can guess it and verify it.

    If you are worried about the promise in the input that g is a generator of (Z/pZ)^*, please note that this promise can also be checked in NP \cap co-NP: for a certificate that g is a generator guess the factorization of p-1 and check that g^{(p-1)/q} \not= 1 mod p for every prime factor q of p-1, and for a certificate that g is not a generator guess some i < p-1 such that g^i \not= 1 mod p.

  6. B.--there are plenty of sparse sets provably not in P. For instance, a random subset of 1^*, the all-ones strings.

  7. Forgive me if this is to far off track- but this discussion brings to mind a problem that has disturbed me for a while. It is not quite sparse, but it feels similar in that it has very little expressive power (there for it is hard to use it as a target of a reduction). Given: a set of linear inequalities and a set of points- do they represent the same convex set? Anyone know the state of this problem? The problem has a easy co-structure (just show one of the vertices violates an inequality or exhibit a new feasible point not in the convex hull of the vertices) but I don't know of prove the affirmative method.

  8. Discrete Log sits in BQP (and in NP \intersect co-NP as explained above), so all the Factoring arguments apply to dlog as well.

  9. @John: I am not totally sure but maybe the answer to your question can be deduced from

  10. To me, the people who desperately want to factor, don't come into the equation at all. They are practical people and would be satisfied by an exponential algorithm provided the runtime was still fast enough (tiny coefficient in front of a barely-bigger-than-1-base exponent) to easily crack RSA. And conversely they'd be displeased with a polynomial-time algorithm if the degree of the polynomial was 5000 digits. So there's very little real relation...