## Monday, September 12, 2005

### Favorite Theorems: NP-Incomplete Sets

August Edition

In the 1950's Friedberg and Muchnik independently showed there existed computably enumerable but non-computable sets that are strictly weaker than the halting problem. How about a polynomial-time version? We have some natural sets that are good candidates like Factoring and Graph Isomorphism but no proofs that these sets lie in-between P and NP. Any proof would imply P≠NP and Ladner, in a theorem that now bears his name, shows that P≠NP is the only assumption you need.

Richard Ladner, On the Structure of Polynomial Time Reducibility, JACM 1975.

Ladner shows that if P≠NP there exists an A such that

• A is not in P,
• A is in NP, and
• A is not NP-complete.
Here is my write-up of two proofs of this result, one due to Ladner and the other to Russell Impagliazzo. Ladner proves a more general result, given any computable sets A and B with B reduces to A but A does not reduce to B, there is a set C such that B reduces to C and C reduces to A and A does not reduce to C and C does not reduce to B. This result holds for any reasonable notion of resource-bounded reduction, and in fact you can embed any partial order between B and A.

Ladner's proof works by blowing holes in SAT using a clever looking-back technique to keep the set in NP. In the end it is a little unsatisfying because from the viewpoint of any fixed length, the set is either NP-complete or easy on that length. Impagliazzo's proof tries to get around this by slowing down the reduction but his proof still leaves large gaps of easily computable inputs. But until we learn how to show P≠NP we won't have any other method for proving the existence of incomplete sets.

1. Hi Lance, your last point is interesting. Why do you believe that there will be no other way to show the existence of incomplete sets, short of proving that P does not equal NP?

For example... are there reasons to believe that we cannot prove a statement like:

If P does not equal NP, then Graph isomorphism is NP-incomplete.

?

2. If P does not equal NP, then Graph isomorphism is NP-incomplete.

Someone correct me if I'm wrong, but I think this is already known: it should be the contrapositive of a result due to Schonig (1988) which states that if GI is NPC, then PH collapses to the second level.

If P != NP then the entire PH is infinite and so PI_2 != SIGMA_2 thus we have that GI is *not* NPC.

3. Follow up to my previous comment: It should be Schoning. Also, the result doesn't preclude the possibility that GI is in P. In fact, I can think of no good "evidence" that GI is not in P other than the fact that no one has found an algorithm for it. Does anyone know of any results in this direction?

4. Actually, it should be "Schoening".
When there is an umlaut on a letter x, you can replace the letter with "xe".

5. P=NP implies the PH collapses (to P).
But from P != NP it is *not* known to follow that the PH is infinite; indeed there are oracle worlds relative to which the hierarchy is strict up to any desired level, then collapses to that level (see 'The Complexity Theory Companion' and its Ch. 8 references).

To back you up on the other query, I can't find any evidence of GI's intractibility in Schoening et al.'s book on GI. It's hard to imagine what kind of conclusions could be drawn from GI in P, other than some other iso- type problems being in P too.

One (somewhat dubious) form of evidence that one might be able to produce:

Define a problem S[A] that is in NP^A relative to an arbitrary oracle A.

Show that with A random, then w/ prob. 1, S[A] is neither in P^A nor NP^A-complete.

Hope that this suggests S[�] is neither in P nor NP-complete. But this might fall flat. ?

6. Anyone know of any good writeups of Ladner's proof? (Sorry, Lance, but I didn't find your writeup very intuitive...)

7. I think Papadimitriou's '94 text does a decent job, as well as being a great complexity theory resource more generally.

8. Anyone know of any good writeups of Ladner's proof?

Lance also has a mini-writeup in his "Diagonalization" paper that may be more useful. I don't know how much more intuitive you can get though.

9. I think Lance's proof is rather clear, but perhaps a bit too telegraphic for people who are not too familiar with the subject.

Here's a roadmap of the proof:

We want a language L that is neither polynomial nor NP-complete. For the first part we need to diagonalize away from all polynomially recognizable languages, hence our language won't be in P. For the second we need to diagonalize away from all possible polynomial Turing reductions, thus precluding the posibility of SAT <= L. Hence L won't be NP complete.

For the first part, we enumerate all polynomial machines and diagonalize away from them. For the second part we enumerate all reduction functions f and diagonalize away from them.

10. Yes, Papadimitriou's proof is a little different. There, he proceeds by running each stage for at most n steps (rather than checking whether (log n)^{f(n)} \geq n, which I find sort of un-intuitive).

11. While Ladner's result is wonderful, and the proof (as attested by some
comments) far from trivial, it is much less satisfactory, and I would say,
less important than the corresponding result in Computability Theory, the
Friedberg-Muchnik Theorem.

The reason is that Ladner's proof gives us very little insight to the
fundamental problems, beyond its statement. In other words, there are no
fundamentally new techniques in its current proof(s). In contrast, the
Friedberg-Muchnik Theorem introduced the Priority Method, that is THE
major tool in "classical" Computability Theory.

More precisely, Friedberg and Muchnik had to overcome a huge technical problem.
The major tool to obtain negative results in Computability is to use
diagonalization. The standard trick can be described as follows:

make a (usually infinite) list of requirements (in the usual
diagonalization proof the requirements are "the function I want to define has
to be different from f_i , the i-th function on the list")

for each i, find an input w(i) ("the i-th witness") such that w(i) shows that
the i-th requirement has been satisfied (in the standard diagonalization proof
w(i)=i and we define the diagonal function D so that D(i) != f_i (i)

It is not very hard to show that if we write down the requirements that a c.e.
set not be complete, and want to satisfy the requirements by diagonalization,
and want the function w(i) to be computable, we will construct a set that is
decidable.

So, using the only tool available before Friedberg, we always obtain either
decidable or complete c.e. sets.

The big new insight was that the witness does not have to be computable in
order to get a proof--it can be a c.e. process if one defines it carefully.

It would be great to obtain a similarly useful tool by looking at incomplete
sets in NP -- even if the tool was not powerful enough to settle the P vs NP
question.

12. I think there is a problem with the proof in Lance's write-up: "we can compute f(n) in time polynomial in n": I do not understand why
f(n) can be computed in polynomial time in n. E.g., we have to produce and simulate
M_i on small inputs x to determine f(n). Even if x is very small, how do you do that without further assumptions on the enumeration of the M_i?

13. We did make the assumption that M_i(x) runs in time at most |x|^i. That's enough to make sure we can compute f(n) quickly.

14. Don't you need that the enumeration of the machines M_i can be done not only effectively, but even quickly, given i?
-- OK, but this is easily possible, no problem.
Thanks for responding to my comment. I like your write-up,
it makes several points more clear compared to the other places where you find the proof.

15. I am also not (even after Lance's answer to a previous comment on this point) convinced by the claim that "we can compute f(n) in time polynomial in n".
If M_i(x) runs in time |x|^i and |x| < log n this gives the bound (log n)^i. But i is not a constant - it is a function of n. If i stays close to n (which seems likely since "most" TM's will disagree with L on a short input word so i will almost always be incremented) the bound becomes (log n)^n, which doesn't look very polynomial! Sorry if I am being stupid here, but a slightly more detailed explanation of this would be much appreciated. (I find it easier to believe that f(n) can be computed in time is polynomial in f(n), which I think suffices for the rest of the proof, at least with a little more work.)