Thursday, April 19, 2012

How important is the PROOF of Cooks theorem for my ugrad class?

I am teaching Automata theory this semester. The topics are basically (1) Regular Languages, (2) Context Free Languages, (3) Decidable, undecidable, and c.e. sets, (4) P, NP, NP-completeness.

Other things can be added like Primitive Recursive Functions, the arithmetic hierarchy, the polynomial hierarchy.

I am considering NOT proving Cook's Theorem. State it YES, say how important it is YES, but prove it NO. I am not sure if I want to do this. This post is an attempt to get some intelligent comments on this. My mind is not made up yet so this is I raise the question TRULY non-rhetorically. Some thoughts:
1. The proof is coding a Turing Machine computation into a formula. It is interesting that you can do this. The details of it are not so interesting.
2. The very good students (I would guess 5 out of my 40) will get it. The bad students never get it. I am oversimplifying--- and if this was a reason to not teach a topic I may not have much of a course left.
3. The time spend IN CLASS on this is not so much- one lecture. The time spend in OFFICE HOURS on this re-explaining it is a lot. And its not clear that it helps.
4. Showing them NPC reductions is more important and more interesting. This is what I would put in instead. I usually only get to do a few.
5. Cook's Theorem is SO fundamental that the students should SEE a proof of it even if they don't understand it. but I am not a fan of they should see it even if they don't understand it.
6. Cook's Theorem, like many things, you don't get the first time you see it, but you do the second, helped by having seen it once.
7. Even if we now they mostly won't get it, we want them to know that we WANT them to get it.
8. Students should know that the proof that SAT is NP-complete goes all the way back to the machine model (very few other NP-completeness proofs do). Is it good enough to TELL them this. I would DEFINE Turing Machines, SAY that you CAN code a computation into a formula, but not actually do it?
9. I like to teach things that I can ask problems about. Cook's theorem IS okay for this- I usually show how to make a formula out of some TM instructions and leave as HW making a formula out of other TM instructions. Even so, not that much can be asked about it. The students have a hard time with this (Much more so than other parts of the course) so I wonder if its too much sugar for a cent.
10. I teach this course a lot so I want to mix things up a bit. I realize this is not really a good reason for the students.
11. An argument for: See how it goes! The decision I make this semester need not be what I always do.

1. Another argument FOR inclusion is that there is a fundamental difference between Cook's theorem and other NP-Completeness proofs -- it's a reduction from an arbitrary problem in NP to SAT, rather than a reduction between two concrete problems.

2. My theory is that formal language theory is formal language theory, not complexity theory. The theory of NP completeness should be part of a complexity theory class. Are you covering LR(k) parsing? How about Markov models, or probabilistic/weighted grammars? These seem far more pertinent to an automata theory class than Cooks theorem.

1. Actually the title of the course is ELEMENTARY THEORY OF COMPUTATION,
and the content is
Reg Langs (DFA,NFA,Reg Exp),
CFGs,
Decidable, undecidable stuff,
P, NP stuff.

3. In class: give them the hand-wavy proof, the idea of the proof, but not the full details.

Write up, or take a full writeup of the proof and make it available to them.

Give a lecture on the proof and record it. Make the recording available electronically and invite them to watch it (twice!). (maybe make the video available to the public!)

The good students will take the opportunity, the bad ones won't. On the whole, I agree that its a far better use of time to do more NP reductions in such a class.

4. I like the approach taken by Kleinberg and Tardos in "Algorithm Design". They appeal to most CS majors understanding that anything that can be done in software in polynomial time can be done in a polynomial amount of circuitry (a true statement as long as all logic is feed-forward), and then they show that logical circuit satisfiability is the same as SAT and that SAT is 3SAT. It has a little handwaving in the beginning, but works better than any other method I have tried.

5. Which proof do you give? I find that the proof using Circuit-SAT (as done in Chapter 9 of Sipser's book for example) is one that students seem to find pretty intuitive. From Circuit-SAT one can then go directly to 3-SAT which is also pretty intuitive. I could see much more difficulty for students to follow all the different correctness subformulas in a proof in the style of Cook's original one (as done in Chapter 7 of Sipser's book). It is worth the digression into circuits to do the more intuitive proof and you can mention some circuit complexity along the way if you want.

1. I have used the VERIFY definition of NP
{ x | (exists^p y) [ (x,y) \in B }
and then used Turing Machines.

Based on your comments and others if I do it (likely given
the comments) I will do it using CIRCUIT-SAT.

2. another alternative: write a first order formula stating that c is an accepting computation of NTM M on x, and show them how to translate this into a polynomial size propositional formula.

3. I also think giving a few reductions from NP-complete problems to SAT before proving the NP-completeness of SAT is helpful in understanding why the proof works.

6. I am currently a second term computer science student at the KIT (Karlsruhe Institute of Technology in Germany). Last term we (that means about 300 students who have to make this module, but only about 30 who attended the lecture) had Cooks Theorem and the proof. It took quite long and was very technical, but I think it is also very interesting. I could not make this proof again without looking it up, but I think I understood how it works. In my opinion it was good that my prof explained it in lecture, because I don't know if I would have tried to understand it otherwise. But most of the students who were attending thought it would be enough to point to a proof and skip the proof in the lecture.

7. Since you're going to teach NPC reductions anyway, I feel you might as well teach Cook-Levin's thm as in many ways, it is the first reduction.

8. I'm teaching that course this semester:

http://www.cs.umass.edu/~barring/cs401

and I would never consider leaving the proof of Cook-Levin out. Without it, the rest of your NP-completeness proofs are not grounded in anything they have seen -- this is ok for an algorithms class but not for a class which should be getting them to understand why what they are learning is true.

And as you say, from the algorithms point of view this proof is the only motivation for doing complexity in terms of TM's at all.

As for reductions, I had time for the ones in Sipser (including his HAMPATH one which is really elegant) and I assigned them the card-deck, 3-color, and dominating set problems from Sipser on homework.

Is your class required for all majors? Ours isn't, so we get mostly the mathematically better half. Our algorithms class, which is supposed to look at NP-completeness in a practical way, is required for all majors.

9. If we're voting, then I believe that you should at least sketch the proof of Cook's theorem. The best argument I see against is your last one, namely that this semester could be an experiment and could inform later decisions. (There are interesting parallels to the multi-armed bandit problem here.)

If I may offer some advice, then I would suggest that you not reduce to CNF-SAT (or 3-SAT), but rather CIRCUIT-SAT. The proof is fairly clean, and it is also intuitive: it makes sense that you can simulate Turing machines with circuits ...

10. Here's a (not terribly original) idea for approaching this:

1. Teach students about the existence of universal Turing machines, with a constant number of states for the machine head. (This is independently important.)

2. Either explain, or assign as an exercise, the fact that Turing machines with O(1) states are really just a special case of 1D cellular automata. (Students like cellular automata--there are pretty visuals and so on.)

3. Either explain or assign the fact that 3SAT can encode a search problem asking for a completion of a partially-specified CA initial state that leads to acceptance (in a given polynomial number of steps). I believe this is more intuitive for CAs than for Turing machines.

11. In the automata course, there were already analysis of the behavior of simpler kinds of automata. So a description of the Turing machine made sense, and Cook's Theorem made sense. In more advanced CS courses, you skim over the details of the Turing machine, so Cook's Theorem is more complicated to explain.

So, from the perspective of someone who will take more advanced courses, it is best to include Cook's Theorem in automata theory.

12. I'm sorry for your classes, but the Cook-Levin Theorem is really false:

This “theorem”, in fact, is not a genuine theorem, for its proofs are incorrect, since into them is used the hidden non-proved (faulty, really) assumption (or hidden axiom, contradictory with ZFC), that states that a polynomial p(n) that upper bounds the running time of some NTM that decides in nondeterministic polynomial time the problem to be reduced into that reduction to the SAT can be always anyway provided (either that p(n) is known a priori or can be computed by a poly-time DTM). This is not always accurate.

See the complete proof at http://arxiv.org/ftp/arxiv/papers/0907/0907.3965.pdf

13. I'm very happy you covered the proof when I took the grad class. Before that, I thought the proof involved very complex ideas, and afterwards I saw it was mostly mechanical.

When I teach Cook's theorem, I tell the students that there are theorems that are easy to conjecture, but hard to prove, and theorems that are hard to come up with, but then easy to prove. Cook's theorem is the latter if it is stated as: given a non-det poly time turing machine, there is a SAT instance of poly size that is satisfiable iff the machine says yes. I discuss how much more information I give when I say that than when I just say: NP-complete problems exist. Then I sketch some rules of the formula, and that's it. The rest is just mechanical and somewhat easy to believe, check online, or do yourself.

14. I first saw the normal boring proof of Cook’s Thm in such a class as yours. Then in a grad class the professor gave a proof sketch in 30 seconds.

Just write a tableau of circuits to compute the ith tape character at time j (add extra character for encoding where the head is). This is clearly possible, just look at the i-1, i, i+1 positions of the previous time step. Now you have to write out the guts of the circuit using formulas. This is technical, but easily believable. The circuit computes a finite function, and a formula can do this just as well. Then you’re done.
Also, I think this proof delivers much more understanding than just writing everything out. I was pretty amazed when I saw this proof: is it really that easy? Can this professor say in 30 seconds what we spent 30+ minutes on before?

Anyhow, that is my recommendation.

1. I am not sure how the proof you saw DIFFERS from the boring proof"

15. I (think) I prefer the proof of Cook's Theorem via CIRCUIT-SAT - and then I would vote strongly for actually giving the proof, at least via pictures. Although it's not really different, the translation from NTMs to circuits seems to me like it would be more intuitive than the translation to formula, for someone who's never seen it before. I think this has something to do with the fact that circuits *compute* something, in essentially the same way a TM does, whereas formulas don't "compute" anything, per se.

I think the idea that there is a universal NP language because the computation history of a Turing machine can be encoded into a circuit/formula is a very important one. In my (granted, limited) experience when you just *tell* students "here's some big idea" without showing them, they forget rather quickly. Of course, as you point out, even when you show them, they may not get it the first time. Also I agree strongly with your point 7.

The CIRCUIT-SAT proof also has the advantage that it leads naturally to other useful ideas in complexity, like the fact that SUCCINCT CIRCUIT-SAT is NEXP-complete, the relationship between time on a TM and size of a circuit, and the Karp-Lipton Theorem.

16. When I was a student, we just were just given the statement "SAT is NP-Complete" and moved on to reductions. This bugged me for a long time (but apparently not enough to look up the proof): where does the first' NP-Complete problem come from? Some time later somebody SAID that you CAN code a computation into a formula, but not actually how. (Your point 8.) Then I was enlightened, as I could see that this was surely possible and the that details are for most purposes irrelevant.
I later came to view the proof of Cook's Theorem as reduction from NON-DETERMINISTIC TURING MACHINE ACCEPTANCE to SAT, where the former is trivially NP-Complete. I think that is an elegant way to view it.

17. I second the CIRCUIT-SAT approach (I typically follow CLRS). I think at least an outline of the proof idea is helpful to students.

18. I concur with @DaveMB that it's crucial to prove that an NP-complete problem exists before you start doing reductions. Without that fact reduction proofs are circular logic, and an astute student may suspect that the class of NP-complete problems is actually empty, and that all the related theorems are vacuously true. (It sounds like that was sticking in @Thamas' craw.)

It's true that only 5ish students follow the entire SAT proof, but I think most come away at least believing that NP-complete languages exist, which is something.

I do wonder if there's an approachable nonconstructive proof that there exists an NP-complete problem, and if that would be a better compromise.

I've had better results with the argument in the Goodrich-Tamassia algorithms book. You invoke Church-Turing to say that any TM can be compiled to machine code, and according to the computer architecture course every machine instruction can be built out of boolean circuits. Then you aggregate those into a humongous-yet-polynomial circuit for the entire computation. The setup is hand-wavy but it lets you talk in terms of core prerequisite CS concepts, instead of introducing SAT and working through all the technical details of the formula widgets.

19. ...Other things can be added like Primitive Recursive Functions, the arithmetic hierarchy, the polynomial hierarchy.

Why would you even think of adding those topics to an undergrad class? If your choice is one of those topics or the proof of Cook's theorem, I would go with the latter.

20. Primitive Recursion: I do it sometimes, not other times.
AGREE that its certainly FAR LESS IMPORTANT than a PROOF of
Cook's Theorem.

Arithmetic and Poly Hierarchy- These I LIKE. Arithmetic Hierarchy
gives you a way to classify sets beyond HALT. Poly hier analogous.
AGREE that there are less important than proof of Cook's theorem,
BUT these are easily understood by the students, which is a plus.

ANYWAY, the comments have convinced me to TO the proof of Cooks theorem, using the CIRCUIT SAT proof.

21. Alex Lopez-Ortiz11:56 PM, April 19, 2012

The proof we use here at Waterloo follows a long sequence of concepts from systems and programming languages that students are already rather familiar with:

1) Introduce non-determinism as a parallel search on a very large size cluster (I actually never use the word non-determinism, since it is rather confusing at this early stage).

2) Then introduce a very restricted subset of the C language enhanced with a "Try" statement as in "Try x=0 and x=1" to write such parallel programs. The Try statement causes a Unix fork of the code with parallel execution on two machines thereafter.

3) At this time, we introduce the concept of NP problems as problems whose solution can be found in polynomial time on a (very large) cluster using a C+Try program.

4) We then show some simple code samples illustrating (3) above such as a C+Try program solving TSP or SAT.

5) We give the compiler rules to compile any arbitrary C+Try program into a SAT formula, which as a target architecture might be a bit unusual but no more than, say, java being compiled into pcode instead of machine code. This is about two slides worth of simple rules.

6) Argue the correctness of the compilation rules, i.e. the program accepts/rejects if and only if the SAT formula is satisfiable.

7) Point out that if instead of a JVM someone gave us a SAT-VM "executing" (i.e. solving) SAT formulas in polynomial time we have a means to solve any problem in NP.

8) Hence SAT is NP-complete.

I've tried several other ways of teaching this, and settled on this one over the years.

The fact that students have already used fork, have a natural familiarity with exhaustive search, are familiar with large Google clusters and know about compilers that compile into weird target architectures means that they buy into the proof both at a technical level and at an intuitive "seems sound" kind of way.

22. I teach a very similar course in Florence and, so far, I have always taught the proof of the Cook-Levin's theorem. As far as I can see, the students do not have so many problems in understanding the proof, even because they have already seen something very similar while proving that Turing machines are equivalent to unrestricted grammars or (if I have time) while proving the undecidability of the Post correspondence problem. In other words, when I teach the C-L theorem, the students already know that the computation of a Turing machine can be emulated by different formalism (if powerful enough) and they are not surprised to see that this can be done by using logical formulas: that's why the proof requires no more than half an hour and, as far as I can see during the oral examination, it is understood by the majority of my students.

23. I have great fun teaching the standard (Garey-Johnson) proof, one of the highlights of my intro theory class. Be excited about it and the class will find it exciting too.

24. I also cover the Garey-Johnson proof in CMSC651 and I really like doing the proof. It may be mechanical but its so clever and shows why SAT is such a basic problem.

25. I also think that giving a proof of Cook's theorem (or its extended sketch) is important. (Just like to give a proof-sketch of Goedel's theorem in a logic course.) This shows WHY we can forget all this "mess" (states, transitions, cell contents, etc.) and land in a 0-1 realm of CNFs. This will pay off: students will then trust in CNFs and circuits. And will better understand why, say, Clique problem is "the same" as SAT. From my experience, the only small technicality, which sometimes confuses students, and which should be explained in details is that the event "if A and B then C" in a TM is equivalent to the event "not A or not B or C is true". Then they understand that instructions of TM are, in fact, clauses. After that we can take a vocation, and say "good bye TM".

26. I am assuming that in the computability part of your course you introduce
universal Turing machines (students can read the rhetoric in Turing's paper, which is GREAT) and complete c.e. sets.

Since you define NP by bounded existential quantification, it is then a simple observation that

Thm1 There are NP-complete languages

Then the polynomial hierarchy and complete languages for the sigma_i, pi_i levels are good homework questions.

Now one definitely should state

Thm2 There are interesting NP-hard languages

together with a list.

I believe the easiest proof of Thm2 is via the Circuit SAT route, after showing how circuits can simulate Turing machines.

The question is how detailed you want these proofs to be, and this depends on the character of the course, and the mathematical abilities of the students. For smart enough students, too much detail is counterproductive. Do you want students to believe the proof or to be able to reproduce it?

27. I never taught this material so perhaps my opinion should carry less weight, but I would split the proof into two parts to be taught in two different parts of the course. When you introduce the concept of Turing machines, before nondeterminism is even in the picture, you show the equivalence of Turing machines and circuits (maybe just in one direction to avoid discussing uniformity). This can be done in the more general context of equivalence between computational models (like multi-tape TMs, unrestricted grammars, programs in C, lambda calculus etc.). I remember being pretty amazed when I first saw that all of these models are actually equivalent (hardware design and software design is actually the same thing, who would have guessed). You don't need to prove all of the equivalences, just prove for circuits and tell them about the rest without proof and maybe give an easy example in the homework. Then, when you get to NP completeness, you state Cook-Levin, and now the proof becomes very simple because we already know that TMs can be translated into circuits, so we just need to show that CIRCUIT-SAT is NPC.

I think this approach gives the students sufficient time to "digest" the first part, so by the time you get to the second part, they are already convinced that TMs can be converted to circuits. Therefore they can dedicate all of their mental effort to the second part.

28. Bill, this is a question that I also pondered each and every time I taught Cook's theorem. I like the circuit-SAT approach; and then there's another that I have been using, which is to use a TILING PROBLEM as my basic NPC example. Mapping Tm computations to tilings is, in my opinion, more transparent than the classic reduction to SAT. It is even a bit of a game. Basically, you just write down the computation history by laying cards on a table according to simple rules.

Having done that, I reduce tiling to SAT, which is again quite easy and a more typical use of SAT (that is, an example of a pattern that they can use for many other problems).

When I take this route, I tell them that Cook reduced to SAT directly, just for historical interest.

An advantage of the approach via tiling is that the tiling problem is "generic" in a sense: with very minor modifications, the problem becomes complete for other classes involving space and time (e.g., PSPACE, EXP). There are a lot of nice things you can do with it.

29. Proving that "there exists an NP-complete problem" is much easier than Cook-Levin. Define A_{NP}, by analogy with Sipser's other notation, to be {(M, w, 1^t): M is an NDTM and it is possible for M to accept w within time t}. Then you can reduce an arbitrary NP problem to A_{NP} quite easily. This makes even more sense if you proved that A_{TM} is complete for the TR languages under <=_m reductions back in the computability section.

You can then phrase Cook-Levin as constructing a reduction from A_{NP} to CNF-SAT or to CIRCUIT-SAT, changing hardly anything. This makes SAT a little less magical -- it's not the "original NP-complete problem", it's just historically the first interesting one, as @CSProf says above.

30. @DaveMB
Thanks, that is indeed easy!