Wednesday, June 06, 2018

I tell my class that P is important because... but is that really true?

When teaching P vs NP  the questions arises (and if not then I bring it up) what if you have algorithm in P that takes n^{100} time?. Or even n^{5} time which might be too long.

I have given the following answers for years; however, I post this and will use your comments as a sanity check. In fact, I suspect I am wrong on some points.

1) IF SAT was in P then something very clever was done. Even if its n^{100} it was very clever. That cleverness will almost surely result in other better algorithms. They may only be better in practice by not in theory (it works in practice- if only we can get it to work in theory :-) ), but it will still work and be a great advance.

2) IF SAT was in P then perhaps we could USE SAT in P to get a better algorithm for SAT in P. This was the theme of a chapter in Lance Fortnow's book The Golden Ticket: P, NP. and the search for the impossible and is also likely behind something I once heard (I think it was from Scott but I can't find it on his blog)  If P=NP then math would be easy and we would have already found a proof that P=NP. Hence P ≠ NP

The two above really can't be WRONG or even RIGHT or even NOT EVEN WRONG since they are speculations. The next few is where I need my sanity checked

3) Whenever a real world natural problem is in P it has a low degree poly algorithm OR there are ideas to make it work in practice, if not in theory. When Lance read a prior version of this post he pointed out that `real world natural problem' is not a well defined term. True. Not sure what to do about that. Even so, is this true? roughly true? Something theorists say to be able to sleep at night?

4) In the real world people say ``OH- the problem is NP-complete. Better look for approx solutions instead''  Or similar things. While this sounds true since I have never worked in the real world I really don't know. Is that what people say? do?

5) If Lance proves P=NP next week then the consequences are enormous for real people working on real computing problems. But what if Lance proves P NE NP? Would that affect people working on real computing problems? A long time Carl Smith told me If P NE NP was proven then the proof would have great insight into computing which would have a real affect on real people working on real computing problems. My Darling (who is a Software Engineer) is skeptical of that. What do you think?


  1. In the real world people are faced with NP-complete problems. I think they tend to solve them using heuristics rather than approximation algorithms with rigorous guarantees. SAT solvers are a good example of this: they seem to work well on "most" real-world instances.

  2. On 1, overwhelming focus on worst case time complexity on all problems took us in the wrong path.

    Prudent one would have been to study worst case time complexity for P and best case time complexity for NP problems and that would have brought us more fruitfulness.

    On 2, everything is hard/impossible when we don't know about it. Once we know about it, then it is a question of possible or not.

    1. Worst-case might be overemphasized in research, but that's for three very good reasons:
      1. Worst-case gives firm reliability guarantees;
      2. Average-case analysis is considerably more difficult and mathematically intensive than worst-case, generally relegated to graduate study. See e.g.
      3. "Best-case" analysis is basically trash. There is nothing "prudent" or "fruitful" about observing that some trivial family of instances causes your algorithm to abort early. If you were to buff up your model of best-case analysis to be more resistant to derailing by trivial instances, you'd end up with something at the level of mathematical sophistication of average-case analysis, which, again, is too difficult a toolset for widespread usage.

  3. One example coming to mind is the approximation algorithm for non-negative permanents (Jerrum, Sinclair, and Vigoda). Despite some effort to optimise its running time, the current best is still roughly n^7. It doesn't seem to be an easy problem to solve in practise either.

  4. I do work in "the real world", and I do say things like 4).

    Although often, the people I discuss with when something like that comes up have no idea what NP-complete problems are, so that typically gets translated to "We don't have the resources to give an exact solution", or something similar.

    As for 5), I'm on Your Darlings side. A proof may give great insight, and it may have a real effect on real computing problems. But that remains to be seen.

  5. Even if SAT is not in P there may be some few efficient SAT-solvers that handle all real world problems. See Knuth's fascicle "Satisfiability" for an overview of the huge progress that has already been achieved. Though I am not subscribing to his thesis that P=NP (but non-constructively!) I think it possible that SAT can be solved, though not necessarily in general, but for most, if not all, practical purposes. Also, even if NP is not contained in P/poly there may be sufficiently efficient circuits for all practical purposes.

  6. On (4): I do applied operations research at Google. We solve mixed integer linear programs w/ ~millions of constraints and variables every hour or so, often achieving optimality because practical solvers are very good. When we set it up wrong (bad conditioning, scaling, etc.) it takes 8-12 hours to finish. When we set it up right, 15 minutes to 2 hours.

    Rather than what you said in (4), I'd say "If your problem is NP hard, then there are great tools but you have to work a lot harder (and know a lot more theory) to make sure you don't spiral out of control in terms of time/space." If we mess up the set up for the problem (we as in our group collectively, not all of whom are experts in OR), we have to go ask the OR experts for help to figure out what, in theory, is causing our problem to solve slowly. Also, my impression is that a constant factor approximation would not fly for most applications, and it's FPTAS or use a solver (for serious problems), or give up on theory and use randomized greedy until it becomes a problem.

    That being said, a colleague of mine came to me with what turned out to be the dynamic transitive closure problem for DAGs, and I was delighted to learn (via some notes of Virginia Williams) that there's a polynomial lower bound depending on the matrix multiplication exponent. This informed me and my colleague that she couldn't expect to have a data structure for this problem with a log-time update. That episode makes me think fine-grained complexity makes P more relevant, if not in its relation to NP, then among the relation between problems in P.

  7. SAT cannot still be used to mine bitcoin efficiently (and it is a very practical problem!).

  8. Sebastian Oberhoff12:38 PM, June 07, 2018

    Determining whether a graph has a clique of size 100 takes O(n^100) time using exhaustive search. Does that constitute a "natural problem"?

  9. P=NP

  10. " If Lance proves P=NP next week then the consequences are enormous for real people working on real computing problems."

    I am not sure that is true either. A proof of that might not have any impact for at least quite some time. It is not as if people just close their shop once they know that the problem is NP-hard. People have developed efficient solvers for hard problems, and it is very hard to believe any of these algorithms will be displaced anytime soon.

  11. I think the relative lack of polytime algorithms with a large exponent is simply due to the fact that coming with a say O(n^17) algorithm is very hard: The algorithm has to be complicated for its complexity to be O(n^17), at least if the value 17 does not appear explicitly in the input ("finding cliques of size 17"). On the other hand, coming with an algorithm in O(2^n) is often quite simpler. Our limited brains are still able to understand four nested loops, or maybe 5 or 6 for the brightest amongst us, but are short of understanding the power of tens of nested loops. Of course, this statement has no proof nor evidence, but I am nevertheless convinced that there is some truth in it.

  12. Hold on your holy crowd. What if the proof that P=NP
    is non-constructive after all ? It would shed little, to none light on things.

    (Incidentally, what is so wrong about saying NP=P ? Come on guys it is an equality after all and not an assignment.)

  13. I think that your point 2) somewhat corresponds to what Scott Aaronson described as "8. The Self-Referential Argument" in his "Reasons to believe" post:

    (Presumably that's also similar to what Lance wrote in _The Golden Ticket_, but I haven't read that).

    This only stuck in my memory because a while ago, I commented on a "Godel's Lost Letter" blog entry describing "A Panel on P vs. NP":

    Basically, I was saying that even if P!=NP, you could still possibly figure out what the smallest circuit for SAT is (for _tiny_ problems), using a SAT solver. Then, incorporate that circuit into the solver; if it's actually an improvement, then you can find the best circuit for a _slightly_ larger problem...

    Since my guess is that P!=NP, I don't think this would get very far (and would be a lot of work, and use a ton of CPUs). And if the circuit size for SAT increases quickly for small problems, that would only be circumstantial evidence that P!=NP. (However, it would find an optimal SAT-solving circuit for small problems, as a side effect. Admittedly, probably only very small problems.../)

  14. Great News.

    IF, P = NP

    In the factorization of every odd number.

    published in: International Journal of Engineering and Future Technology Vol 16 (1) 2019.

    Is online now.

  15. P versus NP is considered as one of the most important open problems in computer science. This consists in knowing the answer of the following question: Is P equal to NP? This question was first mentioned in a letter written by John Nash to the National Security Agency in 1955. A precise statement of the P versus NP problem was introduced independently in 1971 by Stephen Cook and Leonid Levin. Since that date, all efforts to find a proof for this problem have failed. Another major complexity classes are LOGSPACE and NLOGSPACE. Whether LOGSPACE = NLOGSPACE is another fundamental question that it is as important as it is unresolved. SAT is easier if the number of literals in a clause is limited to at most 2, in which case the problem is called 2SAT. This problem can be solved in polynomial time, and in fact is complete for the complexity class NLOGSPACE. If additionally all OR operations in literals are changed to XOR operations, the result is called exclusive-or 2-satisfiability, which is a problem complete for the complexity class LOGSPACE. Given an instance of exclusive-or 2-satisfiability and a positive integer K, the problem maximum exclusive-or 2-satisfiability consists in deciding whether this Boolean formula has a truth assignment with at leat K satisfiable clauses. We prove that maximum exclusive-or 2-satisfiability is in NLOGSPACE. Moreover, we demonstrate this problem is NP-complete. To attack the P versus NP question the concept of NP-completeness has been very useful. If any single NP-complete problem can be solved in polynomial time, then every NP problem has a polynomial time algorithm. Since every language in the class NLOGSPACE is in P, then we show that our problem is in P and NP-complete and thus, P = NP.

  16. Given an instance of $\textit{XOR 2SAT}$ and a positive integer $2^{K}$, the problem exponential exclusive-or 2-satisfiability consists in deciding whether this Boolean formula has a truth assignment with at leat $K$ satisfiable clauses. We prove exponential exclusive-or 2-satisfiability is in $QP$ and $\textit{NP-complete}$. In this way, we show $QP \subseteq NP$.

  17. It was updated in

  18. As an actual software developer, what we ACTUALLY need to provide the world the benefits of the "P=NP utopia" is NOT a specific resolution to P vs. NP.

    No, it is really this: An explicit algorithm for 3-SAT (or another NP-hard problem) that is DOMINATED by an n^k polynomial with k < O(10) for all n < O(2^1000). Basically, an algorithm that is effectively a low-degree polynomial for almost any data set we'd actually feed it. A sufficiently slow-growing, sub-exponential algorithm is just as good as a literal polynomial time one (maybe even better).

    So, our responses to the various P vs. NP outcomes would be something like this:
    1) P=NP, Explicit useful algorithm (e.g. scaling like n^5).
    The dream scenario. YES, we'd all jump on this and try to implement it. There'd be a new version of the internet by the time we finished.

    2) P=NP, Explicit but unusable algorithm (e.g. n^50, n^10K)
    The hunt is on! Improve the constants and exponent to get a usable algorithm, or prove we can't beat the bad one. Basically P vs. NP all over again.

    3) P=NP, but non-constructive:
    Get back to us when you've got some idea what the algorithm might look like.

    3) P!=NP, non-constructive:
    Well that blows, but deep down we're not really surprised. How bad is it? Is brute force really the best we can do with 3-SAT, or can we still hope for "effectively" P=NP on non-enormous 'n' inputs?

    4) P!=NP, brute force is the best you can do (e.g. "Strong Exponential Time Hypothesis" (SETH))
    Well, at least we won't waste any more time dreaming of improvement. Let's see if it holds for quantum machines...

    5) P!=NP, but far better algorithms than any we know of for 3-SAT must exist.
    Okay, then basically case (2) depending on what the floor is.

    4) P!=NP, which we proved by showing you can do 3-SAT in something like 2^log(log(n)) but no better:
    Oh really? So basically we CAN solve NP-hard problem instance for any sane value of n in effectively polynomial time. We'll treat this as a variant of case (1).