Wednesday, December 16, 2020


Many of you have heard of Russell Impagliazzo's five worlds from his 1995 classic "A Personal View of Average-Case Complexity". In short:

  • Algorithmica: P = NP or something "morally equivalent" like fast probabilistic algorithms for NP. 
  • Heuristica: NP problems are hard in the worst case but easy on average.
  • Pessiland: NP problems hard on average but no one-way functions exist. We can easily create hard NP problems, but not hard NP problems where we know the solution. 
  • Minicrypt: One-way functions exist but we do not have public-key cryptography.
  • Cryptomania: Public-key cryptography is possible, i.e. two parties can exchange secret messages over open channels.
Impagliazzo's worlds have an explicit "you can't have your cake and eat it too": either you can solve NP problems on average, or you can have cryptography, but not both (though possibly neither). That's the mathematical world of P vs NP. 
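To make the Cryptomania bullet concrete, here is a minimal sketch of a Diffie-Hellman key exchange, the textbook way two parties agree on a secret over an open channel. The parameters are toy-sized choices of mine, for illustration only; any real use needs a large, carefully chosen prime.

```python
import random

# Toy Diffie-Hellman over a tiny prime -- illustration only.
# Real deployments use primes of 2048 bits or more.
p, g = 23, 5                     # public prime modulus and generator

random.seed(0)                   # fixed seed so the run is repeatable
a = random.randrange(2, p - 1)   # Alice's secret exponent
b = random.randrange(2, p - 1)   # Bob's secret exponent

A = pow(g, a, p)                 # Alice sends A over the open channel
B = pow(g, b, p)                 # Bob sends B over the open channel

# Each side combines the other's public value with its own secret;
# both arrive at g^(a*b) mod p without ever transmitting it.
shared_alice = pow(B, a, p)
shared_bob = pow(A, b, p)
print(shared_alice == shared_bob)   # True
```

An eavesdropper sees only p, g, A and B; recovering the shared key from those is the discrete-log-style problem whose hardness puts us in Cryptomania.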

The reality is looking more and more like Optiland, where we can solve difficult NP problems and still have cryptography thanks to vast improvements in machine learning and optimization on faster computers with specialized hardware.

Back in 2004 I gave my guess of the world of P = NP:
Learning becomes easy by using the principle of Occam's razor--we simply find the smallest program consistent with the data. Near perfect vision recognition, language comprehension and translation and all other learning tasks become trivial. We will also have much better predictions of weather and earthquakes and other natural phenomena.
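As a toy illustration of that Occam's-razor idea, the sketch below brute-forces the smallest arithmetic expression consistent with some example data. The hypothesis class, leaf set, and size bound are all my own illustrative choices; with P = NP this kind of smallest-consistent-hypothesis search would scale far beyond what enumeration can reach.

```python
from itertools import product

# Occam's razor by brute force: enumerate expressions in one
# variable x, smallest first, and return the first one consistent
# with every (input, output) example.
LEAVES = ["x", "1", "2", "3"]
OPS = ["+", "-", "*"]

def expressions(size):
    """Yield expression strings built from exactly `size` leaves."""
    if size == 1:
        yield from LEAVES
        return
    for left in range(1, size):
        for l, op, r in product(expressions(left), OPS,
                                expressions(size - left)):
            yield f"({l}{op}{r})"

def occam_fit(examples, max_size=4):
    for size in range(1, max_size + 1):      # smallest hypotheses first
        for expr in expressions(size):
            if all(eval(expr, {"x": x}) == y for x, y in examples):
                return expr
    return None

# Data generated by f(x) = 2*x + 1
print(occam_fit([(0, 1), (1, 3), (5, 11)]))
```

The catch, of course, is that the number of candidate expressions grows exponentially with size, which is exactly the search that P = NP would make feasible.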

Today you can take your smartphone, unlock it by having the phone scan your face, and ask it a question by talking and often get a reasonable answer, or have your question translated into a different language. You get alerts on your phone for weather and earthquakes, with far better predictions than we would have thought possible a dozen years ago.

We have computed the shortest traveling-salesman tour through nearly 50K UK pubs. AlphaFold can simulate protein folding with an accuracy nearly as good as what we get with real-world experiments. You can view GPT-3 as generating a form of a universal prior. Even beyond P = NP, we have self-trained computers easily besting humans in Chess, Go and Poker.
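The pub tour itself was found with exact, LP-based branch-and-cut methods, but as a rough illustration of why optimization techniques handle practical TSP instances so well, here is a minimal nearest-neighbor-plus-2-opt heuristic on random points. Everything below is an illustrative sketch of mine, not the method used for the pub computation.

```python
import math, random

def tour_length(pts, tour):
    """Total length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def nearest_neighbor(pts):
    """Greedy construction: always walk to the closest unvisited point."""
    unvisited = set(range(1, len(pts)))
    tour = [0]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: math.dist(pts[last], pts[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def two_opt(pts, tour):
    """Local search: reverse segments while that shortens the tour."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_length(pts, candidate) < tour_length(pts, tour):
                    tour, improved = candidate, True
    return tour

random.seed(1)
pts = [(random.random(), random.random()) for _ in range(30)]
greedy = nearest_neighbor(pts)
better = two_opt(pts, greedy)
print(tour_length(pts, greedy), ">=", tour_length(pts, better))
```

On instances like these, simple local search already lands close to optimal, which is the gap between theoretical NP-hardness and practice that the post is pointing at.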

Meanwhile these techniques have done little to break cryptographic functions. There are plenty of cybersecurity attacks, but rarely by breaking the cryptography itself. 

Not all is rosy--there is still much more we could do positively if P = NP, and we are already seeing some of the negative effects of machine learning, such as loss of privacy. Nevertheless we are heading to a de facto best of both worlds when complexity theory tells us those worlds are incompatible. 


  1. So for the real world it looks like
    Factoring is harder than TSP.
    Why is this?
    One reason: approximating TSP to within BLAH of optimal is worth doing, and one may regard the problem as solved; for factoring, however, there is no notion of approximate factoring.

    Any other reason?

    1. For TSP people don't really care about worst-case instances. For Factoring they do.

    2. AH- good point.
      Also, for factoring there really is an adversary trying to give you hard cases (products of two large primes). By contrast, I doubt the pubs in England were built for the sole purpose of making TSP hard for them.

  2. I do not understand why it is theoretically impossible to (a) have an algorithm for NP-complete problems similar to the simplex method for linear programming, that is, one that solves all real-life instances fast even though one can artificially construct instances on which the algorithm runs in exponential time, but at the same time (b) still have cryptography by encoding the messages into exactly the hard instances.

    1. The whole issue is to make sense of "real-life" instances. Are hard factoring problems not real life? The simplex method is not a panacea for all instances. The provably poly-time interior point method outperforms simplex on many large instances.

    2. The "real-life" instances for each problem differ from problem to problem. As stated above, the "real life" instances of factoring are the hard ones; the "real life" instances of TSP are the (relatively) easy ones.

  3. Regarding equality notation: I always had a bit of hesitation when thinking of whether it should be P=NP or NP=P. When we assign values to variables, how would you assign to show that NP is literally P? Would it not be NP=P? What's going wrong here...?

    1. P and NP are names of specific sets, and = here is mathematical equality of sets, not variable assignment.

      Equality of sets is a symmetric relation; P=NP is the same as NP=P. The two statements are the same, but historically we prefer P=NP.

  4. Thank you. I think this is an important point that a lot of newbies (or do people call them noobs nowadays?) are not aware of, yet take P=NP (rather than NP=P) for granted. The historical reason is important, and those of us not having been part of the TCS community will probably be less aware of the preference.

  5. Why does assuming P=NP imply that you can find the smallest solution of a search problem in polynomial time? I mean, it depends on the size of the solutions to the optimization problem, doesn't it?

  6. Factoring and similar all-or-nothing problems are harder than TSP in practice. There is a single factorization; finding an approximate factorization (primal or dual), no matter how close to the correct one, does not help in any way. Techniques like local search, mathematical programming or ML are powerless in that context.
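    A tiny illustration of this all-or-nothing point, with toy numbers of my own choosing: it is trivial to produce a product within one of n, yet that "approximation" says nothing whatsoever about n's prime factors, so there is no objective for local search or ML to climb.

    ```python
    # "Approximate factoring" is meaningless: finding a, b with
    # a*b extremely close to n is trivial and reveals nothing
    # about n's actual prime factors. Toy-sized numbers here;
    # real moduli are 2048+ bits.
    p, q = 1009, 1013
    n = p * q                  # n is odd, so n // 2 * 2 = n - 1

    a, b = 2, n // 2           # a junk "approximation", found instantly
    print(abs(n - a * b))      # off by only 1, yet utterly useless
    ```

    Only the exact factors score zero; every near-miss is worthless, which is why optimization-style techniques get no traction on factoring.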