## Friday, December 17, 2021

### Fifty Years of P vs. NP and the Possibility of the Impossible

I have a new article Fifty Years of P vs. NP and the Possibility of the Impossible, to mark the anniversary of the 1971 publication of Steve Cook's seminal paper, a month too late in the January 2022 Communications of the ACM

Initially Moshe Vardi asked me to update my 2009 CACM survey The Status of the P versus NP Problem. The P vs NP problem hasn't changed much but computing has gone through dramatic changes in the last dozen years. I try to view P vs NP in the lens of modern optimization and learning, where we are heading to a seemingly impossible Optiland (a play on Impagliazzo's Pessiland), where we can solve many of the toughest NP-complete problems in practice and yet cryptography remains unscathed.

CACM produced a slick companion video to the article.

1. A summary and a question:
Some NPC problems are doable via ML or SAT-Solvers or NEWTHING
Factoring is still hard.

Can ML or SAT-Solvers or NEWTHING be used on factoring?
Why is factoring in the HARD part of NP, whereas SAT might be
in the EASY part of NP?

I am sure we can't prove any of this now, but what is the intuition for Factoring being hard given that other problems are easy.

2. Factoring is not harder than SAT in practice, it is just that the SAT instances that we care about are easy while the factoring instances that we care about are hard.

1. If we took two large primes, multiplied them, and made the
factoring problem into a SAT problem I assume that that SAT problem would be hard. But is this because
1) Its a large formula, or
2) the TYPE of formula it is.

2. Most formulas in the SAT competition have millions of variables, but it seems that there are formulas with <100k variables that correspond to intractable factoring instances: https://arxiv.org/abs/cs/9809117. So I guess it is because of the type of the formula.

In this twitter thread https://twitter.com/soosmate/status/1374818799149518852 a SAT solver expert makes the case that the size of a CNF has nothing to do with its hardness by giving a CNF with <10k variables whose solving amounts to breaking 128 bit AES.

3. Is it know WHY the formulas coming out of AES and factoring and whatnot are hard? Is there a clean statement like:

If the clauses overlap X amount then the SAT is hard

4. An obvious feature is that factoring formuals are similar to unique sat, i.e. the number of solutions is very small.

Perhaps there is also a conjecture (by Impagliazzo?) which says that SAT formulas with many satisfying assignments are easier.

But I think that the number of solutions is not enough; also a sort of "self-reference degree" plays an important role: e.g. a flip on every variable (or on most variables) can propagate to the whole formula.

I have in mind SAT formulas from Ramsey problems (e.g. the k-colorable rectangle-free grid problem); the SAT formulas are very small (and you can further restrict the solution space adding a simple order on the "elements" of the solution) but completely out of reach of current computational power.