Monday, February 08, 2016

The Moral Hazard of Avoiding Complexity Assumptions

Moshe Vardi's CACM editor letter The Moral Hazard of Complexity-Theoretic Assumptions practically begs a response from this blog. I also encourage you to read the discussion between Moshe and Piotr Indyk and a twitter discussion between Moshe and Ryan Williams.

I take issue mostly with the title of Vardi's letter. Unfortunately we don't have the tools to prove strong unconditional lower bounds on solving problems. In computational complexity we rely on hardness assumptions, like P ≠ NP, to show that various problems are difficult to solve. Some of these assumptions, like the strong exponential-time hypothesis, the Unique Games Conjecture, the circuits lower bounds needed for full derandomization, are quite strong and we can't be completely confident that these assumptions are true. Nevertheless computer scientists will not likely disprove these assumptions in the near future so they do point to the extreme hardness of solving problems like getting a better than quadratic upper bound for edit distance or a better than 2-ε approximation for vertex cover.

If you read Vardi's letter, he doesn't disagree with the above paragraph. His piece focuses instead on the press that oversells theory results, claiming the efficiency of Babai's new graph isomorphism algorithm or the impossibility of improving edit distance. A science writer friend once told me that scientists always want an article to be fully and technically correct, but he doesn't write for scientists, he writes for the readers who want to be excited by science. Scientists rarely mislead the press, and we shouldn't, but do we really want to force science writers to downplay the story? These stories might not be completely accurate but if we can get the public interested in theoretical computer science we all win. To paraphrase Oscar Wilde, as I tweeted, the only thing worse than the press talking about theoretical computer science results is the press not talking about theoretical computer science results.


  1. > These stories might not be completely accurate but if we can get
    > the public interested in theoretical computer science we all win.

    And here I disagree.

    These stories are not "a bit" inaccurate; equating polynomial running time with (real-world) efficiency is, for most purposes, not helpful at all. In particular when talking to laypeople.

    That implies that we should not misdirect public attention. Complexity theory is a nice field for brain teasers, but "the public" is arguably interested in (practically) fast algorithms. We should not pretend that complexity theory does usually (can?) provide such.

    So if we don't want "the public" to be majorly disappointed, we should admit that research in complexity theory is unlikely to yield fast algorithms any time soon.

  2. This WWE mentality in journalism is a wonderful way of thinking.
    Journalism need not be accurate it's about getting excited - it's entertainment.
    I just don't understand why limit ourselves to science reporting.

  3. I think you mean a better than quadratic *upper* bound for edit distance, and as far as I know Babai only has a new GI result, not factoring :)

  4. I agreed with Raphael that inaccurate press is bad. (Besides the obvious intellectual honesty issues, I'd hate to see a "Theory Winter" akin to the AI one.) But then...

    > Complexity theory is a nice field for brain teasers, but "the public" is arguably interested in (practically) fast algorithms.

    Under what authority do you speak for the public? I can't claim to speak for all of them either. But whenever I've talked to a science writer about what I do, or anybody who isn't a scientist for that matter, the listeners are far more intrigued by the fact that so many different computational problems are "connected" by reductions. (The SETH and Edit Distance connections are a good example.) It's the interconnectness of computational problems and the universality of computation, rather than a pedantic march towards the "practically" wall-clock fastest procedure, that seems most fascinating to non-scientists. This is also part of what distinguishes complexity (and the theory of algorithms!) from software engineering: the practically fastest procedure will naturally depend on exploiting whatever hardware you've got at the moment. Which is great, as long as you're OK with theorems that go obsolete as often as your laptops do.

    > We should not pretend that complexity theory does usually (can?) provide such.

    Complexity absolutely can help in the design of "(practically) fast algorithms", but that is absolutely not its raison d'etre either.

    1. Things don't change that fast in experimental algorithms or software engineering. Just like in other fields of technology, superficial things like product names and technical details may change, while the important ideas remain valid for decades.

    2. Ryan, you are proposing a false dichotomy. There is plenty of stuff between complexity theory and wall-clock-time experiments. Basically, the whole field of analysis of algorithms.

      To give a concrete example, let's look a Quicksort.

      The complexity-theory approach is to write down the algorithm in very abstract terms, apply the Master theorem and say it has O(n log n) average-case time "complexity". (Hence proving something about the complexity of sorting.)

      The SE approach is to look at a very detailed, maybe even platform-specific implementation, measure running times in seconds, and say that their variant is 5% faster than another.

      The AofA approach would take a specific yet platform-independent (not necessarily model-independent) pseudo-code description, derive that it takes in expectation and asymptotically (I'm making numbers up) 2n log n - 2n +- O(log n) comparisons and 1.5n log n + 0.25n +- O(log n) swaps.
      I think it's obvious that this approach can give answers that neither of the other two approaches can.

      The bone I have to pick with complexity theory (as a field, meaning the collective of humans pursueing it) from the perspective of algorithms is that a) they focus on Landau bounds (usually for the worst case) and b) create "algorithms" which are never implemented. Which is completely fine for their purposes, but is not of much help when you are interested in ... actually doing stuff.

      That, and that they carry their mindset to algorithms classes.

      > Complexity absolutely can help in the design of
      > "(practically) fast algorithms"

      I have not seen that happening all too often, but maybe I have not been looking closely enough.


      FWIW, note that I am not saying that complexity theory is irrelevant or not a good thing to know something about for people working in/with algorithms. In fact, my opinion is quite the opposite. All I'm opposing is the view that complexity theory is all theory there is about algorithms.

    3. No false dichotomy here. You said "the public" is interested in X. I said they're not interested in X (or even X', which is even more "practical"), but rather Y. I also tried to explain why the public is interested in Y instead of X (or X'). I did not say "there's nothing between X and Y".

  5. To add to what Ryan wrote compare with physics. There is a huge founding for experiments that are unlikely to give benefits any time soon if ever like LHC. But understanding the world around us is a worth while endeavor for humanity. Many of "useful" things would not have happened if we followed the mentality proposed by Raphael. Does anyone believes that gravitational waves will have any useful applications in the short term? Of course not. Still it is in the popular news headlines and we don't see similar inaccuracy of popular CS articles in them. There is a lot for popular CS writers to learn from popular physics writers on how to create enthusiasm without dishonesty. The source of dishonesty and inaccuracy in CS articles is our failure to explain why what we do is worthwhile endeavor.

    Making dishonest claims by what science is doing is a bad, not just ethically but also because of dangerous of public starting to distrust scientists. We have enough of trust issues with the general public about science already, we should not add more just to get a little bit more funding. If you want more funding give a better big picture of what all these are about.

    While on the topic, Knuth gave a talk at Stanford about historians of computer science (available on Youtube) and many of his points also apply to popular science writers.

    1. Useful activity can be defined in the following way:

      1. An activity is directly useful, if it prevents people from dying in the short term or produces something other people find interesting in its own right.

      2. The activity is indirectly useful, if it helps in something that has already been established as useful.

      3. The activity is incidentally useful, if it encourages useful activities.

      Essentially, a research community does useful work, if it produces results people outside the community find interesting or useful. Most subfields of theoretical physics have a pretty good track record in this. After all, a significant part of the world can be seen as a pipeline taking ideas from physics, creating chains of increasingly applied results, and ultimately delivering something immediately useful.

      It can be an interesting thought experiment to determine to which extend this holds in TCS. Take your favorite subfield of TCS and try to determine, what kind of impact it has had to people outside the subfield / TCS / CS / science and technology in 5/10/20/40 years.

    2. Anonymous writes:

      > Many of "useful" things would not have happened if we
      > followed the mentality proposed by Raphael.

      I don't think I have proposed a mentality, and I certainly have not proposed to do away with complexity theory. Quite the opposite! I'm a big fan of basic research, and saddened by the lack of funding it receives. All I'm asking is the honesty to say, "we're doing basic research; we're fishing in the dark in many places; practical applications (for X) are far off, and may never surface".