Monday, November 29, 2010

Complexity as Friction

What advantages can we get from computational hardness? Cryptography and pseudo-random number generators come to mind. But perhaps the universe made computation difficult for a reason, not unlike friction.

In the physical world, friction usually costs us energy to overcome, but on the other hand we can't walk without it. In the computational world, complexity often slows progress, but if it didn't exist we would have many other problems.

We've seen a taste of this as computers have gotten faster, information has become more accessible, and algorithmic tools, particularly in learning and search, have improved.
  • Privacy: We can now store more information about each person and even the old information is much easier to access. Perhaps as important we can now learn more from less with much better algorithms. We're almost at the point where computers know more about us than we know about ourselves.
  • Markets: Two words: Flash Crash 
  • Airlines: We can easily find the cheapest airline flights and so airlines have put considerable effort into competing on this single dimension of price at the expense of the whole flying experience. This has also reduced competition where now airlines have cut capacity so they can keep prices higher.
  • Politics: Politicians act as a proxy for the people in a democracy, and we entrusted them to make difficult decisions because we didn't have the tools to participate in these processes ourselves. Now we have much more access, which gives politicians less flexibility, makes them focus on the short term instead of the bigger picture, and polarizes our politics.
These are just a few of many such stories. As computer scientists our goal is to make computation as efficient and simple as possible, but we must keep in mind the costs of reducing the friction too much.

Wednesday, November 24, 2010

Game Theory, Terrorism, Hardness and SAT Solving

Last week I attended the Army Research Office sponsored Workshop on Reasoning in Adversarial and Noncooperative Environments related to Topic #22 of the 2011 MURI. Most of the attendees were AI types many of whom look at games with incomplete information against opponents you can't trust. A group at USC talked about their work using game theory to deploy federal air marshals on planes and road-side checking at LAX. Game theory that might save lives.

I spoke on a panel on the complexity of computing equilibria about why we should turn the question around and use computational complexity to redefine equilibrium. Bart Selman, also on the panel, talked about his limited success in using SAT-solving heuristics for QBF. During the Q&A we got into an interesting debate on the importance of computational complexity in real-world scenarios. The word "irrelevant" came up to note that NP-completeness is too blunt a tool to capture hardness in practice. Bart didn't think average-case complexity was quite the right approach either; some other structural property of problems was needed.

Though we both care deeply about satisfiability, Bart and I don't hang out much in the same circles, and this workshop was the first time we really had a chance to chat. Despite the panel, Bart later showed quite an interest in computational complexity. I found myself explaining Ryan's new result to him, which gets circuit lower bounds from better circuit SAT algorithms.

The PCP theorem says it is as hard to approximate max-SAT as it is to solve it exactly. In practice, according to Bart, this separation makes SAT easier to solve heuristically. So Bart suggested one might use the PCP theorem as a method to preprocess a formula to improve current heuristics. The PCP theorem still has large constants in the size of the reduction, making a direct application impractical. But perhaps parts of the theorem, for example the use of good error-correcting codes, can help. That would turn the PCP theorem, normally used to show problems hard, into a tool for making SAT easier. Cool if it could be made to work.

Monday, November 22, 2010

Erdos Distance Problem SOLVED!

(All papers referred to in this post can be accessed from my website on the Erdos Distance Problem.)

In 1946 Erdos raised the following question: Given n points in the plane, how many distinct distances between points are you guaranteed? Let g(n) denote the number. How does g(n) grow? This problem is now called The Erdos Distance Problem.

A while back I created a website of all(?) of the papers on this problem. Today I happily added one more paper to it, which just about settles the problem. See also Tao's blog post on the recent result.

Julia Garibaldi, Alex Iosevich, and Steven Senger have a book due out in Jan 2011 entitled The Erdos Distance Problem. I proofread it for them so, without giving away the ending, I'll just say it is AWESOME.

I have seen it written that Erdos conjectured either g(n) ≥ n/√(log n) or g(n) ≥ n^{1-ε}. These papers reference Erdos's 1946 paper. No such conjecture is there. I suspect Erdos did state the conjecture in talks.

I give a brief history:

  1. Erdos (1946) showed O(n/√(log n)) ≥ g(n) ≥ Ω(n^{0.5}). These proofs are easy by today's standards.
  2. Leo Moser (1952) showed g(n) ≥ Ω(n^{0.66...}). (Actually n^{2/3}.)
  3. Chung (1984) showed g(n) ≥ Ω(n^{0.7143...}). (Actually n^{5/7}.)
  4. Chung, Szemeredi, Trotter (1992) showed g(n) ≥ Ω(n^{0.8}/log n).
  5. Szekely (1993) showed g(n) ≥ Ω(n^{0.8}).
  6. Solymosi and Toth (2004) showed g(n) ≥ Ω(n^{0.8571}). (Actually n^{6/7}.)
  7. On page 116 of Combinatorial Geometry and its Algorithmic Applications by Pach and Sharir there is a prediction about how far these techniques might go: A close inspection of the general Solymosi-Toth approach shows that, without any additional geometric idea, it can never lead to a lower bound greater than Ω(n^{8/9}). The exponent 8/9 is 0.888... .
  8. Gabor Tardos (2003) showed g(n) ≥ Ω(n^{0.8634...}). (Actually n^{(4e/(5e-1)) - ε}.)
  9. Nets Hawk Katz and Gabor Tardos (2004) showed g(n) ≥ Ω(n^{0.864...}). (Actually n^{((48-14e)/(55-16e)) - ε}.)
  10. (ADDED LATER) Elekes and Sharir (2010) have a paper that sets up a framework which Guth and Katz used to get their result.
  11. (THE NEW RESULT) Guth and Katz (2010) showed g(n) ≥ Ω(n/log n).
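The decimal exponents quoted for the Tardos and Katz-Tardos bounds can be sanity-checked with a quick calculation (my own arithmetic check, nothing more):

```python
import math

e = math.e

# Tardos (2003): exponent 4e/(5e-1)
tardos = 4 * e / (5 * e - 1)

# Katz-Tardos (2004): exponent (48-14e)/(55-16e)
katz_tardos = (48 - 14 * e) / (55 - 16 * e)

print(f"Tardos exponent:      {tardos:.4f}")       # roughly 0.8635
print(f"Katz-Tardos exponent: {katz_tardos:.4f}")  # roughly 0.8641

# Katz-Tardos is a (small) improvement over Tardos, and both sit
# below the 8/9 = 0.888... barrier predicted by Pach and Sharir.
assert tardos < katz_tardos < 8 / 9
```

Seeing the two exponents side by side makes clear just how incremental the progress was before Guth-Katz jumped all the way to n/log n.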
Some thoughts
  1. There is a big gap between 1952 and 1984. I'd be curious why this is.
  2. I have seen it written that Erdos conjectured either g(n) ≥ n/√(log n) or g(n) ≥ n^{1-ε}. These papers reference Erdos's 1946 paper. No such conjecture is there. I am sure that Erdos made the conjecture in talks.
  3. The paper by Szekely is a very good read and is elementary. Solymosi and Toth is still elementary but is getting harder. It is not clear when these papers cross the line from Could be presented to a bright high school student to Could not be.
  4. While there is a casual statement about needing new techniques this field did not go in the barriers-direction of formally proving certain techniques could not work.
  5. Did the new result use additional geometric ideas? Not sure yet- though I have emailed some people to ask.
  6. The first result by Erdos really is √n - O(1), not just Ω(√n). The rest are asymptotic.
  7. If there are 289 points in the plane then there are at least 17 distinct distances. Are there more? I would be curious what happens for actual numbers like this. Exact results are likely very hard; however, some improvement may be possible.
  8. Gabor Tardos is Eva Tardos's brother. Nets Hawk Katz is not related to Jon Katz (Cryptographer and Blogger). (Nets Hawk Katz is a great stage name!) I do not know if Leo Moser and Robin Moser are related.

Thursday, November 18, 2010

Eleven Tweets from GASARCH

I don't do tweets; I don't think I have enough interesting 140-char notes to tell people (though that doesn't stop Lance). However, several tweet-worthy notes did come up recently so I'll post them here. Hope they are of interest. Some are more than 140 characters. Oh well.
  1. Godel Prize nominations due soon.
  2. Knuth Prize nominations due soon.
  3. Would you rather win a Godel Prize or a Knuth Prize?
  4. Howard Karloff (no relation to Boris), Phil Gibbons, and Sergei Vassilvitskii are organizing a DIMACS workshop on Parallelism: A 2020 Vision on March 14-16, 2011. They will try to predict the future!
  5. This looks like its from THE ONION, but it is real.
  6. This looks like it is real, but its from THE ONION.
  7. NO progress on my 17x17 problem BUT someone was inspired by it to make a game out of grid colorings. See Brian Hayes's post about it.
  8. UPDATE: 17x17 was solved. See the Feb 8, 2012 post. Also see my paper on grid coloring on arXiv.
  9. My colleague Hal Daume has a Blog on NLP (Nat Lang Processing). One of his recent entries is of interest to all academics so I link to it here.
  10. Lance is impressed with Ryan Williams's result, as am I. However, Lance also has some Family Pride in seeing his Uncle Ryan get the result. (I'll let you figure out what that means.)
  11. My brush with fame: I know this guy. He is Phil Resnik: (1) he does NLP in the Linguistics Dept at UMCP, and (2) I graded his papers in Automata Theory when he was an ugrad and I was a grad student (at Harvard).
  12. Private Information Retrieval: A Dilbert Strip and a Wikipedia Entry.
ADDED LATER SINCE IT WAS SENT TO ME LATER: Midwest Theory Day, Local Arrangements.

Monday, November 15, 2010

Is Ryan's Theorem Only Interesting to Complexity Theorists?

Last week, the complexity theory bloggers Dick, Scott, Luca and myself all heaped praise on Ryan Williams' new circuit lower bound, NEXP not in ACC0. Outside the computational complexity community the reaction has been something like "Wow, those complexity theorists are excited by Ryan's paper." So let me try to explain why we are excited by the result and perhaps why you should be too.

In the 1980's complexity theorists started looking at circuit complexity. Circuits are AND, OR and NOT gates that feed into each other (no cycles) with input bits on the bottom and an output bit at the top. Every function mapping n bits to one bit can be expressed as a circuit, but may require an exponential (2^{O(n)}) number of gates. Every polynomial-time computable function can be expressed by a circuit with a polynomial number of gates. If one could show some problem in NP cannot be computed by polynomial-size circuits then one would have proven P ≠ NP.

Furst, Saxe and Sipser made the first breakthrough in 1981. They showed a simple function (parity) cannot be computed by polynomial-size constant-depth circuits, where depth is the length of the longest chain of gates from the input bits to the output. Yao and Hastad would soon show that a constant-depth circuit needs an exponential number of gates to compute parity.
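To get a feel for why constant depth is restrictive, consider depth two: in a DNF (an OR of ANDs) computing parity, every term must fix all n variables, because dropping a variable from a term would make it accept two inputs that differ in one bit and so have opposite parities. Hence the terms correspond one-to-one with the odd-parity inputs, of which there are 2^{n-1}. A brute-force check of that count (my own illustration; the FSS/Yao/Hastad proofs handle all constant depths, not just depth two):

```python
from itertools import product

def parity(bits):
    return sum(bits) % 2

def dnf_terms_for_parity(n):
    """Count the odd-parity inputs on n bits. Since any DNF for parity
    needs one full minterm per odd-parity input (see the argument in the
    text above), this is exactly the minimal DNF size."""
    return sum(1 for bits in product([0, 1], repeat=n) if parity(bits) == 1)

# The count doubles with each extra input bit: 2^(n-1) terms.
for n in range(1, 11):
    assert dnf_terms_for_parity(n) == 2 ** (n - 1)
print("A DNF for parity on n bits needs 2^(n-1) terms: checked for n = 1..10")
```

Of course this only shows the blow-up at depth two; the hard part of the 1980's results was pushing exponential lower bounds to every constant depth.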

I started graduate school in 1985 with Mike Sipser as my advisor. During that first year we saw Hastad's paper and then a new result by a Russian, Alexander Razborov. Razborov showed that no polynomial-size monotone circuit (just AND and OR gates, no NOTs) can compute the clique function. At this point Sipser was sure a proof of P ≠ NP using circuit complexity was just around the corner.

In 1987, Razborov and Smolensky proved a nice extension of the constant-depth bounds. They showed that one cannot compute the parity function with constant-depth polynomial-size circuits even if we augment them with mod3 gates. A mod3 gate outputs 1 if the number of its inputs set to 1 is not a multiple of 3. More generally they showed that for any primes p ≠ q, one cannot compute the modp function with ANDs, ORs, NOTs and modq gates. Their proof breaks down if q is not a power of a prime.
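In code, a mod_m gate is just a predicate on how many of its inputs are 1 (a definitional sketch; the function name is mine):

```python
def mod_gate(m, inputs):
    """A mod_m gate: outputs 1 iff the number of 1-inputs
    is NOT a multiple of m."""
    return 0 if sum(inputs) % m == 0 else 1

# A mod2 gate is exactly parity: 1 iff an odd number of inputs are 1.
assert mod_gate(2, [1, 0, 1, 1]) == 1  # three ones: odd
assert mod_gate(3, [1, 1, 0, 0]) == 1  # two ones: not a multiple of 3
assert mod_gate(3, [1, 1, 1, 0]) == 0  # three ones: a multiple of 3
```

So the Razborov-Smolensky theorem says, for instance, that no constant-depth polynomial-size circuit of AND, OR, NOT and mod3 gates computes the mod2 (parity) function.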

That's six major circuit results in six years. We had every expectation that the results would keep coming. But nothing in 1988. Nothing in 1989. Nothing in the 90's. Nothing in the first decade of the 2000's.

Nothing is a bit strong: there were several nice lower bounds for uniform and restricted models, and some new characterizations of circuit classes. We also saw arguments that further progress would be difficult: Razborov showed that the techniques from his clique result break down completely with NOT gates and, of course, there were natural proofs.

But for 23 years we had no progress on general circuits beyond Razborov-Smolensky. Forget NP, we couldn't even show there were problems in NEXP (problems verifiable in exponential time) that didn't have polynomial-size constant-depth circuits with only Mod6 gates. (Nothing special about 6; the same holds for any number that is a multiple of at least two distinct primes.)

That is until last week when Ryan showed that NEXP cannot have polynomial-size constant-depth circuits with AND, OR, NOT and Modm gates for any m, prime or not.

And that's why we are so excited about this paper.

Thursday, November 11, 2010

A Problem with Wikipedia and the Next Best Thing to Winning a Godel Prize

(There are TWO theory day events in NY this semester: Nov 11, 2010 and Nov 12, 2010. The first one has likely already started as you read this.)

I searched for myself on Wikipedia and I found out something scary. Not about me. About Wikipedia.
  1. One of the hits was to the page Godel Prize Winners. Since I don't recall winning a Godel prize this surprised me. The links to the Godel-prize winning papers Proof Verification and the Hardness of Approximation Problems by Arora, Lund, Motwani, Sudan, Szegedy and Probabilistic checking of proofs: a new characterization of NP by Arora and Safra both point to the copies of these papers on MY PCP website. This is truly frightening. I plan to keep the link up (I suppose now I have to); however, Wikipedia has such a problem with Link Rot that they even have an entry on it! For now. Why did they use my link for those papers? Was it the only copy of the papers that was free online? Doubtful- the authors surely have copies on their websites. My PCP website is NOT well known: I whipped it up for a seminar. I never intended it to be stable or referenced by anyone else. (NOTE to Ryan Williams- I'll be running a seminar on circuit lower bounds next semester. Please put your recent AWESOME paper someplace free and easy to find so that they point to your copy, not mine, when you win a Godel Prize.)
  2. One of the hits was on the page Recursively inseparable sets. Two sets A and B are rec. insep. if they are disjoint and there is no decidable set C that contains A and is disjoint from B. Here is a quote from the entry:
    Let φ be the standard indexing of the partial computable functions. Then the sets {e | φ_e(0) converges and equals 0} and {e | φ_e(0) converges and equals 1} are recursively inseparable (Gasarch 1998, p. 1047).
    This is NOT my result- I think it's folklore. They reference my Survey of Recursive Combinatorics which states the result without proof or reference (my bad). They don't even point to the (free online) paper-they point to the math review of it. (If you know who proved that result then let me know. I plan on changing this entry by correcting the mis-attribution, giving a correct attribution if I have one (else attribute folklore), and including a proof. If you don't know who proved it then you can always look it up on Wikipedia. Oh. Don't!)
  3. One of the hits was on the page The Coppersmith-Winograd algorithm for matrix multiplication in O(n^{2.376}) time. The link to their paper points to my applications of Ramsey Theory website (the algorithm uses 3-free sets, which is a concept from Ramsey Theory). This pointer makes some sense since my applications-of-Ramsey website is well known and it seems to be the only free online copy. Even so, it seems fragile.
  4. One of the hits was on the page Graham's Number. The link to the paper Ramsey's Theorem for n-Parameter Sets by Graham and Rothschild points to my van der Waerden's Theorem website (not to be confused with my applications of Ramsey Theory website). This is very odd since (1) there is a well known website of Ronald Graham's papers and (2) my vdw website is NOT well known.
I will at some point FIX some of these entries.

The above points to both the strength and weakness of Wikipedia. On the one hand, some of these links are fragile and there are likely better ones out there. On the other hand, for some of them they found a link that worked and went with it. If they had waited around for a better or less fragile link then the entry would not have been made.

One lesson: If you find something online that you want an ecopy of, download it! It may not be there next time. (I made the same point here.)

Wednesday, November 10, 2010

Dr. Scott Goes to Washington

Fellow blogger Scott Aaronson was chosen as one of 19 NSF PECASE awardees and wins a trip to the White House. Congrats to Scott! He received the award "for pushing the frontiers of quantum complexity theory, and for making the insights of theoretical computer science accessible to the general public."

For all those of you who wonder how having a blog could affect your academic career, Scott won the PECASE not despite his having a blog but because of it.

Tuesday, November 09, 2010

A Breakthrough Circuit Lower Bound

On October 8th, in my graduate complexity course, as we discussed the 1987 Razborov-Smolensky result that one can't compute the Modp function by constant-depth polynomial-size circuits with AND, OR, NOT and Modq gates for primes p ≠ q, I noted the limits of our knowledge:
We can't even prove that NEXP (the exponential time version of NP) can't be solved by constant-depth polynomial-size circuits with just Mod6 gates, even without AND and ORs.
Now we can. And by "we" I mean Ryan Williams, who just yesterday posted his proof that NEXP cannot be solved in the class ACC0, the class of constant-depth poly-size circuits with AND, OR, NOT and Modm gates for any fixed m. For the larger class EXP^NP, Ryan gets exponential lower bounds. Dick, Scott and Luca already heaped their praise on these new results. This is perhaps the most interesting computational complexity result since Reingold and the best progress in circuit lower bounds in nearly a quarter century.

It's not so much the result itself (in class I'll now just say "We don't even know how to prove that NEXP isn't in TC0 (poly-size constant-depth circuits with majority gates)") but the demonstration that the proof techniques described in Ryan's STOC paper can be used to get new lower bounds, really opening the door to the first fresh approach to circuits in a long time. This approach converts weak algorithms for solving circuit satisfiability questions into circuit lower bounds. Ryan's proof doesn't use deep mathematical techniques but rather puts together a series of known tools in amazingly clever ways.

Ryan breaks through the natural proofs barrier in an interesting way. We didn't know whether there are one-way functions computable by ACC0 circuits, so it wasn't clear whether natural proofs applied to lower bounds for ACC0. Ryan's paper doesn't break these one-way functions; rather he avoids the issue by using diagonalization, and so his proof does not fulfill the constructivity requirement of natural proofs. Modifying his proof to make it "natural" would imply showing there aren't ACC0 one-way functions, but that's still open. So there is no inherent reason Ryan's techniques couldn't work on TC0 or larger classes without violating natural proofs.

We're still a long long long way from showing NP does not have arbitrary polynomial-size circuits and P ≠ NP but Ryan has taken the first real baby step in decades.

Monday, November 08, 2010

The FOCS Experience

You can now watch videos of the recent FOCS talks online. Glencora, who I had the pleasure of meeting at the conference, has her favorites. If you watch any, check out the talk of Dan Spielman for no reason other than to see how a talk should be given.

I don't have other recommendations for I go to very few talks at STOC and FOCS. Why should I when the talks are online? 

I expect most of you, whether or not you attended the meeting, will watch few to none of the videos. If you do it will be on in the background while you check email or are busy with other things. You would feel guilty spending a day just watching videos. If you went to the conference you might feel guilty not going to the talks. Though some people spend the time during the talks checking email or being busy with other things.

The transportation revolution before I was born made it possible for everyone to meet up at multiple conferences every year. The communication revolution since I was born has brought us to the point where there is no more reason to go. We can replicate the activities of a conference from our desktops, a point Bill brought up last week. Many of you go just because you are supposed to give a talk, but you could just record the talk back home.

Going to a conference is an excuse to give the conference our full attention, a justification for missing meetings and office hours and for getting others to cover our classes. We pay many dollars for hotel, registration and airfare not because it allows us to attend the conference but because it doesn't allow us not to.

Thursday, November 04, 2010

A non rigorous proof technique

As a grad student (1980's) I found a book in the library that claimed to prove Fermat's Last Theorem using elementary methods. (This was before Wiles proved it using advanced methods.) The book claimed that its proof was likely what Fermat had in mind. I KNEW that it was wrong. Why? Because
If FLT had really been solved then I would know that. (Note- I am not bragging about how much math I knew; I am bragging about how much math I knew of.)
Is that a rigorous proof technique? No, but it is useful and often correct.

(All of the math that I refer to below is done rigorously in my writeup here.) As a Grad student I noticed the following:
  1. Let C(x) be the length of the shortest description of x (the usual Kolmogorov complexity).
  2. It is easy to show that C ≤_T HALT.
  3. The only proof I had seen that C was not computable was by contradiction: assume C is computable and then use that to create a string x that had no short description along with a short description of x. This proof did not use HALT at all! It was the only proof I knew of that a set was undecidable that did not use HALT. (Are there others?)
  4. Hence, from what I knew, it was possible that C was of intermediary Turing degree- neither decidable nor Turing-complete. This would be a natural intermediary Turing degree!! (As opposed to sets that are constructed for the sole purpose of being of intermediary Turing degree.)
But- I knew this could not be the case. How?
If there was a natural intermediary Turing degree then I would know that. (Note- I am not bragging about how much computability theory I knew; I am bragging about how much computability theory I knew of.)
Was my intuition correct? Unfortunately yes- one can show that HALT ≤_T C. Too bad- I would have preferred to be wrong and have seen a natural intermediary Turing degree. (The writeup I pointed to above has the classic proof that C is undecidable, the proof that C ≤_T HALT and the proof that HALT ≤_T C. The proof of the latter I got from Peter Gacs.) (NOTE- I had it as Gac but a commenter pointed out that it is Gacs.)
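The classic contradiction argument from item 3 can be phrased as a program. Below is a runnable sketch in which I substitute a computable stand-in for C (zlib-compressed length, emphatically NOT true Kolmogorov complexity, just a proxy so the code runs). With the real, uncomputable C in place of the proxy, the short function below would itself be a description of size O(log n) of a string it certifies has no description of size n, which is the contradiction:

```python
import zlib
from itertools import product

def proxy_C(x: bytes) -> int:
    """Stand-in for Kolmogorov complexity: length of a zlib compression.
    Only an upper-bound-style proxy; the real C is uncomputable."""
    return len(zlib.compress(x))

def first_incompressible(n: int) -> bytes:
    """Return the first string (in length-lexicographic order) with
    proxy_C(x) > n. If proxy_C were the real C, this short program
    would be a short description of a string with no short description,
    a contradiction for large n."""
    length = 0
    while True:
        for tup in product(range(256), repeat=length):
            x = bytes(tup)
            if proxy_C(x) > n:
                return x
        length += 1

x = first_incompressible(9)
assert proxy_C(x) > 9  # true by construction; no paradox, since
                       # proxy_C is not a universal description system
```

The reason no paradox actually occurs here is exactly the point of the proof: a computable proxy like zlib cannot be a universal description system, because the search program above would otherwise out-describe it.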

Tuesday, November 02, 2010

The revolution will be DVRed

(There are TWO theory day events in NY this semester: Thu Nov 11. and Fri Nov 12.)

BILL: Will you be going to the RALLY TO RESTORE SANITY AND/OR FEAR? (See also here.) It will be the event that defines this generation!!!! Just look at all of these pictures that will be taken there! Here are some pictures by Evan Golub who went, took pictures, then went home early to actually watch some of it. And here is a picture that is more suited to the readers of this blog.

DONNA: I'll DVR it.

Is this the wave of the future?
  1. Even though watching a sports event on TV gives a better view and reasonably priced food (as opposed to the stadium which gives a terrible view and has overpriced food-like substances) most people say that being AT the game has a certain je ne sais quoi. At least people who think they know French say that. Will technology get SO good that being at home is preferred? Give the viewer the ability to pick what parts of the field he wants to see? Give him the ability to rewind in real time (already can via DVR)? 3-D? (not for a while). They might pay us to go to the game since an empty stadium looks bad for TV.
  2. Museums: Why go to the Louvre when you can do a virtual reality tour of it which saves you the cost of a trip to France? I think this is possible with today's technology. If not, then certainly soon. There already is a partial virtual tour.
  3. The rally was really crowded and I couldn't see much. Even so, being there had a certain I-know-not-what which made it worth going. However, I'm glad I DVRed it so I can see what really happened. When I tell my great nieces that I was there at the dawn of a new age it will be partially fiction since I'll tell them of what I saw on DVR. Or I'll just show them the recording. You can also currently see it here.
  4. Movie Theaters are facing a problem with people waiting for the DVD to come out rather than go to the theater. For families the economics are insane: buying the movie on DVD (and thus OWNING it) costs less than taking the family to see it once. COUNTERPOINTS: (1) Some people want to get out of the house. (2) Lance told me that I really should see AVATAR in a theater.
  5. With conference talks online will people still go to conferences? (This was already discussed here but it fits today's theme also.)

Will technology make it easier and easier to stay home? I think yes. When that happens you may have some surprises: some rally has far fewer people than anticipated because people worry it will be too crowded, but then for the next rally people think it won't be that crowded so it gets more people than anticipated. Rather than count how many people were actually there you might calculate some weighted sum of how many were there, downloaded it, DVRed it, signed up on the Facebook page for it, etc.

Monday, November 01, 2010

By Any Other Name Would Be Just As Hard

In 1973 Donald Knuth searched for a name for the hardest problems in NP. Steve Cook didn't give a name in his paper and Karp called them P-complete. Knuth suggested three names "Herculean", "Formidable" and "Arduous". He sent out a ballot to people in the theory community and also allowed write-in votes.

He put the results in a SIGACT News article, a fascinating and hilarious read. Of course Knuth's suggestions didn't fly and he got many quite inventive counter-proposals. My favorite: Hard-Ass Problems (Hard As Satisfiability).

The winning suggestion came from a group at Bell Labs (likely including one of their new hires, David Johnson): the terms you readers should all know as NP-hard and NP-complete. These names did have the advantage that they generalize to other notions of hardness (like PSPACE-complete).

Knuth muses at the end
NP-hard problems hits lots of people, and that's why I began searching for a special term. In other words, I don't consider it a major goal to invent descriptive terminology for every conceivably interesting question of the type considered here; the major goal is to have a good term to use for the masses, in the one case which experience shows is almost omnipresent. To say NP-hard actually smacks of being a little too technical for a mass audience, but it's not so bad as to be unusable. 
The importance of the P versus NP problem has reached well beyond the theory community despite its name, but I wish Knuth had been more successful in finding a descriptive term. P, NP, NP-hard and NP-complete are technical names that tend to isolate us and confuse others.