Friday, June 29, 2012

Instance Compression

Some announcements: FOCS Accepts, ITCS Call and ToC Special Issue for Rajeev Motwani

I'd like to highlight one of the new FOCS papers "New Limits to Classical and Quantum Instance Compression" by MIT student Andrew Drucker which solves a problem that has dogged me for a few years. Ignore the "Quantum" in the title, that's not the interesting part.

Consider the OR-SAT problem, given m formulas each of size n, is at least one of them satisfiable. In STOC 2008, Rahul Santhanam and I showed that if there is a reduction from OR-SAT to any language L with the property that the reduction reduces to instances of size polynomial in n (independent of m) then the polynomial-time hierarchy collapses. The question arose from parameterized complexity though it also has some applications to cryptography and PCPs.

Our techniques failed to get a similar result for AND-SAT, where we want to know if all of the formulas are satisfiable. Drucker settled the question, showing that a compression of AND-SAT (reducing to instances of size polynomial in n) also gives a similar collapse of the hierarchy. What impressed me the most was the ingenuity Drucker uses in his proof.

Drucker derives the following lemma about probability distributions based on a strong version of Fano's inequality. Let A be a set of strings, D a distribution over A and f any function that maps Am to {0,1}m-2. If you pick a y from D, with high probability there is an index i, such that the distributions f(Dm) and f(Dm) with the ith location replaced by y are statistically close.

Drucker combines this lemma with a minimax argument and a result of Sahai and Vadhan that characterizes Statistical Zero-Knowledge as determining whether two distributions are different from the circuits that generate them.

Sometimes people solve your open questions with a simple argument that makes you hit your head for not thinking of it yourself. Other times you just sit back amazed at the solution knowing there was no hope for you to find it. Drucker's paper definitely falls in that second category.

1 comment:

  1. I've written some revised/expanded notes responding to Lance's post and describing/motivating my work, which you can find here in pdf:

    Main points:
    1) I'm grateful for Lance's kind words; OTOH, contra Lance, the quantum part of the paper is also interesting!

    2) The lemma on stability of distributions is not hard; also, there are similar earlier results from which it can be derived (as colleagues helped me understand).

    3) The use of SZK is not needed for my main result on hardness of compression for an AND of SAT instances. I do use properties of SZK to get improved hardness of (probabilistic) compression for OR-SAT, however.

    4) Since Lance posted, I found a proof of my "basic" result on AND-SAT that avoids using any information theory terms or results, and also avoids the minimax theorem. This modified proof is based on my original (though weaker quantitatively), but it also bears an interesting resemblance to Fortnow and Santhanam's work on OR-SAT.

    5) For readers of previous drafts: in my general result, I've also significantly simplified the way I handle the case of compression reductions with "large output size", i.e., output length O(t log t) for an input AND of t length-n SAT instances. I now use a unified proof that works for this and the "small-output case."