Thursday, August 14, 2008

AI Follows Theory

In one of the 2006 AAAI Outstanding Paper Award winners, Model Counting: A New Strategy for Obtaining Good Bounds, Gomes, Sabharwal, and Selman show how to approximately count the number of satisfying assignments of a Boolean formula using a SAT solver. They add random parity constraints à la Valiant-Vazirani and estimate the number of solutions from the number of constraints needed to make the formula unsatisfiable.
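The idea can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: a brute-force satisfiability check stands in for a real SAT solver, the formula is just a Python predicate over 0/1 assignments, and full-length random parities are used.

```python
import itertools
import random

def brute_force_sat(formula, n):
    # Stand-in for a real SAT solver: True iff some 0/1 assignment
    # over n variables satisfies `formula`. Illustrative only.
    return any(formula(a) for a in itertools.product((0, 1), repeat=n))

def estimate_count(formula, n, trials=30):
    # Sketch of parity-constraint counting: conjoin random XOR
    # constraints until the formula becomes unsatisfiable. If it
    # survived k constraints, the solution count is roughly 2**k,
    # since each random parity halves the solution set in
    # expectation (Valiant-Vazirani). Assumes `formula` is satisfiable.
    estimates = []
    for _ in range(trials):
        constraints = []
        while len(constraints) <= n:  # at most 2**n assignments to cut
            s = [random.randrange(2) for _ in range(n)]  # random variable subset
            b = random.randrange(2)                      # target parity bit
            constraints.append((s, b))

            def constrained(a, cs=tuple(constraints)):
                return bool(formula(a)) and all(
                    sum(ai & si for ai, si in zip(a, s_)) % 2 == b_
                    for s_, b_ in cs)

            if not brute_force_sat(constrained, n):
                break
        # The formula survived len(constraints) - 1 constraints.
        estimates.append(2 ** (len(constraints) - 1))
    estimates.sort()
    return estimates[len(estimates) // 2]  # median over trials
```

On a formula with exactly 8 satisfying assignments (say, x0 = 1 over 4 variables), the median estimate should land within a small constant factor of 8.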

Sounds like a neat idea that complexity theorists should have come up with in the '80s. And we did, where by "we" I mean Larry Stockmeyer in his 1985 SICOMP paper On Approximation Algorithms for #P. Stockmeyer uses random hash functions from Sipser, but it is essentially the same algorithm and analysis as Gomes et al.

Stockmeyer wrote a purely theoretical paper. Since then we have developed better algorithms and much faster computers, and one can now solve satisfiability on non-trivial input lengths. Gomes et al. present their paper as a practical technique for approximating the number of solutions; they even implement their algorithm and compare it with programs that count the number of solutions exactly.

Gomes et al. mention Toda's paper on the hardness of exact counting but don't cite Stockmeyer or any other paper in the vast theory literature on approximate counting.

We complexity theorists know many other nifty things one can do with an NP oracle such as uniform sampling and learning circuits. Read them now so you don't have to reinvent them later.


  1. I'd say if we theorists want credit for our work (and want to avoid losing credit to reinvention), then we should disseminate our work the way the AI folks do. Most of the links to the theory work in your article are behind locked doors, which in this age of convenience may as well not exist. I can easily find, read, and cite the AAAI work; I am not entirely happy citing locked-up work (even if I personally happen to have a copy). I have a medium-sized rant on this issue at: .

  2. Although that link to the Stockmeyer paper costs money, a scan is available online for free:

  3. There's a larger point here, which is that the existence of good SAT solvers means that the boundaries of "feasible" computation have shifted to look a little more like P^NP than P. Except, of course, these solvers don't work on arbitrary-length inputs, and they may have specific instance families on which they fail. Understanding and characterizing this shift, and its limitations, feels like a theory problem.

    For example, Richard Beigel's work on bounded queries might be a way to model a computer with a SAT solver (assuming the SAT solver is treated only as a "black box," which matches how I see them used most of the time).

  4. Regarding John's comment, as a physicist I definitely agree. When I want to read a CS paper related to my work, it is often unavailable. When will theoretical computer scientists start actually using computers? That is, when will they create something like the arXiv?

  5. Gomes, Sabharwal and Selman give a method only for lower-bounding the number of solutions, but they do go on to demonstrate its practical viability on some reasonably interesting instances. Their method adapts the Valiant-Vazirani idea of halving the number of solutions in expectation by conjoining with a random parity function. However, for efficiency reasons they use parities of small length, and for these they show that the halving still holds as a one-sided estimate.

    Presumably no one has yet shown that a known reduction from approximate counting to NP works as is in conjunction with an existing SAT solver to solve some interesting instance of a problem.

  6. John Mount's point about PDFs hidden behind locked doors is valid, but it's a subject for a separate rant. The fact is that the AI researchers didn't do their homework and find Stockmeyer's seminal paper. No excuses for that. A simple Google Scholar search for "approximate counting" brings up Stockmeyer's paper on the first page of results. The same idea has been (over)used for counting distinct elements in data-stream algorithms (several rehashes of the same idea). What's disappointing is that one of the authors, Sabharwal, has a PhD in theory, having worked with Paul Beame at UW. What's extremely disappointing is that the conference reviewers didn't notice the rediscovery of the Sipser/Valiant-Vazirani/Stockmeyer idea.

    On the other hand, the hallmark of a great idea is that several people rediscover it, so we may forgive the authors of the AAAI paper.

  7. While there isn't much excuse for missing the Stockmeyer reference (the Valiant-Vazirani reference is there), one thing not to miss is that the algorithm they succeeded with does not fit the Stockmeyer analysis (and that algorithm would have failed), because full-length random parity constraints are too expensive to evaluate at every node of a backtracking SAT solver. The key to the effectiveness of the algorithm in the paper is that they get provable bounds with randomly chosen SMALL parities, which are constraints that are cheap to evaluate repeatedly. The paper proves the new theoretical result that this is enough to get provable lower bounds on the number of solutions.

    The bottom line should not be "AI people should give theory people appropriate credit" but rather that "powerful theoretical ideas may have powerful applications if one understands the applications well enough to suitably modify the theoretical ideas."
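The distinction the comments above draw, between full-length and short parity constraints, is easy to show concretely. The sketch below is illustrative only (the function names are mine, not from the paper): a full-length parity includes each variable with probability 1/2, while a short parity fixes a small number of variables, making it cheap to re-evaluate at every node of a backtracking search.

```python
import random

def random_parity(n, length=None):
    # Draw a random XOR constraint over n Boolean variables.
    # length=None: each variable joins independently with probability
    # 1/2 (the classic full-length Valiant-Vazirani parity).
    # Small `length`: pick exactly that many variables uniformly, the
    # cheap-to-evaluate short parities discussed in the comments above.
    # Returns (list of variable indices, target parity bit).
    if length is None:
        vars_ = [i for i in range(n) if random.randrange(2)]
    else:
        vars_ = random.sample(range(n), length)
    return vars_, random.randrange(2)

def satisfies(assignment, parity):
    # True iff the XOR of the selected variables equals the target bit;
    # cost is proportional to the parity's length, not to n.
    vars_, bit = parity
    return sum(assignment[i] for i in vars_) % 2 == bit
```

A full-length parity touches about n/2 variables on average, so checking it at every node of a solver's search tree is expensive; a length-3 parity costs a constant number of lookups regardless of n.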