Monday, October 10, 2005

Favorite Theorems: Logical Characterization of NP

September Edition

Usually we think of the class NP as either languages accepted in polynomial-time by a nondeterministic Turing machine or languages with polynomial-time verifiable witnesses. Ronald Fagin gives a characterization of NP based on logic without references to Turing machines or polynomial time.

Ronald Fagin, Generalized first-order spectra and polynomial-time recognizable sets. Complexity of Computation, ed. R. Karp, SIAM-AMS Proceedings 7, 1974, pp. 43-73.

In this paper Fagin shows that NP consists of exactly the languages expressible with existential second-order formulas. For example consider a graph G described by an edge relation E(i,j) and we can define whether G is k-colorable by

∃C ∀i,j (1≤C(i)≤k ∧ (E(i,j)→C(i)≠C(j)))

With some more work you can use binary predicates. In general every language in NP has an existential second-order characterization with binary predicates and a universal first-order part.

Stockmeyer generalizes Fagin's result to characterize the polynomial-time hierarchy with second-order formulas.

Fagin's result started the area of descriptive complexity that characterized many common complexity classes in various logics and has connections to the complexity of database queries. Neil Immerman's work in descriptive complexity led him to his proof that nondeterministic space is closed under complement. Robert Szelecpsényi independently came up with a similar proof through a different approach.

Papadimitriou and Yannakakis use Fagin's result to characterize the class MAX-SNP of optimization problems. One of the first corollaries of the PCP Theorem is to show the MAX-SNP hard problems cannot be approximated within an arbitrary constant unless P=NP. In fact the concept of probabilistically checkable proof itself originally comes from a second-order view of NP that originated from Fagin's paper.

Update 10/12: Fagin adds an addendum.

Thanks to Lance and Siva for the kind words about my theorem. Let me clarify the story on the arity of the existentially quantified relations.

An existential second-order formula about, say, graphs, is a formula of the form

∃ Q1 ... ∃ Qk S( E, Q1, ..., Qk)

where E represents the edge relation, Q1, ..., Qk are existentially quantified predicates of arbitrary arity, and S(E, Q1, ..., Qk) is a first-order formula that involves only E, Q1, ..., Qk. As an example, 3-colorability of the graph can be expressed by an existential second-order formula

∃ Q1 ∃ Q2 ∃ Q3 S(E, Q1, Q2, Q3),

where Q1, Q2, and Q3 are unary predicates (that represent the 3 colors), and S(E, Q1, Q2, Q3) is a first-order formula that says "Each point has exactly one color, and no two points with the same color are connected by an edge''.

In the case of graphs, my theorem says that if T is a class of graphs that is closed under isomorphism (that is, whenever T contains a graph G, then it contains every graph isomorphic to G) , then T is in NP if and only if T is defined by an existential second-order formula. In the case of graphs, it is an open problem as to whether we really need to allow existentially quantified predicates of arbitrary arity. On the one hand, it is conceivable that there are NP properties of graphs that require existentially quantified predicates of arbitrarily large arity. On the other hand, it is conceivable that we can capture every NP property of graphs by allowing only a single existentially quantified binary predicate.

If we consider NP properties not just of graphs, but of arbitrary structures (such as structures with, say, two ternary relations and five 7-ary relations), then the characterization of NP in my theorem continues to hold, but in this case, it is known that existentially quantified binary predicates do not suffice. In particular, Ajtai proved (in the same amazingly rich 1983 paper [Σ11-formulae on finite structures, Annals of Pure and Applied Logic 24, 1983, pp. 1-48] where, among other things, he proved the Furst-Saxe-Sipser theorem independently of Furst, Saxe and Sipser), that if we consider structures consisting of a single m-ary relation, then the statement "The number of m-tuples in the relation is even" cannot be captured in existential second-order logic with any number of existentially quantified predicates of arity less than m.

A gentle introduction to the work I've mentioned in this note and to some other work in finite model theory appears in my survey paper Finite model theory-a personal perspective.


  1. Two comments:

    (1) As far as I know, it is an open question whether every NP langauge has an existential second-order characterization with binary predicates only. See, for example, the talk given by Ron Fagin at a BATS meeting .

    In fact, it is conceivable AFAIK that every NP language has a second-order characterization with just one binary quantifier.

    (2) To me, Fagin's theorem is perhaps the best argument we have for why NP is such an important class (aside from the empirical ones). NP is about searching for "solutions" to "problems", and as humans, we are quite limited in our ability to express what constitutes a "solution". The description "solutions that can be verified by a polynomial-time algorithm" isn't even close to what I believe to be capable of human expressibility. On the other hand, in virtually every real example, the "verifiability" of a purported solution can be distilled to a simple first-order sentence (e.g., if you want to produce a time-table for your university, you're looking for a relation (class, professor, room, timeslot) that satisfies natural constraints like "no professor can be present in two classrooms at same timeslot", "no room can hold two classes as the same time", etc.).

    A nice counterexample to my claim above is factoring: I can't think of a simple first-order sentence that will tell us if N = p * q. With the logical structure most natural for the problem (integers, bits, etc.), it is likely that this is provably impossible (by reduction from PARITY). Yet factoring is in NP.

    Btw, Fagin's theorem, as stated by Lance, might be incorrect; this business is a bit picky about the structure (graphs, numbers, etc.) underlying the logic you're talking about. (I am sure Ron will point out the subtleties very soon :-)

  2. Apologies for the bad URL for the talk by Ron Fagin. It is

  3. Siva, thanks for your helpful comments. Lance has kindly posted my response to your question as an addendum.

  4. The alert reader might note that I left out a piece of the story in my addendum to Lance's column. The class of NP properties that can be defined using only unary existentially quantified predicates is called "monadic NP". In the example I gave of 3-colorability, only unary existentially quantified predicates are used, and so 3-colorability is in monadic NP. Is every NP graph property in monadic NP? The answer is no: in a 1975 paper, I showed that graph connectivity is not in monadic NP. In a 1995 paper with Stockmeyer and Vardi, we gave a simplified proof of this result. This result is especially intriguing, since it is easy to show that the complementary class (graph nonconnectivity) is in monadic NP. Hence, monadic NP is not closed under complement.

  5. "Fagin's theorem, as stated by Lance, might be incorrect; this business is a bit picky about the structure (graphs, numbers, etc.) underlying the logic you're talking about."

    With a bit or work, I believe Fagin's theorem could be reformulated in categorial terms, thus eliminating the problem you're pointing out.