Friday, January 24, 2003

Vacation

Next week I will go away on vacation. When I go on a real vacation I go cold turkey on the Internet. No new posts or responses to comments or email until I return.

Have a great week everyone and I will see you in February.

An Efficient Algorithm for Testing whether a Graph is Perfect

Last year we saw the resolution of the Strong Perfect Graph Conjecture, a major result in graph theory. The conjecture stated that a graph is perfect if and only if it does not contain an induced subgraph H such that H or its complement is an odd cycle of length at least five. Chudnovsky, Robertson, Seymour, and Thomas proved the conjecture (now called the Strong Perfect Graph Theorem) last spring. Vašek Chvátal has a web page describing this result.

Why am I mentioning this result here? The problem of testing whether a graph is perfect in polynomial-time remained open even after this theorem as neither characterization gives an obvious algorithm. I just saw the the abstract of a talk that Paul Seymour will give at the Institute of Advanced Study next week. There he claims he and Chudnovsky have found a polynomial-time algorithm for testing whether a graph is perfect. I cannot find more about the algorithm on the web and I will miss the talk at the institute. I will post more information about this exciting new development when I have more details.

Thursday, January 23, 2003

The Lambda Calculus, Part 1

One cannot celebrate Alonzo Church, part of our Celebration of Geniuses without talking about his creation, the Lambda Calculus, a way to describe functions and functional evaluation with a very simple description and incredible power.

As an example, consider the square function, square(x)=x*x. Suppose we don't care about the name and just want to talk about the function in the abstract. The lambda calculus gives us the syntax for such discussions. We express the square function as

λx.x*x
This is a function that takes one argument and returns its square. For example
λx.x*x(5) = 25
Also note that the use of x is not important. The following is also the square function.
λy.y*y
So now let us formally define the syntax of the Lambda Calculus. The alphabet consists of an infinite list of variables v0, v1, the abstractor "λ", the separator "." and parentheses "(" and ")". The set of lambda terms is the smallest set such that
  1. Every variable is a lambda term.
  2. If M is a lambda term then (λx.M) is a lambda term.
  3. If M and N are lambda terms then MN is a lambda term.
We will sometimes leave off parentheses when the meaning is clear. Examples of lambda terms are xx, λx.xx, λx.λy.yx.
Free variables are those not closed off by a λ. For example in λy.xy the variable x is free and y is bound.

We use the notation M[x:=E] means replace every occurrence in the lambda term M of the free variable x by the lambda term E. There should not be any free variables in E that are bound in M as this could cause confusion.

There are two basic operations:
(renaming variables formally called α-conversion) λx.M to λy.M[x:=y] where y does not occur in M
(function evaluation formally called β-reduction) λx.M(E) to M[x:=E]

Church and Rossner have shown that if you have a complicated lambda-term it does not matter what order the β-reduction operations are applied.

What can you do with just these simple operations? You get the same power as Turing machines! It's instructive to see this connection and we'll go over the proof over several posts in the future.

Breaking Real-World Locks with Cryptography

Just in case you thought computer scientists only deal with computers, here is a New York Times article describing how AT&T researcher Matt Blaze using tools of cryptography to open locks that use a master key. Blaze has been in the news often in the past, usually for breaking various cryptographic schemes. It is neat to see the same ideas used to break mechanical locks.

Tuesday, January 21, 2003

The Why Me? File

Last night I (and forty other complexity theorists) received by email an anonymous paper entitled "NP = coNP". After a quick scan it was clear there was not much in the paper, so it will just become another entry in my Why Me? file.

Each year for the past dozen years, I receive various papers by people I've never heard of claiming great results in computer science or mathematics. I started the Why Me? file to collect these various manuscripts. In those early ancient days I received papers by postal mail, now I always get them electronically. All of the papers have been incorrect ranging from subtle errors to papers that just don't understand the question. What motivates these people? A chance for glory I suppose.

Most of the papers I get are variants on the P versus NP question, though a surprising number claim to have counterexamples to Cantor's Theorem that there are more reals than integers. As one author put it, "Sure, Cantor's Theorem is true if you consider only integers with a finite number of digits." My favorite is one of the earliest letters I got from a person who believes he deserves the first Noble (sic) prize in mathematics. "We have conclusively shown that Einstein's c2 in E=mc2 is different than Pythagoras' c2 in a2+b2=c2." And all this time I thought E=m(a2+b2).

I never spend more than a few seconds on any of these papers and I certainly never respond which is only asking for trouble. If there is any chance of it being correct, I can wait until someone else finds the bug.

Here is my suggestion to any of you who think you have the theorem of the century: Send it to a grad student with an opening line like "Because you are an expert in complexity...". They'll happily read your paper and tell you the errors of your ways. If by some fluke the result is correct the student will spread it around to the community and you'll get your fifteen minutes of fame.

Monday, January 20, 2003

Foundations of Complexity
Lesson 14: CNF-SAT is NP-complete

Prevous Lesson | Next Lesson

We will show that CNF-SAT is NP-complete. Let A be a language in NP accepted by a nondeterministic Turing machine M. Fix an input x. We will create a 3CNF formula φ that will be satisfiable if and only if there is a proper tableau for M and x.

Let m be the running time of M on x. m is bounded by a polynomial in |x| since A is in NP. m is also a bound on the size of the configurations of M(x).

We will create a set of Boolean variables to describe a tableau and a set of clauses that will all be true if and only if the tableau is proper. The variables are as follows.

  • qij: true if confi is in state j.
  • hik: true if the head in confi is at location k.
  • tikr: true if the tape cell in location k of confi has element r.
We create four clause groups to check that the tableau is proper.
  1. Every configuration has exactly one state, head location and each tape cell has one element.
  2. conf0 is the initial configuration.
  3. confm is accepting.
  4. For each i≤m, confi+1 follows from confi in one step.
1. We will just do this for states. The others are similar. Suppose we have u possible states.
Each configuration in at least one state. For each i we have
qi0 OR qi1 OR ... OR qiu
Each configuration in at most one state. For each i and possible states v and w, v≠w
(NOT qiv) OR (NOT qiw)
2. Let x=x1...xn, b the blank character and state 0 the initial state. We have the following single variable clauses,
q00
h01
t0ixi for i≤n
t0ib for i>n
3. Let a be the accept state. We need only one single variable clause.
qma
4. We need two parts. First if the head is not over a tape location then it should not change.
((NOT hik) AND tikr)→ ti(k+1)r
Now this doesn't look like an OR of literals. We now apply the facts that P→Q is the same as (NOT P) OR Q and NOT(R AND S) is equivalent to (NOT R) OR (NOT S) to get
hik OR (NOT tikr) OR t(i+1)kr

At this point none of the formula has been dependent on the machine M. Our last set of clauses will take care of this. Recall the program of a Turing machine is a mapping from current state and tape character under the head to a new state, a possibly new character under the head and a movement of the tape head one space right or left. A nondeterministic machine may allow for several of these possibilities. We add clauses to prevent the wrong operations.

Suppose that the following is NOT a legitimate transition of M: In state j and tape symbol r, will write s, move left and go to state v. We prevent this possibility with the following set of clauses (for all appropriate i and k).

(qij AND hik AND tikr)→ NOT(t(i+1)ks AND hi(k-1) AND q(i+1)v)
which is equivalent to
(NOT qij) OR (NOT hik) OR (NOT tikr) OR (NOT t(i+1)ks) OR (NOT hi(k-1)) OR (NOT q(i+1)v)
We do this for every possible illegitimate transition. Finally we just need to make sure the head must go one square right or left in each turn.
(NOT hik) OR h(i+1)(k-1) OR h(i+1)(k+1)
Just to note in the above formula we need special care to handle the boundary conditions (where k is 1 or m) but this is straightforward.

Wednesday, January 15, 2003

Supreme Court Rules on Infinity

Breaking news on a post from last October. The US Supreme Court ruled in favor of Disney that ever increasingly long finite lengths of copyright protection does not violate the constitutional prohibition against indefinite copyrights.

Our Friends at the NSA

Last weekend the movie Enemy of the State was shown on network television in the US. This is a pretty good thriller about a rogue NSA official using the resources of the NSA to get back some evidence from a lawyer innocently tangled up in this affair.

What do we know about the National Security Agency? While they don't have the best American mathematicians, who typically go to universities, they have a large collection of very good mathematicians. While they are free to read the same papers I read, we hear about very little of their work. They must have some exciting work in algorithms and complexity I can only dream about. Perhaps they have an efficient factoring algorithm or a working quantum computer in their basement. Unlikely, but possible.

Occasionally I meet NSA scientists at conferences, particularly those meetings devoted to quantum computation. "The NSA is much more interested in quantum computing than quantum cryptography," one such scientist told me. This surprised me since quantum cryptography seems much more likely to have real-world applications than quantum computers, both in theory and in practice. "The real issue is how long our current codes will remain unbreakable. We need to know if our the information currently encrypted will remain safe for 20 or 50 years."

So is Enemy of the State realistic? "Not at all," a different NSA employee told me at a quantum workshop shortly after the movie came out. "We work in boring cubicles, not the sleek offices depicted in the movie." Offices?! What about the satellites that can track people on the ground in real time? "No comment."

Monday, January 13, 2003

A Physics-Free Introduction to the Quantum Computation Model

I just posted to the BEATCS Complexity Column Web Page the February column, written by Steve Fenner, that gives an overview of quantum computing to those of us without a physics background.

Do you have a survey that you are dying to write? I am always looking for volunteers for the column.

Sunday, January 12, 2003

Foundations of Complexity
Lesson 13: Satisfiability

Previous Lesson | Next Lesson

Boolean-Formula Satisfiability (SAT) is the single most important language in computational complexity. Here is an example of a Boolean formula.

(u OR v) AND (u OR v)

u and v are variables that take on values from {TRUE, FALSE}. u means the negation of u. A literal is either a variable or its negation.

An assignment is a setting of the variables to true and false, for example (u→TRUE, v→FALSE). Once all of the variables are assigned a truth value, the formula itself has a truth value. The assignment (u→TRUE, v→FALSE) makes the formula above false. A satisfying assignment is an assignment that makes the formula true. For the formula above, the assignment (u→TRUE, v→TRUE) is satisfying. If a formula has a satisfying assignment we say the formula is satisfiable.

SAT is the set of satisfiable formula. The formula above is in SAT. This formula is not.

u AND (u OR v) AND (u or v)

A formula is in conjunctive normal form (CNF) if it is the AND of several clauses, each consisting of an OR of literals, like the formulas above. A disjunctive normal form (DNF) formula is the same with AND and OR reversed. A formula is k-CNF if every clause has exactly k literals. The first formula above is in 2-CNF.

CNF-SAT is the set of satisfiable CNF formulas. k-CNF-SAT or k-SAT is the set of satisfiable formulas in k-CNF.

Cook and Levin independently showed that SAT is NP-complete. The problem remains NP-complete if we restrict to CNF-SAT or even 3-SAT.

Next lesson we will show that CNF-SAT is NP-complete.

Thursday, January 09, 2003

What is your Erdös number?

Well before the Kevin Bacon craze, mathematicians and theoretical computer scientists measured their worth by their distance from the great combinatorialist Paul Erdös. Your Erdös Number is defined inductively as follows
  • Paul Erdös has an Erdös number of 0.
  • If your Erdös number is not ≤ i and you have one or more co-authors with Erdös number i than your Erdös number is i+1.
My Erdös number is 2 as I have three co-authors who have written papers with Paul Erdös: Laszlo Babai, Mario Szegedy and Noga Alon. As Paul Erdös passed away in 1996, it is quite unlikely that it will decrease.

Jerry Grossman maintains a web page devoted to the Erdös number project.

Here is a conversation I once had with a colleague Carl Smith.
Carl: I think my Erdös number is 4.
Me: My Erdös number is 2.
Carl: My Erdös number is 3.

So what is your Erdös number? It is probably less than you think.

Tuesday, January 07, 2003

Circuits over Sets of Natural Numbers

Last October I mentioned an open problem of Pierre McKenzie. In short, consider a circuit whose wires contain subsets of the natural numbers. Inputs are {0} and {1}. The gates consist of union, intersection, complement, addition and multiplication. For sets A and B, A+B = {x+y | x in A and y in B} and A×B = {xy| x in A and y in B}.

Is the following problem decidable: Given such a circuit for a set A and a natural number w, is w in A?

Here is the paper by McKenzie and Klaus Wagner that describes this problem and gives results for many subcases. It will appear in the upcoming STACS Conference.

I have been haunted by the simplicity of the question and the difficult of the solution. Let me give you the proof (which I left as an exercise in the earlier post) that a decision procedure for the problem would yield a way to determine Goldbach's conjecture that every even number greater than 2 is a sum of two primes.

Define the set GE2 (the numbers at least 2) as {0}∪{1}. Define PRIMES as GE2∩GE2×GE2. Define GOLDBACH as (GE2×2)∩( PRIMES+PRIMES). Now we have Goldbach's conjecture is true if only if 0 is not in {0}×GOLDBACH.

Since I don't think Goldbach's conjecture has an easy decision procedure, I don't believe there is a decision algorithm for the problem. Proving this seems very tricky. The obvious idea is to try and create Diophantine equations. But even generating the set of squares is open.

Monday, January 06, 2003

When will we show P ≠ NP?

Given the comments from saturday's post, perhaps we should discuss the P versus NP question. I shouldn't have to explain the great importance of the P versus NP question to the readers of this web log, but people often wonder when it will be proved.

Like most complexity theorists, I strongly believe that P is not equal to NP, i.e., it is harder to search for a solution than verify it. Let me quote Juris Hartmanis in 1985, "We know that P is different from NP, we just don't know how to prove it."

We are further away from showing P ≠ NP then we have ever been. Let me explain this. In 1985 when I started graduate school, computational complexity theorists were in the midst of showing newer and stronger lower bounds on circuits. Furst, Saxe and Sipser in 1983 gave the first nontrivial lower bounds on bounded-depth circuits. In 1985, Yao followed soon after by stronger results of Hastad, gave exponential lower bounds. In 1986, Razborov showed that clique does not have small monotone circuits. In 1987, Razborov and Smolensky showed that parity could not be computed on bounded-depth circuits with Mod3 gates. It seemed to many complexity theorists that the separation of P and NP was right around the corner.

But circuit lower bounds hit a brick wall. We have seen no significant progress on non-monotone circuit lower bounds since 1987. We have seen some new lower bounds in the past few years, using proposition proof complexity, branching programs, algebraic geometry and old-fashioned diagonalization, but all of these results are in models far too weak to even approach the complexity of the P versus NP question.

Settling the P versus NP question might require some branch of mathematics not even invented yet and that I would never have a prayer of understanding even the idea of the proof. When it will be proven cannot be predicted at all--it could be within a few years or maybe not in the next five hundred. It all depends on how long it will take to come up with the right new idea.

There are as many opinions on the future of the P versus NP question as there are theorists. Bill Gasarch has collected many of these opinions. It makes for an interesting read but you might as well ask Miss Cleo.

It is possible that P = NP is independent of the axioms of set theory. Doubtful I say, but that is a topic for another post.

Saturday, January 04, 2003

Logic in the 21st Century

The June 2000 Association of Symbolic Logic Annual Meeting included a panel discussion on "The Prospects of Mathematical Logic in the Twenty-First Century". From that workshop came this paper, an interesting view of the future of logic. Richard Shore, Sam Buss, Anand Pillay and Alexander Kechris each wrote a section giving their views in the areas of recursion theory, proof theory, model theory and set theory respectively.

Mathematical Logic forms the foundation of computer science and the logic community often looks to the computer science community for directions and applications. The sections on recursion and proof theory really bring out this connection.

Friday, January 03, 2003

Foundations of Complexity
Lesson 12: Turing Machine Redux

Previous Lesson | Next Lesson



Back in Lesson 1 we gave an informal description of a Turing machine as any computer program. That was fine for computability, but for complexity we need to give a more specific, but still informal, definition.

A one-tape Turing machine consists of an infinitely long "tape" consisting of tape cells that each can carry one of a finite set Γ of tape symbols. Typically we have Γ={0,1,B}. The Turing machine has a finite memory, where Q represents the set of all possible states of that memory. The Turing machine also has a tape head that points a specific location on the tape.

Initially the input is put somewhere on the tape with the rest of the tape having the special "B" blank symbol. The tape pointer points to the beginning of the input. The Turing machine starts out in some initial state q0.

In each iteration the Turing machine looks at the tape character under the head and the current state. It writes a new character under the head and then moves the head one step left or right and then enters a new state depending on its instructions.

If the Turing machine enters the accept state qa then it halts and accepts. If the Turing machine enters the reject state qr then it halts and rejects. Otherwise the process repeats.

This simple model captures all of the computational power of much more general Turing machine. It also does this with at most a polynomial slow-down, i.e., if a problem of size n was solved in t(n) steps on a more complex machine it can be solved in time (t(n))k on a one-tape Turing machine for some fixed k.

A deterministic Turing machine's choice of next state, character to write and direction to move the tape is a function of the previous state and current character under the tape. A nondeterministic machine may allow several options and if one series of options leads to acceptance we say the nondeterministic machine accepts.

A configuration of a Turing machine is a snapshot in time of the machine and consists of the tape contents, the current state and the location of the head pointer.

A tableau is a list of configurations

conf0#conf1#...#confm.
A proper tableau for a machine M and input x is a tableau where
  1. conf0 is the initial configuration for M with input x.
  2. confm is a configuration in the accept state.
  3. For all i, 0 ≤ i < m, confi+1 follows from confi in one step.
A machine M accepts input x if and only if there is a proper tableau for M and x.

Wednesday, January 01, 2003

2003: A Year-Long Celebration of Geniuses

What do Alonzo Church, Andrey Kolmogorov and John von Neumann have in common?
  1. They are all brilliant mathematicians.
  2. Their research has helped establish the fundamentals of much of computer science.
  3. They were all born in 1903.
  4. All of the above.
Of course the answer is "All of the above." Throughout the year (in addition to the usual posts) we will honor these men and their research in the 100th anniversary of their births.

I almost added Frank Ramsey, also born in 1903, to the list. Certainly Ramsey Theory has played a major role in theoretical computer science. But the popularity of Ramsey Theory is due more to Paul Erd�s than to Ramsey who was mostly a philosopher.