Monday, December 30, 2002

Reflections on 2002

We have seen many exciting theorems during the past year but once again we have failed to prove that P≠NP or anything even close to it. There is always next year.

The most promising sign of the past year is the increased submissions and attendance at conferences pretty much across the board in theoretical computer science. The large number of students studying theory makes up much of this increase. In the late 1990's during the dot-com boom, very few students, in particular Americans, went to graduate school in computer science. But with the days of easy money over and the need for computer science faculty still great, we have seen a large increase in the number of students. These students have and will continue to bring new ideas and directions to our field. Let us hope there are enough jobs for all of them.

I also started this web log during this past year. Initially, I started this blog just to try out a new technology but I have had a blast writing these posts and sharing my knowledge and experiences. I hope you have enjoyed reading them. I don't understand how I rank so high on a Google search on "web log". Perhaps because "weblog" is supposed to be one word.

In remembrance: Edsger Dijkstra and Steve Seiden.

Have a good New Year's, everyone!

Friday, December 27, 2002

FOCS is going to Europe

In 2004, for the first time, the FOCS conference will be held outside the United States or Canada. The 45th FOCS conference will be held in Italy. STOC held its first conference outside of North America in 2001 in Greece.

First some background on these conferences quoted from the preface of David Johnson's 1991 STOC/FOCS Bibliography.

Since the 1960's, two of the most important venues for presenting new results in theoretical computer science have been the annual STOC and FOCS conferences sponsored respectively by the Association for Computing Machinery and the IEEE Computer Society. "STOC" is an acronym standing for the ACM Symposium on Theory of Computing and "FOCS" stands for what is now called the IEEE Symposium on Foundations of Computer Science. The STOC Conference is organized by ACM's Special Interest Group on Automata and Computability Theory (SIGACT) and has been held every spring since 1969. The FOCS conference is organized by what is now called the IEEE Technical Committee on Mathematical Foundations of Computer Science (TC-MFCS) and is held in the fall. It began in 1960 as a "Symposium on Switching Circuit Theory and Logic Design" (SCT&LD), changed its name in 1966 to "Symposium on Switching and Automata Theory" (SWAT), and assumed its current name in 1975.

A few updates: SIGACT now stands for the Special Interest Group on Algorithms and Computation Theory. The 33rd STOC in 2001, besides being the first held outside the US or Canada, is also the first held in the summer. The SWAT acronym has been appropriated by the Scandinavian Workshop on Algorithm Theory.

FOCS and STOC have changed in other ways. Until the 80's, they were the only major outlet for conference papers in theoretical computer science. Now there are several specialized conferences including, of course, the IEEE Conference on Computational Complexity. Many researchers put a greater emphasis on their specialized conference than on STOC and FOCS. This is somewhat true in complexity but much more so in, say, computational learning theory.

While most, but not all, of the best papers in theory still appear in STOC and FOCS, these conferences no longer reflect the broad interests of theoretical computer scientists. This was probably an inevitable outcome of a field maturing and becoming more specialized.

Thursday, December 26, 2002

Find the Longest Path

How about some music? Here is an oldie but goodie. Daniel Barrett wrote Find the Longest Path during a final exam at Johns Hopkins in 1988. It is an algorithms song sung to the tune of Billy Joel's For the Longest Time. You can listen to the song here.

Monday, December 23, 2002

A Note on Running Times

Suppose you have two papers to review that give algorithmic improvements for two different problems. Here n is the input size.

Paper A: Old running time was 4^n. New running time is 2^n.

Paper B: Old running time was n^4. New running time is n^2.

Which paper gives the best improvement? This is not such an easy question to answer. Here are three ways to look at the question with three different results.

Analysis 1: Consider the improvement as a function of the input size: Old Running Time divided by New Running Time. Paper A is the clear winner with an improvement of 2^n over the n^2 improvement of Paper B.

Analysis 2: Consider the improvement as a function of the old running time. Here we have a tie; in both papers the new running time is the square root of the old running time.

Analysis 3: Suppose we are interested in the largest problem we can solve with current technology. Fix a time t and consider the largest problem we can solve in time t. In Paper A we go from (log t)/2 to log t, a factor of 2 improvement. Paper B does much better, going from t^{1/4} to t^{1/2}, a quadratic improvement.
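To make the three analyses concrete, here is a small numerical sketch in Python; the particular values of n and t are arbitrary choices for illustration, not anything from the papers above.

import math

n = 30            # an illustrative input size
t = 10.0 ** 12    # an illustrative time budget

# Analysis 1: improvement ratio (old time / new time) as a function of n.
print("Paper A ratio:", 4.0 ** n / 2.0 ** n)   # 2^n
print("Paper B ratio:", n ** 4 / n ** 2)       # n^2

# Analysis 3: largest n solvable within time t, before and after.
print("Paper A sizes:", math.log(t, 4), "->", math.log(t, 2))   # factor-2 gain
print("Paper B sizes:", t ** 0.25, "->", t ** 0.5)              # quadratic gain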

Sunday, December 22, 2002

Pictures of Manindra

Here is a sketch of Manindra Agrawal that I scanned in from the New York Times magazine article I mentioned last week. For contrast I added a photograph of Manindra and his student co-authors Neeraj Kayal and Nitin Saxena (from right to left) from the IIT Kanpur Primes in P page.

A complexity theorist immortalized in art. Brings tears to my eyes.

Friday, December 20, 2002

Rules for a Complex Quantum World

Michael Nielsen has a recent Scientific American article giving a very readable view of quantum information and computation. Nielsen is also co-author with Isaac Chuang of Quantum Computation and Quantum Information, the best book I've seen on the topic.

Thursday, December 19, 2002

Foundations of Complexity
Lesson 11: NP-Completeness

Previous Lesson | Next Lesson

Informally, NP-complete sets are the hardest sets in NP. What does it mean to be hardest? Here we use a polynomial-time version of reduction, a concept we first saw in Lesson 5.

Formally a language L is NP-complete if

  1. L is in NP, and
  2. For all A in NP, there is a polynomial-time computable function f such that x is in A if and only if f(x) is in L.
We say that f polynomial-time reduces A to L.

The following theorem captures the intuition for NP-completeness.
Theorem: Let L be an NP-complete language. Then L is in P if and only if P = NP.

Proof: If P = NP then, since L is in NP, L is in P. Conversely, suppose L is in P and A is in NP. Let f be the polynomial-time reduction from A to L. We can then decide in polynomial time whether x is in A by computing f(x) and testing whether f(x) is in L. ◊

In particular if any NP-complete set has an efficient algorithm then all NP-complete sets have efficient algorithms.

Do there exist NP-complete sets? Here is an example:

L = {(<M>,x,1^k) | M is a nondeterministic machine and M(x) accepts in k steps}
Here 1^k is just a string consisting of exactly k 1's.

L is in NP by just simulating M(x) for k steps. If A is in NP, then A must be accepted by some nondeterministic machine M using time p(|x|) for some polynomial p. The reduction f is just f(x)=(<M>,x,1^{p(|x|)}).
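As a concrete illustration, here is a hedged Python sketch of the reduction f; the machine encoding <M> and the polynomial bound p are placeholders standing in for whatever a particular NP language A supplies.

def reduce_to_L(machine_description, x, p):
    """Map an instance x of A to an instance of the generic NP-complete set L.

    machine_description plays the role of <M>, and p(len(x)) is the polynomial
    time bound; both are assumptions supplied by the language A being reduced.
    """
    k = p(len(x))
    return (machine_description, x, "1" * k)   # the triple (<M>, x, 1^k)

# Hypothetical use with a quadratic time bound:
instance = reduce_to_L("<encoding of M for A>", "010011", lambda m: m * m)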

L is not a natural language; you are unlikely to see it come up in any real-world application. In future lessons we will see that a large number of truly natural search problems are NP-complete which is why NP-completeness is perhaps the single most important concept to come out of theoretical computer science.

Wednesday, December 18, 2002

Ramsey Theory and Computer Science

Today we have a guest post from William Gasarch.

How many papers apply Ramsey Theory to Computer Science? If you said 37 papers and 4 surveys then you've probably visited www.cs.umd.edu/~gasarch/ramsey/ramsey.html where William Gasarch has a collection of such papers. A prominent theorist thinks there are over 100. Rather than argue the point, see if your favorite paper that applies Ramsey Theory is there, and if not then email the reference, and if possible the paper or a link to it, to gasarch@cs.umd.edu.

Tuesday, December 17, 2002

The Great Journal Debate

Elsevier is closing down IDEAL, the electronic access point for Academic Press, a publisher recently acquired by Elsevier. This leaves only Elsevier's Science Direct for electronic access to the Academic Press and other Elsevier journals. Given this news and today's New York Times article I feel I should comment on the great journal debate. As a member of the editorial board of Information and Computation, one of the Academic Press journals, these issues give me some angst.

The internet has, of course, had a large effect on the distribution of scientific papers over the last ten years. Even more so, the consolidation of the scientific publishing companies has put a squeeze on university libraries.

Many of my colleagues have suggested that we just start up our own online journals. Running a journal is more than just getting papers refereed and sticking them on a web page. Journals have to be marketed, maintained and presented in a format that makes information easy to find. The private companies do a very good job of this. However, Elsevier's recent pricing policies are causing many libraries to drop several of their journals. Loss of access is never a good thing.

The professional societies, such as ACM, IEEE and SIAM, have their own journals with their own on-line access policies that might form a reasonable middle ground. You can also often get early versions of papers from scientists' home pages or sites like citeseer.

I have mixed emotions on the whole journal issue. Clearly the status quo is not working--something will have to give. My biggest fear is that scientists will just stop submitting to journals altogether. I don't believe this is the best way to maintain knowledge for generations to come. After all, who will maintain your web page a century from now?

Sunday, December 15, 2002

Outsider Math

The New York Times Magazine today highlights a year full of ideas, one of which is the Agrawal et al. primality algorithm under the title "Outsider Math". While Manindra might not have been an expert in number theory he was already an established member of the computational complexity community--certainly not an outsider in my mind.

Friday, December 13, 2002

Learning via Occam's Razor

I have been asked to expand upon the spam example from yesterday's Foundations of Complexity lesson. To do this let me go into a little background on computational learning theory.

The basic model for learning theory has examples given as strings and labelled positive or negative. From these labelled examples, one comes up with a hypothesis that hopefully will classify future examples well.

In PAC (Probably Approximately Correct) learning you want an algorithm that takes labelled examples generated by some distribution and will output some hypothesis that, with high confidence, will usually correctly classify future examples drawn from the same distribution.

One simple approach is the Occam algorithm based on the Occam's Razor principle "when you have two competing theories which make exactly the same predictions, the one that is simpler is the better."

The Occam algorithm works by finding the smallest representation of a hypothesis consistent with the labelled examples and using that hypothesis to predict future examples. There is a theorem in learning theory that says this algorithm works well once the number of samples is roughly the size of the smallest representation.

Let us focus on the problem of identifying spam using general circuits, a collection of AND, OR and NOT gates that capture computation. We'll talk more about circuits in a future lesson but just think of the running time of an algorithm as roughly the size of the equivalent circuit.

Given a collection of emails each labelled either as SPAM or NOT SPAM, the Occam algorithm requires us to find the smallest circuit that correctly labels all of these emails. In general finding such a circuit could be hard but under the assumption that P=NP this is an easy task. By the result mentioned above this circuit will likely classify SPAM correctly most of the time in the future.
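Here is a minimal sketch of the Occam idea in Python. The hypothesis enumerator is an assumption of the sketch (in the text the hypotheses are circuits), and the brute-force search below is exactly the step that an assumed P=NP would make efficient.

from itertools import count

def occam_learn(examples, hypotheses_of_size):
    """Return a smallest hypothesis consistent with the labelled examples.

    examples: list of (x, label) pairs.
    hypotheses_of_size(s): yields candidate classifiers of description size s.
    Both are placeholders; the search is brute force (and loops forever if no
    consistent hypothesis exists), so it illustrates the statement rather than
    giving an efficient algorithm.
    """
    for size in count(1):                 # try the smallest descriptions first
        for h in hypotheses_of_size(size):
            if all(h(x) == label for x, label in examples):
                return h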

Some caveats about this method.

  • Why should there be a small circuit that characterizes spam? Well, I can look at an email and determine whether it is spam and my brain is just not that complex.
  • The theorem only holds if the distribution we learn on is the same as the distribution we apply our circuit to. Spammers might change their spam to fool our new circuit. The P=NP assumption will make it easier for them as well.
  • The P=NP assumption is probably not true.
For more information and details on computational learning theory check out the web site learningtheory.org and the resources mentioned on that site.

Thursday, December 12, 2002

Foundations of Complexity
Lesson 10: The P versus NP Problem

Previous Lesson | Next Lesson

In Lesson 8 we looked at the class P, the set of efficiently computable languages. In Lesson 9 we studied the class NP, the set of languages with efficiently verifiable proofs. Does P=NP, i.e., is every language with efficiently verifiable proofs computable efficiently?

The P versus NP question is the most important question in all of theoretical computer science and in mathematics in general. The Clay Mathematics Institute lists the P versus NP question as one of their seven Millennium Prize Problems. Determine whether P=NP and collect a million dollars.

To understand the importance of the P versus NP question, let us imagine a world where P = NP.

  • A large class of interesting search problems in NP, thought to be hard to solve, would have efficient solutions. These include Factoring, Map Coloring, Traveling Salesman, Job Scheduling and thousands of others. Half of Garey and Johnson is just a listing of NP-complete problems.
  • Public key cryptography would be impossible.
  • Learning via Occam's razor would be easy. For example if you wanted an algorithm for separating spam from useful email, just search for the smallest circuit that correctly identifies a large set of samples. You can do this if P=NP.
This is just the beginning. Of course the general consensus is that P≠NP but we have no idea on how to prove it.

Many of the future lessons will deal directly and indirectly with the P versus NP problem. With these lessons, I hope to give you a real feel for the importance and difficulty of this most important question.

Wednesday, December 11, 2002

FYI: The AIP Bulletin of Science Policy News

The American Institute of Physics produces a free electronic newsletter, FYI, covering science policy in the US. Although it has a physics bent, FYI does an amazing job educating the scientific community on how US science policy is set and what the current issues are. For example, check out this bulletin on the new NSF authorization bill.

Anyone interested or involved in science funding should subscribe to this newsletter. There is also a FYI This Month that gives a single monthly email highlighting the important FYI bulletins.

Tuesday, December 10, 2002

CAPTCHA in the Times

Today's New York Times carries a lengthy article about how Yahoo is using Manuel Blum's CAPTCHA project to prevent automated registrations. You saw it here first.

Monday, December 09, 2002

SIGACT News

The December SIGACT News is out, the first edited by David Haglin. Of interest to complexity theorists:
  • Lane Hemaspaandra's Complexity Theory Column has Part 2 of the Schaefer-Umans survey on Completeness in the Polynomial-Time Hierarchy.
  • A book review of A New Kind of Science by Stephen Wolfram. "It's not new, it's not science and it's not kind."
  • The Education Forum discusses a proposal for a Dagstuhl-like facility in the US and a preview of the Homer-Selman book Computability and Complexity Theory.
  • Calls for nominations for the Gödel Prize and Knuth Prize. Nominate your favorite complexity papers and theorists.

Complexity Class of the Week: MA

Previous CCW

MA gets its name from the Arthur-Merlin games developed by Babai. MA is an interactive proof system where the all-powerful Merlin sends a message that is verified by a probabilistic Arthur. Here is a formal definition.

A language L is in MA if there is a probabilistic polynomial-time machine M and a polynomial p such that for all strings x,

  1. If x is in L then there is a w, |w|=p(|x|) and M(x,w) accepts with probability at least 2/3.
  2. If x is not in L then for all w, |w|=p(|x|), M(x,w) accepts with probability at most 1/3.
As with many other interactive proof classes, the 1/3 can be replaced with a value exponentially small in |x| and the 2/3 can be replaced by 1.

The w is a proof that Merlin can write down and Arthur can verify years later without further interaction from Merlin. Sometimes MA is called the class of languages with publishable proofs.

Despite the naturalness of the class, there are no known natural problems in MA not known to be in NP∪BPP.

MA contains NP and BPP and also NP^BPP. MA is contained in its cousin class AM and thus in BPP^NP. MA is also contained in many of the same classes BPP is contained in, including S_2, ZPP^NP, Σ_2∩Π_2 and PP. There are oracles where AM is not contained in these classes. Also none of these containments are tight in all relativized worlds.

If L is checkable and has polynomial-size circuits then L is in MA. I will leave the definition of checkable to another post but this implies

If EXP is in P/poly then EXP is in MA.
We can replace EXP in the above statement with PSPACE or PP. Using the above statement one can show that MA_EXP does not have polynomial-size circuits. MA_EXP is like MA with polynomials replaced with exponentials.

A strong pseudorandom generator that derandomizes probabilistic circuits will also derandomize MA to NP. Using the result of Impagliazzo-Wigderson we get: If E requires 2^{εn}-size circuits for some ε>0 then MA = NP.

The quantum version of MA, QMA has gotten some recent attention. Here Merlin sends entangled quantum bits and Arthur does quantum transformations and measurements. We will discuss QMA another day when it has its turn as complexity class of the week.

Friday, December 06, 2002

Bad Math Joke

There are 10 kinds of mathematicians: Those who understand binary notation and those that don't.

Thursday, December 05, 2002

Foundations of Complexity
Lesson 9: Nondeterminism

Previous Lesson | Next Lesson

To properly classify problems we will sometimes need to define models of computation that do not exist in nature. Such is the case with nondeterminism, the topic of this lesson.

A nondeterministic Turing machine can make guesses and then verify whether that guess is correct. Let us consider the map coloring problem. Suppose you are presented with a map of a fictitious world and you want to give each country a color so no two bordering countries have the same color. How many colors do you need?

The famous Four Color Theorem states that four colors always suffice. Can one do it in three?

Let L be the set of maps that are 3-colorable. A nondeterministic Turing machine can "guess" the coloring and then verify quickly for every bordering pair of countries that they have different colors.

We let NP be the class of problems solvable in nondeterministic polynomial time. We can give an equivalent definition of NP using quantifiers: L is in NP if there is a polynomial p and a deterministic Turing machine M such that for all x, x is in L if and only if there is a w, |w| ≤ p(|x|), such that M(x,w) accepts in time at most p(|x|).

The string w is called a witness. In the case of map coloring, w is the coloring of the countries.
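To make the verification step concrete, here is a small Python sketch; representing the map as a list of bordering pairs is an assumption of the sketch, and the point is that checking a witness takes time linear in the size of the map.

def verify_3_coloring(borders, coloring):
    """Check a witness: coloring maps each country to one of three colours
    and no two bordering countries share a colour."""
    return (all(c in (0, 1, 2) for c in coloring.values()) and
            all(coloring[a] != coloring[b] for a, b in borders))

# Example: three mutually bordering countries, coloured with three colours.
print(verify_3_coloring([("A", "B"), ("B", "C"), ("A", "C")],
                        {"A": 0, "B": 1, "C": 2}))   # True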

Can we solve map coloring in deterministic polynomial-time? Is every NP problem computable in deterministic polynomial-time? Therein lies the most important question of all, which we will discuss in the next lesson.

Wednesday, December 04, 2002

You can't judge a proceedings from its cover

Scott Aaronson writes an ode to the FOCS proceedings cover illustration and asks about the origin of the illustration. All I can say is that the FOCS cover has remained unchanged since at least my first FOCS conference in 1986.

I can give the history of the Complexity Conference cover which should be similar since Complexity and FOCS are both IEEE Computer Society conferences. When Complexity, then known as Structure in Complexity Theory, took on IEEE sponsorship in 1987, an IEEE artist took a look at the papers and came up with the cover with various symbols from the text. Here is a scan of the 1990 cover.

When Eric Allender took over the conference he rightly eliminated the strange symbols from the cover and in 1998 what is now the current cover first appeared.

If you are a budding artist and wish to redesign the complexity proceedings cover, we are always open to possibilities. If your cover is approved by the conference committee, you will be well acknowledged in the proceedings and bask in the glory of knowing your artwork will live on in the offices of complexity theorists around the world for generations to come.

Tuesday, December 03, 2002

Quantum Information Science and Technology Roadmap

The Quantum Information Science and Technology Roadmap version 1.0 is now publicly available. From the web site: The overall purpose of this roadmap is to help facilitate the progress of quantum computation research towards the quantum computer science era. It is a living document that will be updated annually.

Section 6.8 gives a nice overview of the theoretical computer science contributions to quantum computing.

Saturday, November 30, 2002

CAPTCHA


Can you read the word above? Could a computer?

Manuel Blum won his Turing Award in 1995 for his work in computational complexity and its applications to cryptography. The theory of much of modern-day cryptography uses the assumption that certain problems are not easily computable to create unbreakable codes and protocols.

These days Blum is working on another project that also uses the assumption that some problems are hard computationally. The idea is to use problems that humans can solve easier than computers to prevent automated registration, voting, etc. Check out the CAPTCHA project web site.

Wednesday, November 27, 2002

Complexity Deadline

One last reminder that today is the submission deadline for the 2003 Conference on Computational Complexity. Good luck to all submitters.

Have a great Thanksgiving everyone!

Monday, November 25, 2002

Identity-Based Encryption

Dan Boneh gave a talk at Princeton today about some recent developments in cryptography based on algebraic geometry. One of these tools is identity-based encryption which is public-key encryption where the public key is just a user's identity such as their email address.

Dan's group has an implementation of the system for Outlook, Yahoo Mail and some other systems. If you want to be the first on your block using the latest and greatest encryption or just want more information check out the IBE web site.

Personally, I send all my email in cleartext. If anyone goes through the hassle of capturing it they will only discover what a boring person I am.

IAS

An article on the search for a new director of the Institute for Advanced Study discusses the changing nature of the institute.

The faculty member in computer science mentioned in the article is complexity theorist Avi Wigderson. It was a coup for complexity and all of computer science when he was appointed a faculty member in 1999. With a large collection of postdocs, visitors and students he has made the institute quite an exciting place for theoretical computer science and discrete math.

Saturday, November 23, 2002

FOCS and Visas

The FOCS Conference, the major fall theory conference held last week in Vancouver, sounded like a complete success. According to PC Chair Bernard Chazelle there were 320 registrants--quite a healthy number for this conference. Most encouraging was the larger number of students attending as well as a number of strong student papers indicating a solid future for theoretical computer science.

The 320 does not count another 50 "registrants" from Nigeria. They registered with fake credit card numbers in order to obtain letters from the conference organizers to help them obtain visas, in this case to Canada. Whether they got the visas is unclear and they, of course, never showed up at the conference.

The temptation to help those from Africa is strong, especially since that continent is woefully underrepresented in computer science. However we must as a community be vigilant against those who use our conferences as a way to get around standard immigration laws. Whether or not we agree with those laws, if abuses of this nature continue it becomes harder to bring in legitimate scientists, a problem I discussed in my last post.

Friday, November 22, 2002

Coming to America

A recent Washington Post editorial brings up an important issue.

Andris Ambainis was supposed to spend the fall at MSRI in Berkeley but instead is enjoying his native Latvia. When Manindra Agrawal came to Boston last month to talk about his primality algorithm, he was supposed to bring along his student co-authors. Instead he came alone.

Worries about terrorism have made the US government more cautious about issuing visas, and this has slowed down the visa process tremendously. Visa problems have always been a thorn for academics but this fall seems particularly bad.

I understand the need to be careful but when science is hindered by politics nobody is a winner.

Wednesday, November 20, 2002

Where was Cook's Talk?

In 1971, Steve Cook gave the conference presentation that showed that SAT was NP-complete. It did not immediately stir up much excitement but it is, in retrospect, the single most important conference talk in the history of theoretical computer science. So when and where was this talk?

Steve Cook's paper The Complexity of Theorem Proving Procedures appeared at the Third Annual ACM Symposium on Theory of Computing (STOC) that was held May 3-5, 1971 in Shaker Heights, Ohio, a suburb of Cleveland.

Funda Ergun, a professor at Case Western Reserve, just purchased a house in Shaker Heights and wondered where exactly the conference took place. We got the answer from Bill Rounds, who was one of the local organizers of that conference.

It was (at I think Stouffer's hotel) at the intersection of Warrensville Center Road and Chagrin Boulevard, in the Van Aken center district. The hotel is now gone.

Here is a Mapquest map of that location.

Someday we will organize a P versus NP workshop in that area and make a pilgrimage to this site.

Tuesday, November 19, 2002

Foundations of Complexity
Lesson 8: Efficient Computation

Previous Lesson | Next Lesson

In past lessons, we have studied the computable functions. Computable functions can take an arbitrary amount of time: What good is a program that will eventually give the correct answer but might not finish before the universe collapses?

Somehow we want to limit the amount of time or memory that a computer can use. Just giving a fixed bound does not work well. As technology improves and computers get faster and better, we expect to solve larger and larger problems in a reasonable amount of time. Hartmanis and Stearns, in their seminal paper on computational complexity, turn this around to come up with the right idea: Consider time and memory as functions of the size of the input.

The time a machine M takes on input x is just the number of computation steps that it takes before halting starting with input x. I am being very informal about what a "step" is. In a later lesson we will get into formal definitions of computers and steps but for now just use the idea of implementing one instruction.

The memory, or space as we theorists call it, is just the number of bits of storage used by M on a given input.

Edmonds gave an algorithm for the matching problem that ran in polynomial time: the number of steps used by the algorithm on a graph of size n is n^k for some constant k. He suggested that polynomial time captures efficient computation.

We now define our first complexity class P as the set of all languages L for which a machine exists that determines whether x is in L and halts in time polynomial in the length of x. The class P has many nice properties: a polynomial-time algorithm that uses a polynomial-time subroutine remains polynomial-time. Also P is robust, that is, the class is the same no matter how you formally define your machines.

In these lessons, we will treat P as the class consisting of efficiently computable problems. More classes will come.

Friday, November 15, 2002

It all started with a machine...

Steve Homer and I have finally finished our chapter A Short History of Computational Complexity. It will eventually appear in a book collection on the history of mathematical logic.

You can also see a talk I gave at the 2002 Complexity Conference based on the paper.

Thursday, November 14, 2002

Tic Tac Toe Variation

In my daughter's second grade math homework there was an interesting variation of Tic-Tac-Toe designed to teach addition and subtraction. Take a 3 x 3 grid and randomly give each square a different number between 2 and 18. We have two players X and O. Play goes as follows:
  1. Player X chooses a number from 1 to 9.
  2. Player O chooses a number from 1 to 9 that she has not picked before.
  3. Player O adds that number to the last number picked by X; if the square with that sum is on the board and unmarked, that square is marked O.
  4. Player X chooses a number from 1 to 9 that he has not picked before.
  5. Player X adds that number to the last number picked by O; if the square with that sum is on the board and unmarked, that square is marked X.
  6. Go to step 2.
Play ends when either X or O has three in a row and is declared a winner or when all the numbers run out and the game is declared a draw.

Here is an example:

12 |  5 | 7
-----------
14 | 11 | 3
-----------
4  | 13 | 9
X: picks 1, O: picks 3 (to make 4), X: 8 (11), O: 4 (12), X: 3 (7), O: 6 (9). At this point the board looks like:
 O |  5 | X
-----------
14 |  X | O
-----------
 O | 13 | O
Defensively X plays 2, O: 1, X: 1, O: 2 and whatever X plays next O has a forced win by making 13 or 14.

Despite its simplicity this is quite a challenging game. For every initial configuration, is there always a forced draw as in real Tic-Tac-Toe, or do some configurations have a forced win for X or O? How complicated is it to compute an optimal strategy?

My daughter was frustrated at how hard it is to win this game but she shouldn't be ashamed--I couldn't figure out the best strategy either. Amazing what complicated things can come out of a second-grade class.

Tuesday, November 12, 2002

Kolmogorov Complexity Web Site

Can't get enough Kolmogorov complexity? Check out Marcus Hutter's site on Kolmogorov Complexity and Solomonoff Induction. The site is a bit dated but contains many useful links and information about the Kolmogorov mailing list, which still seems quite active.

The Union of Complexity Classes

We often see the intersection of two classes as an interesting class in and of itself. For example factoring is in NP∩co-NP. In some cases you get interesting equalities, like that ZPP is equal to RP∩co-RP. But we rarely see the union of two classes. Ever wonder why?

In fact, no complexity class can be the nontrivial union of two other classes. To formalize and prove this statement we need some definitions.

Let A and B be subsets of {0,1}*. We define the join, A⊕B, as the union of {0x | x is in A} and {1y | y is in B}. Given a set C we define the 0-projection of C as {x | 0x is in C} and the 1-projection of C as {y | 1y is in C}. Note that the 0-projection of A⊕B is just A and the 1-projection is just B.
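A small Python sketch of these definitions on finite examples (finite sets stand in for languages here, an obvious simplification):

def join(A, B):
    """The join A ⊕ B: prefix members of A with 0 and members of B with 1."""
    return {"0" + x for x in A} | {"1" + y for y in B}

def projection(C, bit):
    """The 0-projection (bit="0") or 1-projection (bit="1") of C."""
    return {w[1:] for w in C if w.startswith(bit)}

A, B = {"00", "11"}, {"1"}
C = join(A, B)
print(projection(C, "0") == A, projection(C, "1") == B)   # True True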

Essentially every complexity class is closed under joins and projections. For example if A and B are in NP then A⊕B is also in NP. The fact that no complexity class is the nontrivial union of other classes follows from the following Lemma.

Lemma: Let E, F and G be classes of languages that are closed under joins and projections and G = E∪F. Then either G = E or G = F.

Proof: Suppose the lemma is false. Let A be a set in G-E and B be a set in G-F. Let C = A⊕B. We have that C is in G since G is closed under joins. Thus C is in either E or F. Suppose C is in E. Since E is closed under projections, we have A in E, a contradiction. If C is in F then B is in F, also a contradiction. ◊

Monday, November 11, 2002

Foundations of Complexity
Lesson 7: The Recursion Theorem

Previous lesson | Next Lesson

Here we are in Lesson 7 and have not yet talked about complexity per se. I felt it important to give some background on computability theory not only for the importance of the results but also to introduce the basic concepts of Turing machines, diagonalization and reducibility. We will start complexity in the next lesson.

Let me end the discussion of computability by one of my favorite theorems. Suppose you wanted to create the ultimate computer virus that attacked any program and made it change its behavior. The recursion theorem states that no matter how powerful the virus, some program will remain unscathed. At first this seems impossible just by considering the function that simulates a program and then adds one to the answer. But this process will not affect the machine that never halts.

Theorem: Let f be any computable function. There is some Turing machine M such that

L(M) = L(f(<M>))

The recursion theorem, sometimes called the fixed-point theorem, has one of the most unintuitive proofs where I cannot explain why it works, only that it does.

Proof: Fix a computable function f. For each machine N, construct a Turing machine R that on input x first simulates N(<N>) to produce the description of a machine and then simulates that machine on x. Let g be the function that maps <N> to <R>. Note that if N(<N>) halts then the machine described by g(<N>) and the machine described by the output of N(<N>) accept the same language.

Note that g is computable even if N(<N>) does not halt. Let T(x) be the machine that computes f(g(x)). We will let M be the machine described by g(<T>). Then we have that
M accepts input x if and only if
the machine described by g(<T>) accepts input x if and only if
the machine described by T(<T>) accepts input x if and only if
the machine described by f(g(<T>)) accepts input x (since T(x)=f(g(x))) if and only if
the machine described by f(<M>) accepts input x. QED

As an application, consider the function f(x) that outputs the description of a machine that accepts {x}. By the recursion theorem there must be some M such that L(M) = {<M>}. As an experiment, pick your favorite programming language and find a program that outputs its own code. By an argument based on the recursion theorem, such a task is always possible but it is trickier than it seems.
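For instance, one standard Python solution to the exercise (certainly not the only one) is the following two-line program, which prints exactly its own source:

# A self-printing program: s contains a template of the whole program,
# and formatting s with its own repr reproduces the source verbatim.
s = 's = %r\nprint(s %% s)'
print(s % s)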

This ends the section on computability theory, which is an exciting area of research in and of itself. For further reading, the book of Homer and Selman goes into these ideas with more detail and examples. For more advanced concepts I recommend the books of Soare, Odifreddi or Shoenfield.

Friday, November 08, 2002

STACS

The STACS Conference has just posted the list of accepted papers for their 20th conference. STACS alternates between France and Germany (and only some truth to the rumor that it alternates between great food and great organization). The upcoming 2003 conference will be held in Berlin, February 27 to March 1.

I have always considered STACS, the Symposium on Theoretical Aspects of Computer Science, the best venue for computational complexity in Europe. I have attended the conference many times and they consistently have several strong papers in the area as well as a good attendance of complexity theorists from both Europe and America. You can see the weight complexity gets on the web page where "Computational and structural complexity" gets the same weight as "Algorithms and data structures, including: parallel and distributed algorithms, computational geometry, cryptography, algorithmic learning theory".

The ICALP conference has a longer history, a larger audience, more traditions and does a better job representing Europe as a whole. But the scope in ICALP is quite large and computational complexity often gets lost in the shuffle.

Wednesday, November 06, 2002

Complexity Class of the Week: SPP, Part II

Previous CCW

Last week we gave the history of the complexity class SPP and described GapP functions. This week we will give a definition of SPP and many of the class' amazing properties.

A language L is in SPP if there is a GapP function f such that

  1. If x is in L then f(x)=1.
  2. If x is not in L then f(x)=0.
That is, if x is in L there is one more accepting path than rejecting paths. If x is not in L there are the same number of each.

If we used #P functions instead of GapP functions we have the definition of UP. SPP contains UP since every #P function is a GapP function. In fact SPP contains FewP and even Few where we don't believe such languages are in UP.

SPP is the smallest Gap-definable class, i.e., the smallest class that can be defined by GapP functions as above. There are a number of common Gap-definable classes, for example from the Zoo: ⊕P, AWPP, C=P, ModP, Mod_kP, MP, AmpMP, PP, WPP and of course SPP. SPP is contained in all of these classes. AWPP is the smallest classical class known to contain BQP, the class of problems with efficient quantum algorithms, though it is not known if BQP is itself Gap-definable.

SPP is exactly equal to the low sets for GapP, i.e., SPP is exactly the set of oracles A such that for any NP machine M, the number of accepting minus the number of rejecting paths of M^A(x) is still an (unrelativized) GapP function. This means that SPP is low for all of the Gap-definable classes, for example that ⊕P^SPP = ⊕P. This also means that SPP is self-low: SPP^SPP = SPP which means SPP is closed under union, complement and in fact any Turing-reduction.

Köbler, Schöning and Torán showed that graph automorphism is in SPP and very recently Arvind and Kurur have shown that graph isomorphism is in SPP. This means that graph isomorphism sits in, and is in fact low for, every Gap-definable class.

The decision tree version of SPP is interesting. A function f on n bits is in this class if there is a polynomial g of polylogarithmic degree such that f(x)=g(x) for all x in {0,1}^n. All such functions have low deterministic decision tree complexity--the first complexity application of a combinatorial lemma of Nisan and Szegedy. Applications of this result include relativized worlds where SPP does not have complete sets or where P = SPP and the polynomial-time hierarchy is infinite.

Monday, November 04, 2002

Foundations of Complexity
Lesson 6: The Halting Problem

Previous Lesson | Next Lesson

Last lesson we learned about using reductions to show problems are hard. Now consider the most famous of undecidable problems, the halting problem:

LH = {<M> | <M> eventually halts with blank tape as input}
We will now show that LH is not computable. We do this by reducing the universal language LU to LH where LU is the set of pairs (<M>,x) such that M(x) accepts.

Given <M> and x, consider the following program:
Replace input with x.
Simulate M on x.
If M(x) accepts then halt.
If M(x) does not accept then go into an infinite loop.

Let us call this program N. Note that M(x) accepts if and only if N halts on blank tape.

Now here is the important point. Consider the function f that given <M> and x, will produce the program N. Even though M(x) and N may not halt the actual procedure that converts <M> and x to N is computable. This is just converting one program to another.

So we have that (<M>,x) is in LU if and only if M(x) accepts if and only if N=f(<M>,x) halts on blank tape if and only if N is in LH. Thus f reduces LU to LH and thus by the Lemma of Lesson 5, we have that LH is not computable.
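Here is a hedged Python rendering of the reduction, treating programs as source strings; simulate is an assumed universal interpreter and the encoding details are placeholders. The point is only that producing N from <M> and x is itself a simple, always-halting computation.

def f(M_description, x):
    """Produce the source of program N: N halts on blank input iff M accepts x."""
    return (
        "x = " + repr(x) + "\n"
        "M = " + repr(M_description) + "\n"
        "if simulate(M, x):        # assumed universal simulator\n"
        "    pass                  # M accepts x: halt\n"
        "else:\n"
        "    while True:           # M does not accept x: loop forever\n"
        "        pass\n"
    )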

I consider the noncomputability of the halting problem to be the single most important result in theoretical computer science. There are some programs, of course, for which it is easy to determine whether or not they will halt. But in general, no matter how smart you are or how fast the computers get, it is simply impossible to analyze a piece of code and determine whether it will terminate.

Using similar techniques one can prove a general result known as Rice's Theorem: Every nontrivial property of the computably enumerable languages is undecidable. More formally
Rice's Theorem: Let P be any non-empty proper subset of the computably enumerable languages. Then the language

LP = {<M> | L(M) is in P}
is not computable.

For example the following languages are not computable:

  • {<M> | L(M) is empty}
  • {<M> | L(M) is computable}
  • {<M> | L(M) is finite}

Friday, November 01, 2002

Conferences

November is a month for conference deadlines. The STOC conference has a submission deadline of November 6. STOC and FOCS, which is being held November 16-19, are the two major theoretical computer science conferences.

STOC this year is part of the 2003 Federated Computing Research Conference in San Diego in June. Several other theory conferences are also part of FCRC and many of them have deadlines in November or soon thereafter.

My favorite conference, The IEEE Conference on Computational Complexity, will be held in Denmark in July. Their submissions deadline is November 27.

In computer science in general and theoretical computer science in particular, conferences are the primary outlet for the announcement and publication of results. Since computer science is a relatively young discipline, the field changes dramatically year to year and the usual long process of journal publication often means work is dated by the time it appears. More mature fields like mathematics and physics use journals as the primary source of publication.

The main disadvantage of the computer science system is that while computer scientists are encouraged to submit their work to refereed journals, many of the important papers in the area never make it that far.

There have been at least two recent major exceptions to this process. Alexander Razborov wrote a paper last spring on lower bounds on quantum communication complexity that would have been the best quantum paper at FOCS, if not the best paper overall. Instead he chose to submit it directly to a journal, Izvestiya of the Russian Academy of Science: Mathematics. The Agrawal-Kayal-Saxena primality paper, which would easily be the best paper at the upcoming STOC, is not being submitted to a conference either but directly to the Annals of Mathematics. "Why should I send it to a conference," Manindra Agrawal asks, "when everyone already knows the result?"

Are these two papers a trend? Are conferences less important as papers are easily available online? Or is computer science finally becoming a mature field?

Wednesday, October 30, 2002

Complexity Class of the Week: SPP, Part I

Previous CCW

With the new FOCS paper by Arvind and Kurur, "Graph Isomorphism in SPP", people have asked me why they should be interested in SPP, a class first defined by a paper by Steve Fenner, Stuart Kurtz and myself. I thought I would discuss how this class was developed and why we feel it is important.

Gill, in his seminal paper on probabilistic complexity classes, defined the class PP and asked whether the class was closed under intersection. In 1990, Fenner and Kurtz and later myself, decided to try a new approach to the question: Consider a class defined like PP but with additional restrictions, show that this class is closed under intersection and then show the class was really the same as PP. Kurtz had a philosophical approach to the problem and defined three variants of PP, Epicurean-PP, Cynical-PP and Stoic-PP.

Recall that PP is the set of languages L accepted by probabilistic machines such that x is in L exactly when the probability of accepting is greater than the probability of rejecting. Epicurean-PP machines were happy to accept but would only barely reject--having one more rejecting path than accepting paths. Cynical-PP machines were the opposite, willing to reject in any way but would only barely accept. Stoic-PP machines stood their ground and would just barely accept or barely reject. Cynical-PP turned out to be the same as the well-studied class C=P and Epicurean-PP was co-C=P. Stoic-PP or SPP was new and thus a complexity class was born.

While it was easy to show SPP was closed under intersection, it is unlikely to be the same as PP and thus we failed in this attempt to show PP was closed under intersection. While we were having this discussion, sitting on the printer was a paper Richard Beigel had emailed me earlier, his paper with Nick Reingold and Daniel Spielman entitled "PP is Closed Under Intersection". Their successful approach was completely different from ours. They used rational functions to approximate the sign function.

Not to be deterred we started studying SPP and related classes which also led to GapP functions. Valiant had defined the class #P, functions f such that there was some nondeterministic polynomial-time Turing machine M such that f(x) was the number of accepting paths of M(x). GapP functions were the closure of #P functions under subtraction, or equivalently the difference or gap of the number of accepting and rejecting computation paths of an NP machine.

GapP functions are closed under many of the same properties as #P functions such as polynomial products and exponential sums as well as subtraction of course. The power of subtraction made GapP a much cleaner approach to studying counting classes and the study of GapP showed the great importance of the class SPP.

Independently of us, Ogihara and Hemachandra defined a class XP and Gupta defined a class ZUP, both of which were equivalent to SPP.

I will stop here and in a later post describe the actual properties of SPP that make it such an interesting class.

Tuesday, October 29, 2002

Talks by Manindra Agrawal

I don't usually give talk announcements in this web log, but if you are in the Boston area this week you can see Manindra Agrawal give talks about prime numbers and his new algorithm with Kayal and Saxena. This new algorithm, giving the first provably deterministic polynomial-time algorithm to check primality, will go down as one of the classic results in theoretical computer science.

Manindra is giving a non-technical talk on the history of primes at the Clay Mathematics Institute on Wednesday and technical talks on the primality algorithm at MIT on Thursday and Harvard on Friday.

Monday, October 28, 2002

Foundations of Complexity
Lesson 5: Reductions

Previous Lesson | Next Lesson

In the previous lesson we gave examples of two noncomputable sets. The set

LA = { <M> | Machine M does accept input <M>}
is computably enumerable but not computable while the set
LD = { <M> | Machine M does not accept input <M>}
is not even computably enumerable.

We also defined computable functions. In this lesson we will use computable functions to create other languages that are not computable. To do so we use the notion of reduction. Informally, a reduction takes one decision problem and reduces it to another problem. For example, to know your current longitude, you only need to know the time of day at a fixed location. The problem of computing the longitude reduces to the problem of proper time-keeping. This is one way the longitude issue was dealt with before the days of GPS.

Formally we say a language A reduces to language B if there is a computable function f such that for all x in Σ*, x is in A if and only if f(x) is in B.

The power of reductions comes from the following lemma.

Lemma: Let A and B be sets such that A is reducible to B.

  1. If B is computable then A is computable.
  2. If B is computably enumerable then A is computably enumerable.
  3. If A is not computable then B is not computable.
  4. If A is not computably enumerable then B is not computably enumerable.

Lines 1 and 2 are easy to see: Just compute f(x) and simulate the program for B on f(x). Lines 3 and 4 are just the contrapositive of 1 and 2 and turn out to be especially useful.

For example, consider the universal Turing machine language,

LU = { (<M>,x) | Machine M does accept input x}
We have seen LU is computably enumerable. Let f(<M>)=(<M>,<M>). The function f is easily seen to be computable and reduces LA to LU. Thus by our Lemma, line 3, we have that LU is not computable.

Reductions play a major role in computational complexity so it is very instructive to see them in this context. Next lesson we will use more complicated reductions to show other simple languages are not computable.

Friday, October 25, 2002

Hard Problems

Occasionally someone comes to me and says, "I have a new algorithm for blah and it seems to work on all of the cases I can think of. Can you give me some real instances to test it on?" Now this is not my speciality--I would much rather see a proof of correctness of an algorithm. But for all those who think they have the next great algorithm for satisfiability, let me give you some useful links.

The first place to look is the problem instance collection of INFORMS, the Institute for Operations Research and the Management Sciences. They have links to all sorts of interesting problems like graph coloring, traveling salesman and combinatorial auctions.

For factoring, you can make some real money by solving the RSA Factoring Challenges.

For graph isomorphism, check out The Graph Database. The graph database only seems to have isomorphic graphs but a good isomorphism tester should give the isomorphism. I haven't been able to find a collection of nonisomorphic graphs that fool many isomorphism testing algorithms. See also the Nauty algorithm.

For satisfiability, there are algorithms that seem to generate hard instances. You can also take problems like graph coloring from the INFORMS site and convert them to satisfiability questions.
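As one example of such a conversion, here is a hedged Python sketch of the standard encoding of k-colorability as a CNF formula (clauses are lists of signed integers in the usual DIMACS style); the variable numbering is just one arbitrary choice.

from itertools import combinations

def coloring_to_cnf(n_vertices, edges, k):
    """Clauses asserting "this graph is k-colorable"; variable v*k + c + 1
    means "vertex v gets colour c"."""
    var = lambda v, c: v * k + c + 1
    clauses = []
    for v in range(n_vertices):
        clauses.append([var(v, c) for c in range(k)])         # at least one colour
        for c1, c2 in combinations(range(k), 2):
            clauses.append([-var(v, c1), -var(v, c2)])        # at most one colour
    for u, v in edges:
        for c in range(k):
            clauses.append([-var(u, c), -var(v, c)])          # endpoints differ
    return clauses

# A triangle is not 2-colorable, so this formula should be unsatisfiable.
cnf = coloring_to_cnf(3, [(0, 1), (1, 2), (0, 2)], 2)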

Tuesday, October 22, 2002

Infinity and the Supreme Court

Don't worry loyal readers. Circumstances are making it difficult for me to write posts this week but I hope to catch up soon.

The U.S. Supreme Court is taking on a case that deals with an interesting mathematical issue. The case concerns the extension of copyright terms--often called the Mickey Mouse rule since just when Disney's copyright on Mickey is about to expire, the length of the copyright is extended.

The U.S. constitution prohibits unlimited copyrights. The lower courts have said that a finite extension of a finite term is still finite. So the lawmakers are getting around the constitution by approximating the unconstitutional infinite term by longer and longer finite terms. This is basically a mathematical trick--creating infinity while always locally looking finite.

Will the Supreme Court put a stop to this? This will be an interesting case to watch.

Friday, October 18, 2002

Quantum Parody of Shakespeare's Hamlet III.i

To end the week I give you the following presented by Ken Regan at the Dagstuhl meeting.
To evolve, or not to evolve: that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of coherent waves,
Or to take arms against superpositions,
And by observing end them? To observe; to evolve---
No more; and by a measurement to say we end
The mixed states and the thousand natural shocks
NMR is heir to, 'tis a computation
Devoutly to be wish'd. To evolve, to run;
Perchance to decohere---ay, there's the rub;
For in too long a run what waves may come
When they have shuffled off magnetic coil,
Must give us pause: there's the respect
That makes calamity of so long runs;
For who would bear the whips and scorns of time,
Qubits flipped wrong, the proud man's "quantumly",
But that the dread of something after halting,
The undiscover'd light cone from whose bourn
No classical info returns, puzzles the will
And makes us rather study those models we know
Than fly to others that we know not of?
Thus deadlines do make cowards of us all;
And thus the native hue of evolution
Is sicklied o'er with the pale cast of thought,
And enterprises of great pith and moment
With this regard their funding turns awry,
To lose the name of action.---Soft you now!
The fair Ophelia! Nymph, in thy bra and kets
Be all my states remember'd.

Thursday, October 17, 2002

Scooping the Loop Snooper

Thanks to Jose Balcazar for showing me this poem by Geoffrey K. Pullum giving a proof, in verse, that the halting problem is undecidable.

Wednesday, October 16, 2002

More from Dagstuhl

The highlight of Tuesday was seeing Manindra Agrawal present the new primality algorithm. History in the making.

Graph Isomorphism is a common topic in the conference. Next to factoring, graph isomorphism is the most well-studied problem in NP not known to be in P or NP-complete. Graph isomorphism sits in co-AM, i.e. there is a two-round interactive proof system for showing that two graphs are not isomorphic. The best deterministic algorithm uses time exponential in (n log n)^{0.5}.

Jacobo Toran today gave a talk on the hardness of graph isomorphism. He showed that given a black box for GI, one can compute the number of accepting paths in a directed graph, a class known as #L functions or equivalently the determinant of an integer matrix. He also showed that matching reduces to graph isomorphism. Whether every language in P is log-space reducible to graph isomorphism remains an interesting open question.

Later in the conference V. Arvind will present his result that GI is in SPP, or more specifically that one can determine graph isomorphism in P^SAT where every query made to the SAT oracle has at most one satisfying assignment.

Graph Isomorphism is the current algorithmic challenge for quantum computers. It is an example of the hidden subgroup problem, a special case of which was used for Shor's factoring algorithm. Perhaps the interaction between the classical and quantum theorists at this conference may help find an efficient quantum algorithm for GI.

Monday, October 14, 2002

Howdy from Dagstuhl

Howdy from Schloss Dagstuhl, a conference center in Germany that hosts weekly computer science workshops. This week I'm here for the Seminar on Algebraic Methods in Quantum and Classical Models of Computation.

The interesting open problem of the day comes from Pierre McKenzie. Consider a circuit that works on sets of nonnegative integers. Inputs are the sets {0} and {1}. The gates consist of union, intersection, complement, addition and multiplication. Addition of two sets A and B is the set consisting of x+y for x in A and y in B. Multiplication is similar.

Given such a circuit with specified input sets and an integer w, is it decidable whether w is in the set generated by the output gate?

A decision algorithm for this problem yields a way to settle Goldbach's conjecture that every even number greater than 2 is the sum of two primes. I'll leave this implication as an exercise.
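To make the gate semantics concrete, here is a tiny Python sketch of the addition and multiplication gates on finite sets; the complement gate, which produces infinite sets, is the part this sketch deliberately leaves out and is what makes the membership question difficult.

def set_add(A, B):
    """Addition gate: all sums x + y with x in A and y in B."""
    return {x + y for x in A for y in B}

def set_mul(A, B):
    """Multiplication gate: all products x * y with x in A and y in B."""
    return {x * y for x in A for y in B}

print(set_add({0}, {1}))          # {1}
print(set_add({1}, {1}))          # {2}
print(set_mul({2, 3}, {2, 3}))    # {4, 6, 9}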

Thursday, October 10, 2002

Complexity Class of the Week: P

Previous CCW

Edmonds, in his 1965 paper on matching, suggested defining efficient computation as computation running in time polynomial in the length of the input. This became the class P, the most basic of all complexity classes.

The class P has nice properties, for example it is model independent, i.e., P is the same whether one has a single-tape, multi-tape or random-access Turing machine. P is closed under subroutines--polynomial-time machines with access to an oracle for P accept languages in P. Perhaps a running time like n150 is not efficient but one needs the polynomial-time definition to keep the robustness of the model. In nearly every case, natural problems shown to be in P have also been shown to have algorithms with relatively low exponents.

Some have argued that P as efficient computation reflects old technology. Perhaps efficient computation should be classes like BPP (probabilistic) or even BQP (quantum). I don't know about you but the computer on my desk doesn't produce truly random bits or quantum entanglement.

P is equal to alternating log-space. Using this result, we get complete problems for P like the circuit value problem consisting of the set of AND-OR circuits that evaluate to true. For P-completeness we require the reductions to be computed in logarithmic space. P has many other natural complete sets including variations on depth-first search.

There are many examples of problems with nontrivial polynomial-time algorithms such as matching, linear programming and primality.

Every language in P can be expressed as a first-order formula with ordering and a least-fixed-point operator.

Many of the major open questions in complexity ask about the power of P, for example P = BPP, P = BQP, P = PSPACE, P = L and of course P = NP. Note that we cannot have both P = L and P = PSPACE since L = PSPACE would violate the space hierarchy theorem.

Wednesday, October 09, 2002

Foundations of Complexity
Lesson 4: Noncomputable Computably Enumerable Languages

Previous Lesson | Next Lesson

In the last lesson we showed that the set

LD = { <M> | Machine M does not accept input <M>}
is not computably enumerable. In this lesson we show there are languages that are computably enumerable but not computable.

For a set A, the complement of A is the set of all strings in Σ* not in A. If A is computable then its complement is also computable--at the end we simply swap accept and reject. Note this does not work in general for computably enumerable languages; switching accept and reject does not change whether a machine halts on a given input.

Now consider the set

LA = { <M> | Machine M does accept input <M>}
LA is just the complement of LD.

Note that LA is computably enumerable as we can just simulate M on input <M>. If LA were computable then its complement LD would be computable as well, but we showed that LD is not even computably enumerable. Thus we have LA as our first example of a noncomputable computably enumerable set.

Besides languages we can also consider computable functions. Consider a Turing machine with an output tape. We can view it as computing a function f from Σ* to Σ* where the value of f is the contents of the output tape after the machine enters a specified halting state. We say a total function f is computable if there is such a Turing machine computing f. We can also consider partially computable functions where f(x) may not be defined for some inputs x. On these inputs the corresponding machine does not halt.

In the next lesson we will use computable functions to show that other languages are not computable or computably enumerable.

Monday, October 07, 2002

Wigderson Paper on Work of Sudan

Avi Wigderson has posted on his publications page an article about the research of Madhu Sudan, the recent Nevanlinna Prize recipient. The paper, written for a broad mathematical audience, gives a nice description of Madhu's work on probabilistically checkable proofs and error-correcting codes.

Friday, October 04, 2002

Chaitin's Omega - The Halting Probability

Chaitin's Omega is the most compact way possible to encode the halting problem. Fix a prefix-free universal Turing machine U, that is if U(p) and U(q) both halt then p is not a prefix of q and vice versa. We define Chaitin's Omega by
Ω = Σ_{p : U(p) halts} 2^(-|p|).
By Kraft's Inequality, Ω ≤ 1. Since U halts on some p but not others we have 0 < Ω < 1. Sometimes Ω is called the halting probability because it is the probability of halting if we run U at the start of an infinitely long randomly chosen string.

One can determine whether U(p) halts from only the first |p| bits of Ω. Let n = |p| and let Ω_n be Ω truncated to the first n bits. We have Ω_n < Ω < Ω_n + 2^(-n). Define Ω_s just as Ω above except that we only sum over the p of length at most s such that U(p) halts within s steps. Note we have

lim_{s→∞} Ω_s = Ω,
and Ω_s is computable from s. So we find the smallest s ≥ n such that Ω_s > Ω_n. Then U(p) halts if and only if it halts within s steps; otherwise we would have Ω ≥ Ω_s + 2^(-n) > Ω_n + 2^(-n) > Ω, a contradiction.
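
Here is a minimal sketch of the approximations Ω_s in Python; u_halts(p, s), a step-bounded simulator for the fixed prefix-free machine U, is a hypothetical stand-in since no concrete U is specified here.

    from itertools import product

    # Hedged sketch: the lower approximation Omega_s, assuming a hypothetical
    # function u_halts(p, s) that reports whether U(p) halts within s steps.
    def omega_s(s, u_halts):
        total = 0.0
        for length in range(1, s + 1):              # programs of length at most s
            for bits in product("01", repeat=length):
                p = "".join(bits)
                if u_halts(p, s):
                    total += 2.0 ** (-length)
        return total
    # To decide whether a length-n program p halts, one would search for the
    # smallest s with omega_s(s, u_halts) > Omega_n and run U(p) for s steps.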

Consider Ω^A, defined the same way as Ω except that U now has access to an oracle for A. Rod Downey asks whether this is well defined, i.e., machine independent. Is it degree invariant, that is, if A and B have the same Turing degree does Ω^A have the same degree as Ω^B?

If the answer is yes then, according to Rod, it solves a very long-standing question of Martin. Note that you cannot necessarily compute even A from Ω^A.

Wednesday, October 02, 2002

Foundations of Complexity
Lesson 3: Universal Turing Machines and Diagonalization

Previous Lesson | Next Lesson

A Universal Turing Machine is a machine so powerful that it can simulate any other Turing machine. Initially it seems amazing that such a machine can exist. But think about the microprocessor that sits in the computer you are now using. Every program that you use--your word processor, the spreadsheet, the browser, the mp3 player--runs as code on this processor. This processor acts like a universal Turing machine. Another example is an interpreter for a language like Java. Suppose we had a program written in C++. The Java interpreter can run Java code that interprets C++ and thus can run any C++ program. This works for any other language, and thus a Java interpreter is also a universal Turing machine.

What we have done is to consider programs as data themselves. Fix a programming language. For a machine M let <M> be the binary encoding of the program describing M. Let LU be the set of pairs (<M>,x) such that machine M accepts input x. LU is a computably enumerable set as we can create a machine U that simulates M on input x. The machine U is a universal Turing machine.

We now show that there is a language that is not computably enumerable. Let LD be the set of <M> such that machine M does not accept <M>. Suppose LD is computably enumerable. There must be some machine N such that N(<M>) accepts if and only if <M> is in LD. We have two cases

  1. N(<N>) accepts: <N> is in LD so by definition of LD, N does not accept <N>, a contradiction.
  2. N(<N>) does not accept: <N> is not in LD so by definition of LD, N accepts <N>, a contradiction.
This kind of argument is called diagonalization. It is the main technique to show that problems cannot be computed.
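
For intuition, the same argument can be written as hypothetical Python code. The function accepts below stands for the machine N that supposedly recognizes LD; the whole point is that no such function can exist.

    # Hedged sketch of diagonalization. Suppose accepts(source, x) returned
    # True exactly when the program with source code `source` accepts input x.
    def diagonal(accepts):
        def n(source):
            # n accepts `source` exactly when that program does not accept
            # itself, so n would recognize LD.
            return not accepts(source, source)
        return n
    # Feed n its own source code <n>:
    #   if n(<n>) is True, then accepts(<n>, <n>) is False, i.e., n does not
    #     accept <n>, contradicting that n(<n>) is True;
    #   if n(<n>) is False, then accepts(<n>, <n>) is True, i.e., n accepts
    #     <n>, contradicting that n(<n>) is False.
    # Either way we have a contradiction, so no such accepts can exist.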

Step back for a second. We have shown that the language LD cannot be computed by a computer. Any computer. Ever.

Tuesday, October 01, 2002

SIGACT News

Lane Hemaspaandra's Complexity Column in the September SIGACT News has an interesting article by Marcus Schaefer and Chris Umans on problems complete in higher levels of the polynomial-time hierarchy. Also of interest for complexity theorists, Bill Gasarch's Book Review Column has a joint review of Computability and Complexity Theory by Homer and Selman and The Complexity Theory Companion by Hemaspaandra and Ogihara.

Monday, September 30, 2002

A Reader's Question about Kolmogorov Complexity

My first question from a reader:
I'm interested in Kolmogorov complexity and its relationship to traditional computational complexity theory. It seems that NP-complete problems all have a pretty low Kolmogorov complexity, so in this sense it seems KC will not serve as a good measure for problem instance hardness. I would like to know what you would say on this topic sometime in the future.
Andrei Nikolaevich Kolmogorov was born on April 25, 1903 and in 2003 there will be several events to mark the 100th anniversary of his birth. I plan to devote several posts next year to mark these events and discuss Kolmogorov complexity. If you cannot wait, I recommend the great textbook on the topic, An Introduction to Kolmogorov Complexity and Its Applications by Ming Li and Paul Vitányi.

As to the reader's question, initial segments of any computable set, including the NP-complete sets, have low Kolmogorov complexity and are not much use to measure the computational complexity of that set. However, one can use time-bounded Kolmogorov complexity to capture computational complexity.

Call p a t-good program for a set A if for all inputs x, p(x) runs in at most t(|x|) time and outputs "Yes", "No" or "I don't know", where an output of "Yes" means x is in A and an output of "No" means x is not in A.

We define the t-time-bounded instance complexity of an instance x of a set A as

ic^t(x:A) = min{ |p| : p is a t-good program for A and p(x) ≠ "I don't know" }

We have that A is in P if and only if there is a constant c and a polynomial q such that for all x, ic^q(x:A) ≤ c. To prove P ≠ NP one only needs to show that ic^q(x:SAT) is unbounded for every polynomial q.
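
As a toy illustration (entirely my own example, not from the reader's question), here is a constant-size q-good program for SAT that answers only on trivial instances; a short program like this bounds ic^q(x:SAT) only on the inputs where it avoids saying "I don't know".

    # Hedged toy example of a t-good program for SAT: it runs in polynomial
    # time, never gives a wrong answer, and may say "I don't know".
    def p(formula):
        # `formula` is a list of clauses, each a list of nonzero integer literals.
        if not formula:
            return "Yes"             # the empty formula is satisfiable
        if any(len(clause) == 0 for clause in formula):
            return "No"              # an empty clause can never be satisfied
        return "I don't know"        # all other instances are left undecided

    print(p([]))                     # Yes
    print(p([[1, -2], [2]]))         # I don't know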

For more information about instance complexity see Section 7.4 of Li and Vitányi.

The New York Times has finally taken notice that there has been a surprising number of plays, movies and now an opera about science.

Saturday, September 28, 2002

The Quantum Algorithms and Complexity workshop clearly showed a field that has matured. From at least a computer science point of view, researchers have created a strong set of techniques and questions that are leading to several new and exciting results. Let me give you a sample of the results presented at the workshop.

There were new quantum algorithms. Sean Hallgren showed how to solve Pell's equation, i.e., given D, find x and y such that x^2 - Dy^2 = 1. Wim van Dam showed how to estimate Gauss sums, which I won't define.

There were also new and interesting limitations on quantum computing. Ambainis presented Razborov's newest result giving an n^0.5 lower bound on the quantum communication complexity of set disjointness: given two parties each holding an n-bit string, they need to exchange on the order of n^0.5 quantum bits just to determine whether there is an i such that both strings have a 1 in the ith bit.

The most fascinating result was due to Ronald de Wolf who gave lower bounds on two-query locally decodable codes in the classical model using quantum techniques. He first showed how to simulate two classical queries with one quantum query and then gave a lower bound on the one quantum query model.

If you would like to learn more about quantum computing I suggest the textbook, Quantum Computation and Quantum Information by Nielsen and Chuang. Nearly all quantum computation papers can be found at the quant-ph web site.

On quant-ph you can find David Mermin's From Cbits to Qbits: Teaching computer scientists quantum mechanics which I mention just because I am the "newly enlightened computer scientist" quoted on the first page. Feel free to read my paper One complexity theorist's view of quantum computing based on my AQIP talk for another point of view.

Tuesday, September 24, 2002

Howdy from Canada. I have limited internet access here in Banff so there won't be many posts this week.

I've heard lots of interesting results about quantum computing here. John Watrous has the following nifty theorem about quantum interactive proofs:
Every language with a quantum interactive proof (and in particular all of PSPACE) has a proof system that works as follows: Merlin sends some qbits to Arthur. Arthur flips a single classical coin and sends it to Merlin. Merlin then sends over more qbits and Arthur performs some quantum operations and accepts or rejects. Even though Arthur's only communication is a single random coin, this protocol has perfect completeness and soundness error near one-half.

Sunday, September 22, 2002

I'm off to beautiful Banff, Canada for the MSRI Workshop on Quantum Algorithms and Complexity. So no regular features this week but I hope to bring you the latest in quantum computation.

Friday, September 20, 2002

Ketan Mulmuley and Milind Sohoni have an interesting approach to separating complexity classes using algebraic geometry. Ken Regan describes this approach for the common complexity theorist in the October BEATCS Computational Complexity Column.

I saw Carl Pomerance yesterday give a wonderful presentation on the AKS primality algorithm. He made an interesting point about the algorithm. The algorithm runs in time O(n^12) where n is the number of bits of the input to be tested. The big O notation hides a constant, i.e., the algorithm uses c·n^12 steps for some constant c. That c is unknown!

The AKS algorithm uses a result by Fouvry on the distribution of certain kinds of primes. Fouvry's result uses another result that is proven as follows: first it is proved assuming the Extended Riemann Hypothesis is true; if the ERH fails, then the place where it fails can be used to prove the result. The constant c will depend on where the ERH fails. To determine c would require settling the Extended Riemann Hypothesis!

Agrawal, Saxena and Kayal did not cheat; they really gave a polynomial-time algorithm for primality. Their algorithm is fixed; only the analysis of the running time is affected by the ERH. Also there are other results one can use instead of Fouvry's that get around this issue. Still, I find it neat that this provably polynomial-time algorithm for primality has a running time that, while polynomial, involves a constant that is completely unknown.

Wednesday, September 18, 2002

Complexity Class of the Week: C=L

Previous CCW

Sometimes a class that seems a mix of concepts turns out to have significant meaning. C=L is one of these classes. First we need to define the class.

Consider a nondeterministic log-space Turing machine M that never repeats a configuration. This is not much of a restriction since we can just add a clock to the machine. Let #L be the set of functions such that f(x) is the number of accepting paths of M(x) for some M as described above. Let GapL be the closure of #L under subtraction.

We say a language L is in C=L if there exists a function f in GapL such that for all x, x is in L if and only if f(x) = 0. (*)

C=L is the log-space analogue of the counting class C=P. There are many equivalent definitions of C=L, where we can replace (*) by any of the following:

  1. A function f in GapL and a log-space computable function g such that for all x, x is in L if and only if f(x)=g(x).
  2. A function f in #L and a log-space computable function g such that for all x, x is in L if and only if f(x)=g(x).
  3. A probabilistic log-space machine M such that x is in L if and only if Pr(M(x) accepts) = 1/2.
The neat part of C=L is that it has a nice complete problem: the set of singular integer matrices. This is because for every GapL function f there is a log-space computable function g mapping strings to integer matrices such that f(x) is the determinant of g(x).
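
The complete problem is easy to play with. Here is a sketch that tests singularity by exact Gaussian elimination over the rationals; it illustrates the problem itself, not a log-space algorithm for it.

    from fractions import Fraction

    # Hedged sketch: is an integer matrix singular? (The C=L-complete problem;
    # this elimination runs in polynomial time, not logarithmic space.)
    def is_singular(m):
        a = [[Fraction(x) for x in row] for row in m]
        n = len(a)
        for col in range(n):
            pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
            if pivot is None:
                return True                      # no pivot: determinant is zero
            a[col], a[pivot] = a[pivot], a[col]
            for r in range(col + 1, n):
                factor = a[r][col] / a[col][col]
                for c in range(col, n):
                    a[r][c] -= factor * a[col][c]
        return False                             # full rank: nonzero determinant

    print(is_singular([[1, 2], [2, 4]]))   # True
    print(is_singular([[1, 2], [3, 4]]))   # False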

The big open question about C=L is whether it is closed under complement, i.e., is there a log-space computable function g mapping matrices to matrices such that M is singular if and only if g(M) is nonsingular?

For more information about C=L and related classes including references and proofs of the above see the paper of Eric Allender and Mitsu Ogihara, Relationships among PL, #L, and the Determinant, RAIRO - Theoretical Informatics and Applications Vol. 30, 1996, pp. 1-21.

Monday, September 16, 2002

Foundations of Complexity
Lesson 2: Computable and Computably Enumerable Languages

Previous Lesson | Next Lesson

In Lesson 1 we described the Turing machine model to answer the question, "What is a computer?" The next question is "What can we compute?" First we need some definitions to describe problems to be computed.

First we need an alphabet which we denote Σ. Any finite Σ would work; for most of complexity we assume that Σ = {0,1}. Machines take as inputs words or strings consisting of a finite sequence of alphabet symbols or characters. Examples: 0101, 000, 1101100. The length of the string is the number of characters in the string. We use ε to represent the special string with zero characters.

A language is a set of strings. It could be finite or infinite. Examples include

  • {ε,0,1,001}
  • The set of strings of odd length
  • {1,10,110,1110,11110,...}
We use Σ* to represent the set of all strings and ∅ to represent the empty set of no elements. Note the difference between the empty set of strings and {ε}, the set consisting of the single string of length zero.

A class is a set of languages. Examples of classes include

  • The class of all finite languages.
  • The class of all languages containing the string ε.
  • The class of all languages that are subsets of {0,1,00,11,101}.
A complexity class is a special kind of class based on resource-bounded Turing machines. We will come back to complexity classes in a later lesson.

Consider a Turing machine M on some input string x which we denote M(x). We will focus on two possible outputs "yes" or "no". This yields three possibilities for M(x).

  1. M(x) outputs "yes" in which case we say M(x) accepts.
  2. M(x) outputs "no" in which case we say M(x) rejects.
  3. M(x) does not halt.
We let L(M) be the set of strings x such that the first case occurs. A machine M such that the third case does not occur for any x is called total.

The class of computably enumerable languages is the set of languages L such that L = L(M) for some Turing machine M. You might see recognizable or recursively enumerable as other names for the computably enumerable languages.

The class of computable languages consists of the set of languages L such that L = L(M) for some total Turing machine M. You might see decidable or recursive as other names for the computable languages.

Friday, September 13, 2002

Complexity Class of the Week: Factoring

Previous CCW

Factoring is not a class, it is not even a language. Nevertheless it has some interesting connections to complexity classes that are worth discussing.

Factoring is the problem of taking a number m and producing the prime factors of m. There are a number of algorithms for factoring but in the worst case they all take time superpolynomial in the length of the input (about log m bits). Here is a good place to start reading about these algorithms.

Factoring gives a great example of an issue that often arises in complexity: search versus decision. It has always amazed me that we can easily determine whether a number has nontrivial factors but finding these factors is apparently much more difficult. This distinction and some other nice properties of numbers make factoring very useful in cryptography.

To consider the computational complexity of factoring, we need to make a language version.

FACTOR = {(m,r) | There is an s, 1<s<r such that s divides m}

FACTOR is computationally equivalent to factoring: (m,r) is in FACTOR if and only if r is greater than the least prime factor of m. Given a black-box for FACTOR one can use binary search to find a prime factor of m.
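
Here is a sketch of that binary search in Python; the brute-force factor_oracle is just a stand-in for the FACTOR black box.

    # Hedged sketch: find the least prime factor of m by binary search,
    # given a black box for FACTOR = {(m, r) : some s with 1 < s < r divides m}.
    def factor_oracle(m, r):
        return any(m % s == 0 for s in range(2, r))    # brute-force stand-in

    def least_prime_factor(m):
        lo, hi = 2, m + 1              # invariant: the least factor lies in [lo, hi)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if factor_oracle(m, mid):  # is there a divisor s with 1 < s < mid?
                hi = mid
            else:
                lo = mid
        return lo                      # the least nontrivial divisor, which is prime

    print(least_prime_factor(91))      # 7

Dividing m by the factor found and repeating recovers the complete prime factorization with polynomially many oracle calls.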

If one insists on a complexity class, one could consider all the languages reducible to FACTOR but there is not much value beyond considering the complexity of the language FACTOR.

FACTOR is in NP∩co-NP: Guess the prime factorization of m. Since every number has a unique prime factorization we have FACTOR in UP∩co-UP where UP is the set of languages accepted by NP machines which have at most one accepting path for every input. The class UP∩co-UP is arguably the smallest interesting complexity class not known to have efficient algorithms and, assuming factoring is hard, really does not have efficient algorithms. I should note that FACTOR in UP∩co-UP was known before the recent AKS primality algorithm but the proof was more difficult.

Peter Shor almost singlehandedly justified the study of quantum computing with his efficient quantum algorithm for factoring, in complexity terms FACTOR in BQP. His proof used the fact that factoring a number m reduces to finding the order of elements of the multiplicative group mod m. Shor then shows that quantum computers can solve this kind of group question.
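
The classical half of that reduction is short enough to sketch. In the code below, order finding is done by brute force, which is precisely the step Shor replaces with a quantum subroutine; this is an illustration of the reduction, not of the quantum algorithm, and it assumes m is an odd composite that is not a prime power.

    from math import gcd
    from random import randrange

    # Hedged sketch of the classical reduction: extract a factor of m from the
    # order of a random element of the multiplicative group mod m.
    def order(a, m):
        r, x = 1, a % m
        while x != 1:                  # brute force; the quantum step replaces this
            x = (x * a) % m
            r += 1
        return r

    def find_factor(m):
        while True:
            a = randrange(2, m)
            g = gcd(a, m)
            if g > 1:
                return g               # lucky guess: a already shares a factor with m
            r = order(a, m)
            if r % 2 == 0:
                y = pow(a, r // 2, m)
                for g in (gcd(y - 1, m), gcd(y + 1, m)):
                    if 1 < g < m:
                        return g       # a^(r/2) +- 1 shares a nontrivial factor with m

    print(find_factor(15))             # 3 or 5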

Factoring does not appear to be hard for any nontrivial complexity class. This means that unlike Satisfiability we don't have any reason to believe that factoring is hard other than the fact that many smart people have failed to solve it. On the other hand the hardness of factoring is taken as a given in the study of complexity and cryptography and used to show the hardness of classes like UP∩co-UP.

Thursday, September 12, 2002

Who says you can't make money with mathematics? Here is an interesting Wired article on the MIT Blackjack team.

Science Screenplay Contest

Have you secretly been writing a movie script about a computer scientist? Now is your chance. Robert De Niro's Tribeca Film Institute in conjunction with the Sloan foundation is looking for movie scripts with a lead character who is a scientist, mathematician or engineer. No science fiction stories will be accepted.

Update 8/4/05: See the winners.

Wednesday, September 11, 2002

On September 11, 2001 we lost a member of the theoretical computer science community. Danny Lewin, an MIT graduate student who went on to co-found Akamai was on board the American Airlines flight that crashed in New York City. He and the other victims of that day will always be remembered.

Sunday, September 08, 2002

Foundations of Complexity
Lesson 1: What is a computer?

Next Lesson

This is the first of a long series of posts giving an informal introduction to computational complexity.

Computational complexity theorists try to determine which problems are efficiently solvable on a computer. This sentence already leads to many questions: What is a problem? What is efficiently solvable? Let us first start off with a truly basic question: what is a computer?

In 1936, Alan Turing invented a theoretical computing device now called a Turing Machine. This was before electronic computers existed. He tried to give a simple model of the thought processes of mathematicians. His model has stood the test of time and represents the official model of computation that we still study today.

Instead of giving the formal definition of a Turing machine, let us try a more modern approach. Consider some current programming language like Java. Let us consider the (imaginary) world where a Java program has access to a potentially infinite amount of storage. A Turing machine corresponds to a specific Java program. You might find it a little confusing to think of Turing machine = Java Program but that is the best way to think about it.

Does it matter which programming language we choose? What if we used C++ or Visual Basic or the original Turing machine model? No, it makes no difference. Consider a C++ program P. There are, at least theoretically, Java programs that will interpret C++ programs. If you consider a Java program that interprets P, it will have the same computational behavior as P. This idea holds between any two programming languages including the original Turing machine and leads to the Church-Turing thesis:

Everything computable is computable by a Turing machine.

You can interpret this as saying that everything computable is computable by a Java program.

The Church-Turing thesis cannot be proven, as it is a thesis, but it has led us to define computable as computable by a Turing machine. Now, after about half a century of having real computers, the Turing machine has really proven itself as the right model of computation.

Thursday, September 05, 2002

I heard of a nice new result from Russell Impagliazzo and Valentine Kabanets yesterday. Consider the language PI (Polynomial Identities) consisting of multivariate polynomials which are identically zero. PI is in co-RP by replacing the variables by random values and seeing if the result is zero. There is a lemma by Schwartz that implies this randomized algorithm will work with high confidence.
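
Here is a minimal sketch of that co-RP test, with the polynomial given as a black-box function; the sample range and number of trials are illustrative choices, not tuned constants.

    import random

    # Hedged sketch of the random-evaluation test: a nonzero polynomial of
    # degree d vanishes on at most a d/|S| fraction of points drawn from S.
    def probably_identically_zero(poly, num_vars, degree, trials=20):
        sample_range = 100 * degree + 1
        for _ in range(trials):
            point = [random.randrange(sample_range) for _ in range(num_vars)]
            if poly(*point) != 0:
                return False          # certainly not identically zero
        return True                   # identically zero with high probability

    # (x + y)^2 - (x^2 + 2xy + y^2) is identically zero.
    print(probably_identically_zero(
        lambda x, y: (x + y) ** 2 - (x * x + 2 * x * y + y * y), 2, 2))   # True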

Since we now know Primality is in P, PI is the best remaining example of a problem with an efficient probabilistic algorithm but no known efficient deterministic algorithm. Unlike primality, giving even a weak derandomization for PI will be difficult as it would lead to circuit lower bounds.

Theorem: (Impagliazzo-Kabanets) At least one of the following must be true

  1. PI requires deterministic exponential time to solve.
  2. The permanent does not have polynomial-size arithmetic circuits.
  3. NEXP does not have polynomial-size Boolean circuits.
Here is a sketch of the proof. For a matrix A let A_ij represent A with the ith row and jth column removed. Consider the self-reduction from the permanent of a (k+1)×(k+1) matrix to permanents of k×k matrices
(*) Perm(A) = Σ_j a_1j Perm(A_1j)
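
The identity (*) is just expansion along the first row with all signs positive. Here is the naive recursion it yields, as a quick illustration (exponential time, of course):

    # Hedged sketch: the permanent via the self-reduction (*), expanding along
    # the first row with no alternating signs (exponential-time recursion).
    def perm(a):
        if len(a) == 1:
            return a[0][0]
        total = 0
        for j in range(len(a)):
            minor = [row[:j] + row[j + 1:] for row in a[1:]]   # remove row 1, column j
            total += a[0][j] * perm(minor)
        return total

    print(perm([[1, 2], [3, 4]]))   # 1*4 + 2*3 = 10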

Suppose that the permanent has polynomial-size arithmetic circuits. Then P^#P is computable in NP^PI by the following algorithm: Guess the arithmetic circuits for the permanent for all sizes k×k up to n×n for some appropriate n. Use PI to verify (*) for each k < n. If the test passes on all sizes then the circuits are correct. Use the circuits to compute the permanent, which is #P-complete.

Now suppose NEXP has polynomial-size circuits. By Impagliazzo-Kabanets-Wigderson this implies NEXP = MA ⊆ PH ⊆ P^#P ⊆ NP^PI. If PI is computable in subexponential time then NEXP is in nondeterministic subexponential time, contradicting the nondeterministic time hierarchy.

Wednesday, September 04, 2002

Complexity Class of the Week: PP

Previous CCW

A language L is in PP if there is a nondeterministic polynomial-time Turing machine M such that x is in L if and only if M(x) has more accepting than rejecting paths.

PP stands for "Probabilistic Polynomial-time" from the original definition by Gill where one uses a probabilistic poly-time TM where x is in L if and only if the probability of M(x) accepting is greater than 1/2. "Probabilistic Polynomial-Time" would be a better name for PP's cousin class BPP or "Bounded-error Probabilistic Polynomial-Time". PP is not much use as a probabilistic class since it would take potentially an exponential number of trials to distinguish accepting from rejecting with reasonable confidence

A better name for PP would be Majority-P as given by the definition above. Because of historic reasons we are stuck with the name PP for now. Don't hold the bad name against the class. It still has a natural definition and some amazing properties.

PP has similar complexity to the function class #P as P^PP = P^#P, proved using the old stalwart, binary search. I have never found the paper that first proves this equivalence. Toda's Theorem shows the amazing hardness of PP by reducing the polynomial-time hierarchy to P^PP.
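
The binary-search direction of P^PP = P^#P is easy to sketch. Given a threshold oracle answering "does M(x) have at least k accepting paths?" (each such query can be phrased as a PP query by padding the machine), logarithmically many queries recover the exact count; the oracle below is a hypothetical stand-in.

    # Hedged sketch: recovering an exact #P count from threshold queries by
    # binary search. at_least(x, k) is a hypothetical black box answering
    # whether M(x) has at least k accepting paths; with n nondeterministic
    # bits there are at most 2**n paths.
    def exact_count(x, n, at_least):
        lo, hi = 0, 2 ** n              # invariant: lo <= #accepting paths <= hi
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if at_least(x, mid):
                lo = mid                # at least mid accepting paths
            else:
                hi = mid - 1            # fewer than mid accepting paths
        return lo

    true_count = 5
    print(exact_count("x", 4, lambda x, k: true_count >= k))   # 5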

Valiant's proof that the permanent is #P-complete also gives a natural complete language for PP:

PERM = { (M,k) | The permanent of M is at least k}

NP is in PP by adding a large number of dummy accepting paths. PP is clearly closed under complement so we have co-NP in PP. Beigel, Reingold and Spielman show that PP is closed under union, a far trickier proof than one would expect, using the fact that rational functions approximate the sign function well. Fortnow and Reingold build on their techniques to show that PP is closed under truth-table reductions.

PP is the prototypical counting class, that is, a class defined in terms of the number of accepting and rejecting paths. PP contains the counting class C=P where x is in the language if the number of accepting paths equals the number of rejecting paths. On the other hand there are relativized worlds where ⊕P and PP are incomparable.

The limitations of PP come from what I call the Beigel all-purpose oracle. Beigel exhibits a relativized world where P^NP is not contained in PP. His proof works by showing that any polynomial whose sign agrees with the ODDMAXBIT function must either have high degree or very large coefficients.

PP has interesting relationships to other complexity classes. Vereshchagin shows that MA, the class of Merlin-Arthur games, is contained in PP but gives a relativized world where AM is not in PP. Adleman, DeMarrais and Huang show that the quantum class BQP is contained in PP. Watrous shows that QMA, the quantum version of MA, is also contained in PP.

Fortnow and Rogers, building on Lide Li's Ph.D. thesis, show that BQP is low for PP, i.e., PP^BQP = PP. Köbler, Schöning and Torán, also using the work of Li, show that graph isomorphism is low for PP.

Allender shows that PP is not equal to uniform-TC^0, constant-depth circuits with threshold gates. Can one show that PP is not equal to log-space? All one has to do is show that Toda's theorem can be extended to get any nonconstant level of the polynomial-time hierarchy to fall in P^PP.

Tuesday, September 03, 2002

Today's New York Times has an essay on primes and the new AKS algorithm. A bit rambling but worth a look.

Monday, September 02, 2002

Today is a major holiday in the US, Labor Day. The holiday's original meaning has been mostly lost--it now represents the unofficial end of summer. Welcome Fall.

Because of the holiday there is not much of a post today. Just let me point to the accepted papers of the FST&TCS Conference. FST&TCS is the main Indian theory conference.

Back on Wednesday with the next Complexity Class of the Week.