Wednesday, May 28, 2003

Open Questions from Hopcroft and Ullman

On page 281 of the 1979 edition of the classic theory text of Hopcroft and Ullman lie two tables describing closure and decidability properties of various formal languages. Four entries in the tables are labelled "?", meaning the answer was not known. Those entries are
  1. Are context-sensitive languages closed under complementation?
  2. If L is context-sensitive is MIN(L) context-sensitive? MIN(L) is the set of strings in L that have no proper prefix in L.
  3. Is it decidable whether the complement of a given context-sensitive language is context-sensitive?
  4. Is it decidable whether two given deterministic context-free languages are equal?
We now know the answers to all of these questions are "Yes."
  1. In 1988, Neil Immerman and Róbert Szelepcsényi independently showed that nondeterministic space is closed under complement. Since CSLs are equivalent to nondeterministic linear space we have that CSLs are also closed under complement. Immerman and Szelepcsényi received the 1995 Gödel Prize for this result. We will cover Immerman-Szelepcsényi in the next Foundations lesson.
  2. By using Immerman-Szelepcsényi, one can create a nondeterministic linear space algorithm to check that x is in L and that no proper prefix of x is in L.
  3. Trivially true by Immerman-Szelepcsényi: the complement of a context-sensitive language is always context-sensitive, so the answer is always yes.
  4. This was the hard one. Only in 1997 did Géraud Sénizergues succeed in showing that the problem is decidable. For this work he received the 2002 Gödel Prize. Here is the simplified version of his result, a mere 58 pages.
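To make the MIN operation from question 2 concrete, here is a small Python sketch (the function name is my own, not from the text) that computes MIN(L) for a finite language by directly following the definition:

```python
def min_lang(L):
    """MIN(L): strings in L none of whose proper prefixes are in L."""
    Lset = set(L)
    return {
        x for x in Lset
        if not any(x[:i] in Lset for i in range(len(x)))
    }

# "aa" is excluded because its proper prefix "a" is in L.
print(sorted(min_lang({"a", "aa", "ba"})))  # ['a', 'ba']
```

Of course a finite language is trivially context-sensitive; the point of the theorem is that the same holds for infinite L, where the prefix check needs the nondeterministic linear space argument above.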

Thursday, May 22, 2003

Computing's Lost Allure?

The New York Times today had an article on the shrinking number of computer science majors in American universities. Let me give you my take on this.

First of all this should be no surprise. People, consciously or unconsciously, follow the money. Computer science majors were a very hot property in the late 90's and now they are less so. So fewer people are going into computer science.

My advice: Don't follow the money. You will be trying to time the market four years down the road. I remember as an undergrad seeing many of my fellow freshmen going into Chemical Engineering because it was a hot area. Four years later many of them had trouble finding or keeping a job and had to move to other areas. Some of them even became.......lawyers.

Best to do what you enjoy. If you enjoy it you will have a better chance to succeed. Worry about the job market when you get there.

Wednesday, May 21, 2003

Celebration for Walter Savitch

Just got this announcement. And we just discussed Savitch's Theorem in my last Foundations Lesson.

The Department of Computer Science and Engineering at the University of California, San Diego, is sponsoring an event on June 12, 2003, on the occasion of Professor Walt Savitch's birthday and in recognition of his thirty-four years of teaching Computer Science at UCSD. This event will feature two distinguished speakers, Professor Stephen Cook (University of Toronto) and Professor Aravind Joshi (University of Pennsylvania) with a reception in honor of Professor Savitch concluding the program. The event will take place at Eucalyptus Point (Room B) in Thurgood Marshall College on the UCSD campus. Please see the department web site for a link to more information.

This event immediately follows the STOC conference which is also being held in San Diego as part of FCRC.

Friday, May 16, 2003

Turing Machines and Gödel's Theorems

A little recursion theory can make Gödel's Theorems intuitively easy.

Let A be the set of <M> such that M does not accept the input <M>. By diagonalization techniques we can show A is not recursively enumerable.

Now let us fix a logical theory like ZFC set theory. The actual theory does not matter much. Let B be the set of <M> such that there is a proof in ZFC of the statement "M does not accept input <M>." B is recursively enumerable since we can just try all possible proofs.

Now B is clearly a subset of A so there must be an input <M> in A-B. For this <M>, we have that M does not accept input <M> but there is no proof of this fact. We now have a true mathematical statement with no proof. This is Gödel's first theorem.

We can be more constructive. Since B is recursively enumerable there is some Turing machine N that accepts B. Suppose that N accepts <N>. This means <N> is in B, which implies there is a proof that N does not accept <N>, a contradiction.

So N cannot accept <N>, which means <N> is not in B. So we have now constructed a specific N such that the statement "N does not accept <N>" is true but not provable.

But wait a minute, didn't I just give a proof that N does not accept <N>? Actually I had to make the assumption that ZFC is consistent. If ZFC is inconsistent then every statement (true or false) is provable so B would not be a subset of A and my whole argument falls apart.

If ZFC could prove that ZFC is consistent then I would not have to make any assumption and would have a contradiction. Thus ZFC cannot prove its own consistency. This is Gödel's second theorem.

Logicians tell me I'm cheating: I had to assume something technically stronger than consistency for this argument to work. Still these proofs illustrate the amazing power of Turing machines to make Gödel's theorems easier to understand.

Wednesday, May 14, 2003

Foundations of Complexity
Lesson 18: Savitch's Theorem

Previous Lesson | Next Lesson

Unlike what we believe for time, there is a polynomial relation between deterministic and nondeterministic space.

Savitch's Theorem (1970): NSPACE(s(n)) is contained in DSPACE(s^2(n))

Proof: Recall the definition of tableau (as described in Lesson 12). Let N be a nondeterministic machine that uses s(n) space. We can represent a computation of N on an input x of length n by a tableau where each configuration has size O(s(n)) and there are at most m = c^(s(n)) configurations for some constant c.

We will create a deterministic machine M(x) to determine whether such a proper tableau exists, and thus whether N(x) accepts. First we need a subroutine CHECK to determine whether one configuration can reach another in at most 2^t steps. We do not have enough space to write down the entire tableau, so instead we use a divide and conquer approach: try all possible middle configurations and recurse on each half.

CHECK(CONF1,CONF2,t)
/* Output TRUE if CONF1 can get to CONF2 in at most 2^t steps */
If t=0 then 
        {if (CONF1 = CONF2
         or CONF1 goes to CONF2 in one step on machine N)
         then output TRUE else output FALSE}
For each CONF
  {If CHECK(CONF1,CONF,t-1) and CHECK(CONF,CONF2,t-1) then output TRUE}
Output FALSE

We can implement CHECK by storing only a constant number of configurations at each level of the recursion. With recursion depth t the total space is O(t·s(n)).

Let CONF0 be the initial configuration encoding x. We can now give our main routine.

MAIN
Let r be the smallest integer at least log_2(c^(s(n)))
For each CONF in an accepting state
   {If CHECK(CONF0,CONF,r) then output TRUE}
Output FALSE

Total space used: O(log_2(c^(s(n)))·s(n)) = O(s^2(n)).
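The CHECK recursion can be sketched in Python on an explicit graph standing in for the configuration graph (all names here are mine; Python's recursion of course stores whole stack frames, so this illustrates the logic, not the space bound):

```python
import math

def check(adj, u, v, t):
    """Can u reach v in at most 2**t steps? Savitch-style recursion:
    only u, v, and the current midpoint matter at each of the t levels."""
    if t == 0:
        return u == v or v in adj.get(u, ())
    # Try every possible middle configuration and recurse on each half.
    return any(check(adj, u, mid, t - 1) and check(adj, mid, v, t - 1)
               for mid in adj)

# Toy configuration graph: a path 0 -> 1 -> 2 -> 3.
adj = {0: [1], 1: [2], 2: [3], 3: []}
r = math.ceil(math.log2(len(adj)))  # 2**r >= number of configurations
print(check(adj, 0, 3, r))  # True
print(check(adj, 3, 0, r))  # False
```

Choosing r so that 2^r is at least the number of configurations suffices, since any reachable configuration is reachable by a path with no repeats.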

Monday, May 12, 2003

The Nerd Shot

Many years ago I was commuting home on the train with my wife and one of her colleagues. I showed them the group picture they had recently sent me from a Dagstuhl workshop I attended. My wife and her friend spent much of the train ride laughing at the collection of nerdly looking people in the photo. Ever since then we have jokingly called a conference group picture the nerd shot.

Computer scientists, especially in Europe, love to take pictures of other computer scientists. Problem is, computer scientists are not particularly photogenic nor, for the most part, good photographers. The worst is the group photo, where we are herded like cattle to some enclosed place where we stand until all the strays are rounded up and finally the photo is taken, sometimes several times by several people.

Some things have changed over time. Most photos are now taken digitally and get posted quickly, sometimes before the conference is over. My kids enjoy playing "Where's Daddy?" especially if I am out of town. But still the group photo experience remains the same.

I was talking to my wife on the phone at the last Dagstuhl and told her it was time for the nerd shot. She said I should fix my hair and I replied "What? You want me to stand out?"

Without further ado here is that nerd shot.

Thursday, May 08, 2003

Foundations of Complexity
Lesson 17: Space Complexity

Previous Lesson | Next Lesson

In addition to time, computer scientists also worry about the memory or space that a Turing machine uses. Roughly, one can measure space as the number of tape squares used by a Turing machine. We would like to talk about space bounds like log n and still be able to read the whole input so we need a slightly different model.

We now allow our Turing machine to have a read-only input tape, one or more work tapes and, for a Turing machine computing a function, a write-only output tape. On the input tape the head can move back and forth but it cannot change the values in any cell. On the output tape the head can only move right, writing as it moves. The amount of space a Turing machine uses on input x is the number of cells of the work tapes that it uses. We will assume a space bound of at least log n since we need log n bits to describe the location of the input head.

In real world terms, think of your computer accessing "the internet". You can still reach many pages even though you cannot store the entire internet on your computer.

Any machine that runs in time t(n) clearly also runs in space t(n). If a machine uses space s(n) and halts then it halts in time c^(s(n)) for some constant c; otherwise the machine would repeat a configuration and run forever.

We define the class DSPACE(s(n)) as the set of languages that are accepted by Turing machines using at most O(s(n)) space. NSPACE(s(n)) is the same for nondeterministic machines. We will always assume s(n)≥log n and s(n) is an easily computable function.

Common space classes include L=DSPACE(log n), NL=NSPACE(log n), PSPACE=∪_k DSPACE(n^k) and NPSPACE=∪_k NSPACE(n^k).

Unlike the P versus NP question, we know quite a bit about the relationship of deterministic versus nondeterministic space through the following two results.

  • Savitch's Theorem: NSPACE(s(n))⊆DSPACE(s^2(n)). In particular this means NPSPACE = PSPACE.
  • Immerman-Szelepcsényi Theorem: If L is in NSPACE(s(n)) then the complement of L is also in NSPACE(s(n)).
We will discuss these theorems in more detail in upcoming lessons.

Tuesday, May 06, 2003

Universal Search

Psst. Want to know the fastest algorithm for factoring? I can give you an algorithm that is within a constant multiplicative factor of the best possible factoring algorithms.

Actually this is an idea due to Levin. Let p1, p2, ... be a list of all programs. On input m, simulate program p1 for half of your computation time, p2 for a quarter of the time, p3 for an eighth of the time, etc., until one of these programs outputs a factor of m (any claimed factor is easily verified by division). If pi is the fastest algorithm for factoring then our algorithm will run in time at most 2^i times the running time of pi. The multiplicative factor 2^i is independent of m but unfortunately could be quite large.
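Here is a toy Python rendering of Levin's dovetailing (all names are mine, and for simplicity both candidate "programs" are the same trial-division routine; in reality they would range over all programs). Each program is a generator that does one step per yield, program i receives roughly a 2^-i share of the time, and a claimed factor is verified by division before being accepted:

```python
def trial_division(m):
    """A candidate 'program' modeled as a generator: one step per
    yield, eventually yielding a claimed factor."""
    for d in range(2, m):
        if m % d == 0:
            yield d
            return
        yield None

def universal_search(m, programs):
    """Levin-style dovetailing: in phase k, program i receives 2**(k-i)
    steps, so program i gets roughly a 2**-i share of the total time.
    Assumes m is composite; otherwise this loops forever."""
    running = [p(m) for p in programs]
    k = 0
    while True:
        k += 1
        for i, prog in enumerate(running[:k]):
            for _ in range(2 ** (k - i)):
                try:
                    result = next(prog)
                except StopIteration:
                    break
                if result is not None and m % result == 0:  # verify claim
                    return result

print(universal_search(91, [trial_division, trial_division]))  # 7
```

The verification step is what makes the scheme sound: we never have to trust a simulated program, only check its output.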

Marcus Hutter gives another algorithm that has a multiplicative factor of 5 but has a large additive constant. The trick is to spend some of your time searching for a proof that an algorithm is correct and runs in a certain amount of time. You then only need to simulate the provably fastest algorithm found so far.

Hutter's algorithm works only as fast as the provably best algorithm with a provable running time. It could very well be the case that there is some good heuristic for factoring that does not have a provable running time or proof of correctness. Levin's technique will capture this case.

Of course, neither of these papers gives a practical algorithm as the constants involved go beyond huge. Nevertheless it is still interesting to see the theoretical possibilities of universal search.

Sunday, May 04, 2003

FCRC

Even the largest theoretical computer science conferences draw at most a couple of hundred people. Many (but not all) other areas of computer science also do not draw large numbers of participants. To have a larger and more noticeable conference and possibly to get press attention, the CS community decided to hold a joint conference covering many different areas in computer science. In 1993, the first Federated Computing Research Conference was held in San Diego.

The press didn't show but with some cross-collaboration the conference was considered a mild success. After stops in Philadelphia and Atlanta, FCRC returns to San Diego June 7-14. The 2003 FCRC includes the main spring theory conference (STOC), Electronic Commerce (EC), Computational Geometry, Principles and Practice of Parallel Programming and many others. Registration deadline is May 7.

For the first time the Complexity Conference will not be part of FCRC as 2003 is a Europe year for us. I am planning to attend FCRC for STOC and the EC meeting. If you attend FCRC stop by and say hi.

Thursday, May 01, 2003

History's Loss/Mathematics' Gain

Some excitement at Schloss Dagstuhl this week. Localized high winds tore the metal plating off the roof of much of the new building Wednesday evening. Much of that roof sits in the courtyard--quite a sight. The good news is no one was injured and property damage, besides the roof, was minimal. A few of us had to switch rooms but the workshop goes on.

Let me finish off this centennial week with a story of Kolmogorov that I have heard from different sources, so some variation of it is likely true. Kolmogorov initially wanted to be an historian. He looked at the question of whether taxes in Russia in the Middle Ages were collected at the house level or at the village level. He analyzed tax data and showed that the data had a much simpler description if taxes were collected at the village level. (You can see the seeds of Kolmogorov complexity here.) He presented these results to the history faculty to great applause. Asking them whether he could publish such a paper he was told, "You only have one proof. You cannot publish in a history journal without at least two more proofs of your hypothesis." And so Kolmogorov left history for a field where one proof suffices.