Saturday, April 30, 2005

Clemens Lautemann

Some sad news from Thomas Schwentick.
What we have feared during the last weeks and months came true yesterday morning: Clemens Lautemann died at the age of 53 from cancer.
Most recently Lautemann had been working on logical characterizations of complexity classes, but I will remember him most for his beautiful proof that BPP, the class of languages with efficient probabilistic computations, lies in the second level of the polynomial-time hierarchy. In 1983 Sipser had shown BPP in the fourth level, and very soon after Gács and Lautemann independently showed BPP in the second level. Lautemann gave a very simple combinatorial proof that I consider one of the prettiest applications of the probabilistic method to complexity.
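For those who haven't seen it, here is the shape of the argument (my summary in the standard notation):

    % Lautemann's proof that BPP lies in Sigma_2^p (standard sketch).
    % First amplify the BPP machine so its error on length-n inputs is
    % below 2^{-n}; let m = poly(n) be the number of random bits used,
    % and let A_x be the set of random strings in {0,1}^m that lead to
    % acceptance on input x.
    \[
      x \in L \iff \exists\, z_1,\ldots,z_m \in \{0,1\}^m\;
                   \forall\, y \in \{0,1\}^m\;
                   \bigvee_{i=1}^{m} \bigl(y \oplus z_i \in A_x\bigr)
    \]
    % If x is in L then A_x is huge and the probabilistic method shows
    % some m shifts z_1,...,z_m of A_x cover all of {0,1}^m; if x is not
    % in L then A_x is too small for m shifts to cover everything. The
    % right-hand side is a Sigma_2 predicate, so BPP is in Sigma_2^p.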

Friday, April 29, 2005

The Quality Thesis

Too often Ph.D. theses in computer science consist of little more than a couple of "papers stapled together." A shame, since one can use the thesis to truly bring out the importance of one's research.

There is no serious upper page limit on a thesis, so you can spend the extra time to make yours stand out.

  1. Put the results of your earlier papers together in a common framework and add some new results you never bothered writing up. (Harry Buhrman's 1993 thesis has a large collection of results on exponential-time computations that I still often consult.)
  2. Take the time to expand the proof of complicated results to the right amount of intuition and depth. (For many years Madhu Sudan's 1992 thesis had the best write-up of the proof of the PCP theorem.)
  3. The initial chapters of your thesis can serve as an introduction to a relatively new research area. (Michael Kearns's 1989 thesis gave an early broad overview of computational learning theory.)
  4. Or give your own impressions of a more established field (Scott Aaronson's thesis expounds on his views of quantum computing.)
If you are looking for a job you'll be too stressed to do research anyway, so why not take the time to write a quality thesis that will get widely cited and possibly even widely read.

Thursday, April 28, 2005

My Mouse and Me

Today is take your daughter (and son) to work day so today's post is written by Annie Fortnow (age 10) on the topic of her choice.

This is a Dell Computer. It has 3 different parts. The first part is the screen. The screen is where you see what you are typing. The next part is the keyboard. The keyboard is the place where you type. The last part is the mouse. It is my favorite part because it helps me to click on the things that I want to click on.

There are many other parts of a Dell Computer. Those are the parts that I recognize the most. But the mouse is my favorite part. The mouse comes in many different shapes and sizes. On a laptop the mouse is either a circle in the middle, a square at the end, or both. On a regular computer the mouse is either an oval, a trackball, or looking like a mouse.

The mouse is a really important part of the computer. I will always keep mine handy.

Tuesday, April 26, 2005

The Story of Ribbit

Around 1980 for fun in New Jersey we would go visit the video arcades to play games like Asteroids, Pac-Man, Tempest, Missile Command and many others. I was not much of a player but I was fascinated by the games themselves and wondered what it was like to program them. Between high school and college, in the summer of 1981, a high school friend Chris Eisnaugle and I tried programming a few games on his Apple II. I remember getting a passable version of Asteroids working in a couple of days.
We talked with a computer magazine writer who said we could legally sell a game based on an arcade game as long as we changed the name and slightly changed the user interface. We focused on the game Frogger, which was not yet available for the Apple, and created Ribbit, written mostly over winter break. That spring we sold the program through a local computer store before we got a cease-and-desist order from Sierra Online, which had bought the personal computer rights to Frogger. So we ceased and desisted, but not before 1200 copies of the program had been sold. I made about $2000 from the program, not bad for a college freshman in 1982. Also, a computer magazine review of Frogger liked our program better!
"Sierra Online's Frogger is even worse than the game named after the sound a frog makes."
In the summer of 1982 I worked as an instructor/counselor at the Original Computer Camp, Inc. in Los Olivos, California, which had a series of two-week sessions. Early in the summer none of the campers had heard of Ribbit but later on quite a few had. Not because of the 1200 legal copies but because pirated versions of the game were widespread. At first I was quite upset at the piracy, even deleting the game from the disks of the campers who had it with them ("There is no honor among thieves," one such camper complained, referring to the fact that we had stolen the idea of the game from Frogger). But soon I realized that we weren't selling any more legal copies anyway and the game lived on through its pirated versions. Still, it wasn't long before Ribbit was mostly forgotten. On the webpage I set up for Ribbit I posted a pirated version of the game I found on the web. To get the original version I would have to find the disks buried somewhere in my mother's house and then find a machine that can read floppy disks from a time when disks were floppy.

Monday, April 25, 2005

scIenCE Princess

My daughters saw the movie Ice Princess over the weekend. Based on what they told me here is the basic story: Casey decides to do a science project on figure skating and uses physics and computers to help some skaters improve their routines. Casey's mom is really pushing her toward science and sets up an interview for an academic scholarship to Harvard (Note to Hollywood: Ivy League rules prohibit academic scholarships). But Casey falls in love with figure skating, goes against her mother, says no to Harvard and follows her new dream of skating.

I have nothing against "follow your dream" movies, and Ice Princess does put science and computers in a good light, at least in the early part of the movie. But just once can't we have a movie about a young woman whose parents want her to be a great figure skater, gymnast or tennis player but who instead follows her dream of becoming a scientist?

Saturday, April 23, 2005

Favorite Theorems: NP-Completeness

March Edition

This month we honor the papers that gave us the first NP-complete problems and marked the official beginning of the P versus NP question. The P and NP notation did not originate with these papers but I will use them anyway for clarity.

Steve Cook, The Complexity of Theorem-Proving Procedures, STOC 1971.

Suppose a nondeterministic Turing machine M accepts a set S of strings within time Q(n), where Q(n) is a polynomial. Given an input w for M, we will construct a propositional formula A(w) in conjunctive normal form such that A(w) is satisfiable iff M accepts w.
In this paper Cook gives the first formal treatment of the P versus NP problem. He gives several examples of NP problems and notes his notion of NP is equivalent to a class of extended positive rudimentary relations due to James Bennett. He introduces P-reducibility (now often called Cook reductions) where one reduces a problem A to a problem B by solving A using a polynomial-time machine with access to an oracle for B. His main theorems show that Tautology and Subgraph Isomorphism are hard for NP under P-reducibility. The quote above came from the beginning of the proof for Tautology.
The theorems suggest that it is fruitless to search for a polynomial decision procedure for the subgraph problem, since success would bring polynomial decision procedures to many other apparently intractable problems…The theorems suggest that Tautology is a good candidate for a set not in P, and I feel it is worth spending considerable effort trying to prove this conjecture. Such a proof would be a major breakthrough in complexity theory.
Indeed.
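Cook's P-reducibility is more general than the many-one reductions we usually use today: the reducing machine can make many adaptive oracle queries. As a toy illustration of that power (my own sketch, not anything from Cook's paper), here is the classic search-to-decision reduction, which finds a satisfying assignment given only an oracle that decides satisfiability:

    # A toy illustration of a P-reduction (Cook reduction): finding a
    # satisfying assignment with polynomially many adaptive calls to an
    # oracle that only *decides* satisfiability. The oracle here is a
    # brute-force placeholder; any SAT decider could be plugged in.

    from itertools import product

    def sat_oracle(clauses):
        """Decide satisfiability by brute force (placeholder oracle).
        clauses: list of clauses; each clause is a list of nonzero ints,
        where k means variable k is true and -k means it is false."""
        variables = sorted({abs(lit) for c in clauses for lit in c})
        for bits in product([False, True], repeat=len(variables)):
            value = dict(zip(variables, bits))
            if all(any(value[abs(l)] == (l > 0) for l in c) for c in clauses):
                return True
        return False

    def find_assignment(clauses):
        """Search-to-decision: fix variables one at a time, asking the
        oracle whether the formula stays satisfiable. Makes at most one
        oracle call per variable, plus one up front."""
        if not sat_oracle(clauses):
            return None
        assignment = {}
        variables = sorted({abs(lit) for c in clauses for lit in c})
        for v in variables:
            # Try v = True by adding the unit clause [v].
            if sat_oracle(clauses + [[v]]):
                assignment[v] = True
                clauses = clauses + [[v]]
            else:
                assignment[v] = False
                clauses = clauses + [[-v]]
        return assignment

    # Example: (x1 or x2) and (not x1 or x2)
    print(find_assignment([[1, 2], [-1, 2]]))  # {1: True, 2: True}

The reduction makes one oracle call per variable, so a polynomial-time decision procedure for satisfiability would immediately give a polynomial-time search procedure.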


Leonid Levin, Universal'nyie Perebornyie Zadachi (Universal Search Problems), Problemy Peredachi Informatsii 1973. Translation in appendix of the Trakhtenbrot survey.

If we assume that there exists some (even if artificially formulated) problem of the search type that is unsolvable by simple (in terms of volume of computation) algorithms, then it can be shown that many "classical" search problems have the same property.
Levin claims that any NP problem reduces to each of a list of problems including tautology, subgraph isomorphism, tiling, set cover and a few others. As was the Russian tradition of the times, Levin's paper does not have fully formal definitions or any proofs. Still, he had similar results and the same insight as Cook that one can reduce any NP problem to specific natural problems.
All of these problems are solved by trivial algorithms entailing the sequential scanning of all possibilities. The operating time of the algorithms, however, is exponential, and mathematicians nurture the conviction that it is impossible to find simpler algorithms…but not one has yet succeeded in proving it.
Today a new result gets transmitted around the world in minutes. But given the technological and, even more importantly, the political situation of the 70's, Cook and Levin did not learn of each other's work until years later. We now give them both credit for taking the P versus NP problem out of the abstract and connecting it to the solution of natural problems. Only three decades later, the P versus NP problem has become one of the great open questions in all of mathematics.

Thursday, April 21, 2005

A Fine Line Between Prank and Fraud

You have probably heard this story by now. Some MIT students had a computer-generated paper accepted to a non-reviewed session of the Systemics, Cybernetics and Informatics conference. I haven't mentioned the story earlier because I didn't want to give the students extra publicity, but now that the story has hit the AP wire something needs to be said: What these students did was just plain wrong.

I'm no big fan of the SCI conference but virtually none of the conferences in computer science fully referee their submissions. A clever student could write a paper with a bogus proof and have a chance of that paper being accepted at a major conference like STOC. I would consider someone who intentionally submits a bogus paper to STOC guilty of academic fraud. Why are these MIT students any different?

Students make mistakes and we should tell them what they did was wrong instead of just glorifying such activities.

Wednesday, April 20, 2005

And The Winner Is …

Haipeng Guo wins the math poetry contest with the poem below. Congratulations, and thanks to all who participated.

When a P-man loves an NP-woman

Been a happy deterministic man
With a simple polynomial brain
I contented myself with P problems,
And always looked at NP with disdain.

Fell in love with a polynomial woman,
But with a non-deterministic wit,
She said she would marry me,
Only if I could show her that P=NP.

I rushed to the library and studied,
Asked Garey & Johnson for a hint to the truth,
They said "this is quite a hard question",
But none of them had a hint or a clue.

Went to church and prayed to The Almighty,
"Please Oh Lord, give me a lead the truth",
"Don't waste your time son", a voice said laughing,
For I myself on this wasted my youth.

First oracle says you will marry
Second one tells you you'll split
Time moves, paths branch, results may vary
Accept the state that finally fits

If you finally marry this girl,
And P=NP was true,
What a Chaos: E-banking unsafe, Salesmen traveling cheaply!
And mathematicians with nothing to do!

If I grant your happiness,
The precondition must be no witness,
Even you both did nothing completely wrong,
The punishments will be exponentially long.

If you really want to marry this woman,
Then randomness might be the only key,
But please stop praying for an answer to me,
For I could not decide on this P=NP!

Tuesday, April 19, 2005

A New PCP Proof

There is some buzz about a new construction of probabilistically checkable proofs by Irit Dinur. The PCP theorem, first proved by Arora, Lund, Motwani, Sudan and Szegedy in the early 90's, states that every language in NP has a proof that can be verified randomly using O(log n) random bits and a constant number of queries. The PCP theorem has had many applications to showing hardness of approximation results and has had many improvements such as Håstad's tight result that I highlighted last year.
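In the standard notation (my restatement; the exact soundness constant varies by presentation) the theorem reads:

    % The PCP theorem: every language in NP has proofs checkable by a
    % verifier that flips O(log n) coins and reads O(1) bits of the
    % proof, accepting correct proofs of true statements with
    % probability 1 and any purported proof of a false statement with
    % probability at most 1/2.
    \[
      \mathrm{NP} = \mathrm{PCP}_{1,\,1/2}\bigl[\,O(\log n),\; O(1)\,\bigr]
    \]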

The previous proofs used considerable algebraic techniques. Dinur takes a more combinatorial approach, using a powering and composition technique (inspired by Reingold and the zig-zag product) to gradually amplify the gap in 3SAT without blowing up the size of the instance.

An upcoming STOC paper by Ben-Sasson and Sudan gives a PCP for SAT of only quasilinear size but requiring polylogarithmic queries to verify the proof. Dinur, by applying her construction to those PCPs, can now create quasilinear-size proofs which only need a constant number of queries for verification.

Sunday, April 17, 2005

Discussion Questions

New Balance has been heavily advertising some discussion questions about sports, so I thought I would give my own discussion questions about academics.
  • You've been working hard on a research problem and someone else solves it. Does that make you feel happy or sad?
  • Three people in an office. Two of them bounce ideas back and forth to prove a new theorem while the third just tries to keep up. Should the third person be a co-author?
  • You are reviewing a paper and see an easy but major improvement to the paper's main result. What do you do?
  • Your friend is applying to your university and you see that one of his recommenders wrote a weak letter. Do you tell your friend?
  • Your advisor of the opposite sex has two tickets to a concert you really want to see and invites you to join him/her. Would you go?
  • You discover a student wrote something strongly negative about a colleague on their weblog. What would you do, if anything?
  • Would you still be a scientist if you could do research but all your work had to be published anonymously?
Go forth and discuss.

Friday, April 15, 2005

The Translation Lemma

A simple trick that every complexity theorist should know but, based on some recent conversations, not every complexity theorist does know. Roughly speaking, collapses for small resource bounds imply collapses for large resource bounds. Here is the result for NTIME (nondeterministic time) and DTIME (deterministic time), but the proof works for nearly every pair of complexity measures.

Translation Lemma: Let f(n), g(n), h(n) be reasonable (time-constructible) functions with h(n)>n. Then

NTIME(f(n))⊆DTIME(g(n)) implies NTIME(f(h(n)))⊆DTIME(g(h(n))).

The proof uses a technique known as padding. Let L be in NTIME(f(h(n))) via a machine M. Define A by

A = {x01^(h(|x|)-|x|-1) | x in L}
We can check whether x01^(h(|x|)-|x|-1) is in A by simulating M(x), which takes nondeterministic time f(h(|x|))=f(m) where m=h(|x|) is the length of the padded input. So A is in NTIME(f(m)) and by assumption in DTIME(g(m)).
Now if we want to compute whether x is in L we can run the DTIME(g(m)) algorithm for A on x01^(h(|x|)-|x|-1), taking total time g(h(n)) since the padded input has size m=h(n). QED
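To make the padding concrete, here is a small illustration of the translated algorithm (my own sketch; in_L, pad, decide_A and decide_L are toy stand-ins, not part of the lemma):

    # A toy illustration of the padding argument; all names here are my
    # own hypothetical stand-ins, not part of the lemma itself.

    def h(n):
        # The padding function; h(n) = 2^n gives the P=NP implies E=NE
        # application below.
        return 2 ** n

    def in_L(x):
        # Stand-in for the "hard" language L (here: binary palindromes).
        return x == x[::-1]

    def pad(x):
        # Map x to x 0 1^(h(|x|)-|x|-1), a string of length exactly h(|x|).
        n = len(x)
        return x + "0" + "1" * (h(n) - n - 1)

    def decide_A(w):
        # Stand-in for the assumed fast algorithm for the padded language
        # A: accept strings of the form pad(x) with x in L. A full proof
        # would also verify that w has exactly the right form.
        x = w.rstrip("1")[:-1]   # strip the 1-block, then the 0 marker
        return len(w) == h(len(x)) and in_L(x)

    def decide_L(x):
        # The translation: to decide whether x is in L, pad it and run
        # the fast algorithm for A on an input of length m = h(|x|), for
        # total time g(h(|x|)) plus the time to write the padding.
        return decide_A(pad(x))

    print(decide_L("0110"))  # True: "0110" is a palindrome
    print(decide_L("011"))   # False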

As an immediate consequence we get that if P=NP then E=NE (where E=DTIME(2^(O(n)))) by letting h(n)=2^n.

The translation lemma works in only one direction: you need h(n)>n, and you cannot "unpad" an arbitrary x. It's open whether E=NE implies P=NP, for instance.

The lemma has many applications. Here is one example.
Theorem: If P=NP then some language in E does not have subexponential-size circuits.

Proof: In the Σ4 level of the E-time hierarchy we can compute the lexicographically-first language A that cannot be simulated by 2^(n/2)-size circuits. If P=NP then the polynomial-time hierarchy collapses to P. By a version of the translation lemma with h(n)=2^n, the polynomial-time hierarchy collapsing to P implies the E-time hierarchy collapses to E. Thus A is in E but A does not have subexponential-size circuits.

Wednesday, April 13, 2005

A Modest Proposal

A guest post by Michael Mitzenmacher

Lance nicely invited me to expand on my views on the format for conference submissions. Currently, I am on a program committee using the standard theory call:

A submission for a regular presentation must be no longer than 10 pages on letter-size paper using at least 11-point font…additional details may be included in a clearly marked appendix, which will be read at the discretion of the program committee.
We have actually had discussions on whether to reject out of hand papers that use 10 point font or otherwise violate this standard.

The problem is that many, including myself, think that this formatting rule is silly, and so it has been widely ignored or at least painfully abused for many years. I would like to propose a simple and logical alternative: conference submissions should be in the same format (or as near an equivalent as possible) as the final conference version. Many other conferences (such as AAAI and Sigmetrics) use this approach with great success.

The advantages of this approach include:

  1. It reduces the work of the authors. Right now, authors have to create entirely distinct submission versions and final versions of conference papers using various formats. Most authors find this a hassle, and this is my main reason for the proposal. I hate writing the same conference paper multiple times just to cope with formatting issues.
  2. It gives the reviewer a more accurate picture of the conference paper. Reviewers will have a very good idea of what the paper will look like in the conference proceedings, making it easier to judge. When you're staring at 20+ pages of appendices, it is hard to tell what the final paper will look like.
  3. It enhances fairness. Because this is a standard with a clear reasoning behind it -- you cannot have a longer submission than the final conference paper -- people are more likely both to follow and enforce the rule, avoiding potential unfairness.
I have heard of some disadvantages of this approach. Let me attempt to dispense with them.
  1. The format is too hard for the reviewers to read.

    My response: If this is the case, then perhaps the conference paper format itself should be changed -- after all, don't we expect many people to actually read the conference version? If the conference paper is packed tight for other reasons (the publishers charge by the page), then design as near an equivalent format as possible for submissions. If we find that 10 double-column 10-point pages with style file A essentially equal 20 single-column 11-point pages with style file B, then clearly state that in the call and ask for the latter. (Luca Trevisan pointed out this is done for the Complexity Conference already.)

  2. Appendices are necessary when there are long proofs that won't fit in the paper.

    My response: If the proofs won't fit in the final conference paper, this is something a reviewer should see and know. The program committee can either allow appendices, with the knowledge they won't have room to appear, or allow pointers to more complete versions (TRs, arXiv preprints) that the reviewers can examine if they desire.

  3. By having different formats, we force authors to revisit and hopefully improve their paper.

    My response: Nice intentions, but don't people already want to make their published work as good as possible? This seems unnecessary, and not worth the price.

I ask all program committee chairs to please consider this modest proposal.

Tuesday, April 12, 2005

Paper Pet Peeves

Little things that annoy me in research papers.
  • Declarative first sentences of the introduction, like "Analyzing Left-Handed 12-SAT is a key approach to solving the P versus NP question." Just because you say it doesn't make it true.
  • "We use novel techniques that might be of independent interest." A double faux pas. You don't get to call your own techniques novel. "Might be of independent interest" is such a meaningless statement.
  • Footnotes (and parenthetical statements) which interrupt the flow of the paper. If it's not worth mentioning in the text then don't mention it.
  • Using citations as nouns like "[13] using techniques of [6] showed the main result of [4] follows easily from [18]." I hate having to keep flipping to and from the references to read these papers.
  • Using the cliché "larger than the number of atoms in the known universe." It's big. We get it.
  • Using the word "respectively" which says "I'm going to give you something hard to parse because I'm too lazy to write two sentences."
  • Titles with symbols or complexity classes: If you can't describe your research with words you might consider becoming a mathematician.

Sunday, April 10, 2005

Does a book exist if nobody reads it?

A recent conversation with a graduate student.
Student: I couldn't find the paper online.
Me: So walk over to the library and get the paper there.
Student: But you can't take those books out and I don't want to spend hours at the library reading the paper.
Me: So make a copy and take it home.
Student: The library has copy machines?
Here is where I tell the story that when I was a graduate student and wanted a paper I walked five miles barefoot in blizzard conditions (actually two flights of stairs) to the library, or would send a stamped self-addressed envelope to an author. Not that we should go back to those times but don't ignore papers just because you can't find them online.

The next generation gets even worse. From a discussion in an undergrad class.

Student: I searched really hard for this topic and didn't find much. Are you sure it even exists outside of class?
Me: Really, I'm sure the math library [right down the hall] has several books on the topic.
Student: Oh, I just used Google.
Are we really getting to the point that if something isn't on the internet (or even on the internet but Google doesn't find it) then it doesn't exist?

Friday, April 08, 2005

The Battle of Grantsburg

From FYI:
Challenges to the teaching of evolution in public schools across the country have prompted National Academy of Sciences President Bruce Alberts to write to all members of the Academy. Warning of "a growing threat to the teaching of science," Alberts calls on Academy members, if such a controversy arises in their state or school district, to take actions against "attempts to limit the teaching of evolution or to introduce non-scientific `alternatives' into science courses and curricula."
Let me take you to the trenches in such a school district, Grantsburg, a small town in northwestern Wisconsin.

A good college friend of mine, who grew up in suburban Connecticut, took a family practice position in Grantsburg after finishing medical school and residency. He got married, had kids and grew to like the rural life. But recently he got involved in a nasty battle with the school board.

The board last fall, after viewing an anti-evolution movie, had authorized teachers to teach "alternate theories of evolution" in the science curriculum. Last December, under some pressure, the board changed the policy to read:

Students are expected to analyze, review and critique scientific explanations, including hypotheses and theories, as to their strengths and weaknesses, using scientific evidence and information. Students shall be able to explain the scientific strengths and weaknesses of evolutionary theory.
Not much of an improvement. So the opponents brought in local professors to discuss the importance of evolution, but this failed to sway the board.

So finally they tried to replace the board with a slate of science-friendly candidates but just this week they lost that fight as well.

And so my friend, who simply wanted to be a good country doctor, has made some enemies and worries about the education his kids will receive.

Wednesday, April 06, 2005

Baseball is Back

A perfect spring day in Chicago and all of my afternoon meetings mysteriously canceled so I took visiting Portuguese professor and avid soccer fan Luis Antunes to his first baseball game, the Cleveland Indians against the White Sox.

Luis was sure he would be bored. We got awesome seats behind home plate and I introduced him to the full baseball experience with Polish sausages for lunch and "Take Me Out to the Ball Game" during the seventh-inning stretch. Luis knew little about baseball but pretty soon he concentrated on every pitch, counting balls and strikes. During the game he even said "baseball is not quite as boring as I had feared." The experience became complete when the Sox scored four runs in the bottom of the ninth to win the game.

Yes, baseball is back and all is good in the world.

Monday, April 04, 2005

Math Poetry Contest

From a poster in my building:
What is the longest song?

"ℵ0 bottles of beer on the wall."

Happy Mathematics Awareness Month!

April is also National Poetry Month. In honor of April I am running my first (and perhaps last) annual math poetry contest. The winner will receive a copy of Complexity of Computations and Proofs (Jan Krajíček, editor), volume 13 of Quaderni di Matematica, Dipartimento di Matematica della Seconda Università di Napoli, 2004.

Submit your new original poem on a mathematics or theoretical computer science theme in the comments section of this post with your name and/or email. One entry per person. Entries are due by 11:59 PM CDT on Monday, April 18. A panel of celebrity judges will choose the winning poem based on whatever criteria they deem fit. The decision of the judges is final.

Update: And the winner is…

Sunday, April 03, 2005

What happened to the PRAM?

When computational complexity gets accused of having no connection to reality, I bring up the story of the PRAM (Parallel Random Access Machine), a complexity model killed by technology.

Today we think of the class NC in terms of circuits: NC contains the problems solvable by circuits of polynomial size and polylogarithmic depth. But Nick Pippenger originally defined the class to capture parallel computation: problems solvable on a PRAM with a polynomial number of processors in polylogarithmic time. The PRAM model had several processors that shared a polynomial amount of random-access memory. There were three main variations (a small simulation sketch follows the list):

  • EREW–Exclusive Read/Exclusive Write: Every memory cell can be read or written only by one processor at a time.
  • CREW–Concurrent Read/Exclusive Write: Multiple processors could read a memory cell but only one could write at a time.
  • CRCW–Concurrent Read/Concurrent Write: Multiple processors could read and write memory cells. This variation had several subvariations depending on how one handled conflicting writes.
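To give a feel for the model, here is a toy round-by-round simulation (my own sketch, not a real parallel implementation) of the classic EREW algorithm summing n numbers with about n processors in O(log n) time:

    # A toy simulation of an EREW PRAM summing n values in O(log n)
    # rounds. In each round, "processor" i adds cell i + stride into
    # cell i; no two processors read or write the same cell in a round,
    # so the algorithm is exclusive-read/exclusive-write.

    def pram_sum(values):
        mem = list(values)        # the shared memory
        n = len(mem)
        stride = 1
        rounds = 0
        while stride < n:
            # All of these additions happen "simultaneously" in one
            # round; each cell is touched by at most one processor.
            for i in range(0, n - stride, 2 * stride):
                mem[i] = mem[i] + mem[i + stride]
            stride *= 2
            rounds += 1
        return mem[0], rounds

    total, rounds = pram_sum(list(range(16)))
    print(total, rounds)  # 120 4 -- sixteen values in log2(16) = 4 rounds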
PRAMs were criticized for the unrealistic nature of immediately addressable shared parallel memory. Areas like VLSI and more realistic parallel models (like the butterfly network) worked to address these concerns. However, while algorithmicists worried about the various PRAM models, simulating a PRAM on these more realistic networks costs only a logarithmic factor in time and doesn't affect the complexity class NC.

So why don't we think PRAM anymore when we look at NC? Moore's Law. Processors got faster. Much, much faster. The idea of having many, many processors each doing a tiny bit of work seems wasteful these days when we can just as cheaply have each processor do a lot of work.

We still see active research in parallel computing and one can speed up many computations using a large number of machines sometimes far away from each other just connected via the internet. But the best one could hope for is perhaps a quadratic improvement, not the exponential improvement that comes from PRAMs.

Friday, April 01, 2005

Another Breakthrough!

Speaking of space complexity, Adam Kalai, Adam Klivans and Rocco Servedio have extended Reingold's result to show that every language in randomized logarithmic space has a deterministic log-space simulation, i.e., RL = L. Cool.

You can find a copy of their paper here.