Tuesday, December 30, 2003

End of Year Thoughts

We had another solid year for theoretical computer science and computational complexity, with many exciting results such as the polynomial-time algorithm for recognizing perfect graphs developed in a series of papers by Chudnovsky, Cornuéjols, Liu, Seymour and Vušković. We also saw a number of strong papers in derandomization, extractor construction, dimension reduction and many other areas of complexity.

In 2003 we celebrated the 100th anniversary of the births of Alonzo Church, Andrey Kolmogorov, John von Neumann and Frank Ramsey, mathematicians whose work played a big role in complexity. Next up, Kurt Gödel in 2006.

Trends to watch for in 2004:

• How will the reorganization of CISE and the end of the ITR affect NSF funding for theoretical computer science?
• Will a hopefully improving economy push universities to increase their resources in computer science?
• What trends will we see in recruiting this year? More and more Ph.D.'s seem to prefer postdoc positions, though the number of these positions continues to decrease, especially in industry.
• How will the choice of the US president in 2004 affect the future of scientific research in America? A critical question, but not one that a journalist will ask in any debate.
Happy new year to all! May we prove P≠NP in 2004.

Sunday, December 28, 2003

John von Neumann

Today is the 100th anniversary of the birth of John von Neumann, one of the greatest mathematicians of the 20th century. Let me discuss two aspects of his work, one big, one small, that have greatly affected computational complexity.

The von Neumann min-max theorem showed that every finite zero-sum two-person game has optimal mixed strategies. More formally, let A be the payoff matrix of a game, then

max_x min_y x^T A y = min_y max_x x^T A y
where x and y are probability vectors.
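As a quick sanity check (my own sketch, not from the post), a brute-force grid search over mixed strategies for the 2x2 "matching pennies" payoff matrix shows the two sides of the identity agreeing at the game's value of 0:

```python
# Hypothetical illustration of the min-max theorem: matching pennies.
A = [[1, -1],
     [-1, 1]]

def value(p, q):
    """Expected payoff x^T A y when the row player plays heads with
    probability p and the column player with probability q."""
    x, y = (p, 1 - p), (q, 1 - q)
    return sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))

grid = [i / 100 for i in range(101)]  # probabilities 0.00, 0.01, ..., 1.00
max_min = max(min(value(p, q) for q in grid) for p in grid)
min_max = min(max(value(p, q) for p in grid) for q in grid)

# Both sides equal the game value 0, achieved by the (1/2, 1/2) mix.
print(abs(max_min) < 1e-9, abs(min_max) < 1e-9)  # True True
```

The grid search only approximates the optima, but for this game the optimal mixed strategy (1/2, 1/2) lies on the grid, so both quantities come out at the exact game value.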

Andrew Yao used the min-max theorem to prove what we now call the Yao Principle: the worst-case expected running time of the best randomized algorithm equals the expected running time of the best deterministic algorithm on a worst-case distribution of inputs. The Yao Principle has proven invaluable for proving upper and lower bounds for deterministic and probabilistic algorithms.

How can you get a fair coin by flipping a coin of unknown bias? You use the von Neumann coin-flipping trick: Flip the biased coin twice. If you get heads then tails output HEADS. If you get tails then heads output TAILS. Otherwise repeat. This procedure will output HEADS or TAILS with equal probability and if the bias is not too close to zero or one the expected number of repetitions is relatively small.
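A quick simulation of the trick (my own sketch; the 70% bias is an arbitrary choice):

```python
import random

def fair_flip(biased_flip):
    """Von Neumann's trick: flip the biased coin twice; heads-tails
    means HEADS, tails-heads means TAILS, otherwise try again."""
    while True:
        first, second = biased_flip(), biased_flip()
        if first != second:
            return "HEADS" if first else "TAILS"

random.seed(2003)
biased = lambda: random.random() < 0.7  # a coin with (unknown) bias 0.7
results = [fair_flip(biased) for _ in range(100000)]
frequency = results.count("HEADS") / len(results)
print(abs(frequency - 0.5) < 0.01)  # True: the output coin is fair
```

The two orderings heads-tails and tails-heads each occur with probability p(1-p), which is why the output is fair no matter the bias.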

The von Neumann coin flipping trick is the first in a long line of research in complexity extracting random bits from weak random sources.

John von Neumann passed away February 8, 1957 in Washington, DC.

Friday, December 26, 2003

That's My State

Nothing like this Chicago Tribune story to ruin your holidays. Apparently the Illinois Board of Higher Education has launched a study of faculty productivity, looking at everything from what kinds of research projects faculty undertake to how much time they spend in the classroom. According to board chairman James Kaplan, "there's got to be a tangible, measurable benefit for the people of the state of Illinois for a professor doing research." Nothing good can come from this.

The University of Chicago is a private school unaffected by this study, but I would hate to see my friends at the various U. Illinois campuses have to take unpaid leave to go to conferences.

Update 12/29: Daniel Drezner dissects this article. Also, I talked with a University of Illinois-Chicago professor who worries less about this study than budget cuts in the University of Illinois system due to the continual fiscal crisis in the state.

Monday, December 22, 2003

Scientific Superstars

On Saturday I visited the Einstein Exhibit at Chicago's Field Museum. Some manuscripts and letters and a nice exhibit explaining why time must vary if the speed of light remains a constant made this an interesting but not a must-see exhibit. The biggest surprise for me came from seeing how Einstein's fame happened overnight instead of the more gradual rise I would have expected. In 1919 a solar eclipse showed that light from stars does bend from gravitational forces. Einstein's fame grew immediately, and his name became synonymous with genius.

This superstardom for a scientist doesn't seem to happen today. When Andrew Wiles proved Fermat's last theorem he got some deserved attention, but he never became a true household name. When you realize Wiles has hit the upper limit of fame a mathematician can receive (ruling out people like Ted Kaczynski and John Nash), you can see that the Einstein effect in science may never return.

On the other hand, University of Chicago paleontologist Paul Sereno headlines the social page of the Chicago Tribune at the "Party With Giants." Perhaps scientists can still achieve more than fifteen minutes of fame after all.

Thursday, December 18, 2003

Does NP=UP?

Time for another of my favorite open problems.

Does NP=UP imply the polynomial-time hierarchy collapses?

UP is the class of languages accepted by nondeterministic polynomial-time Turing machines that have at most one accepting computation on every input.

This problem has loose connections to Valiant-Vazirani but Hemaspaandra, Naik, Ogiwara and Selman have the most closely related result. Consider the following proposition.

(*) There is a set A in NP such that for every satisfiable formula φ there is a unique satisfying assignment a of φ such that (φ,a) is in A.

Hemaspaandra et al. show that (*) implies the polynomial-time hierarchy collapses to the second level.

For all we know, (*) and NP=UP are incomparable. If (*) holds for some A in P then NP=UP by just guessing a. We don't know whether NP=UP implies (*) since the accepting computations of a UP machine accepting SAT need not reveal a satisfying assignment of a formula.

There exists an oracle relative to which UP=NP≠co-NP. A relativized world with UP=NP and Σ2^p≠Π2^p remains open.

Monday, December 15, 2003

Does PowerPoint Make You Dumb?

Yesterday the New York Times Magazine had its annual "The Year in Ideas" issue. Ideas Futures Markets in Everything and Proving You're Human I had discussed in earlier posts here and here. Instead let us discuss whether PowerPoint Makes You Dumb?

As an avid PowerPoint enthusiast, I found the Times discussion quite misleading. The NASA scientists, following the advice of Tufte, filled their slides with so much information that they became impossible to follow. One should not try to convey complex ideas in a presentation; rather give an overview of those ideas and leave the messy details to technical articles. PowerPoint makes it easy to give overview talks and hard to give talks full of ugly details, crammed slides and detailed formulas, which almost always leads to better talks.

Friday, December 12, 2003

Ordering Authors

Every scientific field has its own rules for the order of authors on a paper. In theoretical computer science, tradition dictates that we list the authors alphabetically by last name. I don't agree with this tradition; rarely do all the co-authors of a paper play an equal role. The decision whether to add someone as a co-author, and thus an equal, often becomes difficult.

But breaking with tradition can have its own problems. I have three papers that break the alphabetical rule, though two were in biology, which has its own rules. In the other, back in 1990, Carsten Lund, a graduate student at the time, made the key step in developing an interactive proof system for the permanent. For that we made him first author of the Lund-Fortnow-Karloff-Nisan paper. In retrospect I regret this decision. It only added confusion for those who cited the paper. And did Lund not play as important a role in other papers where we kept alphabetical order? Breaking with tradition, even with the best of intentions, can often cause more harm than good.

Wednesday, December 10, 2003

Seven Naughty Words

Want an easy rule to greatly improve your writing? Just avoid the following words, particularly in the abstract and introduction of your papers.

am is are was were be been

Avoiding these seven forms of "to be" will force you to write in the active voice instead of the passive, making your sentences less boring. For example, instead of "It is known that all functions can be computed securely in the information theoretic setting" use "We can compute all functions securely in the information theoretic setting."

Taking this rule to the extreme can lead to some very convoluted sentences but, I promise, forcing yourself to think actively about every statement you write will make a great difference in your prose. In almost all cases the right answer is "not to be."
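For fun, here is the rule mechanized as a short script (my own sketch): scan a draft for the seven words so you know which sentences to rewrite.

```python
import re

# The seven forms of "to be" from the rule above.
NAUGHTY = re.compile(r"\b(am|is|are|was|were|be|been)\b", re.IGNORECASE)

def flag_to_be(text):
    """Return each 'to be' form found in the text, in order."""
    return NAUGHTY.findall(text)

passive = "It is known that all functions can be computed securely."
active = "We can compute all functions securely."
print(flag_to_be(passive))  # ['is', 'be']
print(flag_to_be(active))   # []
```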

Monday, December 08, 2003

Two Uses of Randomness

Over the last 15 years, two very active research areas seem at odds. Derandomization results have shown us that we can often remove randomness from computation but interactive proof systems and PCPs exhibit incredible power from randomness. There is no contradiction here, just two very different ways we use randomness in complexity: for searching and for hiding.

Typically we think of randomness for searching, for example finding disagreements with Fermat's little theorem to show a number is composite or taking random walks on graphs to show they are connected. Derandomization results have given us reasons to believe we can replace the randomness in these computations with pseudorandom number generators.
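A sketch of the searching use of randomness (my own example, not from the post): search for a randomly chosen disagreement with Fermat's little theorem. This simple test is fooled by Carmichael numbers, which Miller-Rabin repairs, but the role randomness plays is the same.

```python
import random

def probably_prime(n, trials=20):
    """Search for a random witness a with a^(n-1) mod n != 1, which
    by Fermat's little theorem proves n composite."""
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:
            return False  # found a witness: n is composite
    return True  # no witness found: n is probably prime

random.seed(1)
print(probably_prime(101))  # True: 101 is prime
print(probably_prime(221))  # False: 221 = 13 * 17
```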

Randomness can also play the role of hiding, since no one can predict future coin tosses. In interactive proofs we make the jump from NP to PSPACE because of randomness. For PCPs with O(log n) queries the jump goes from P to NP and with poly queries from NP to NEXP, in the latter case classes which are provably different. In all these cases the prover cannot cheat because it cannot predict coin tosses not yet made by the verifier. A verifier using a pseudorandom generator will fail here, since the prover could then predict the verifier's actions.

AM protocols have the verifier flip its coins first, so there is no hiding going on; rather the verifier searches for a statement Merlin can prove, and we expect some derandomization for AM. The result that MA is in AM says that sometimes we can replace hiding randomness with searching randomness.

Thursday, December 04, 2003

A Regular Problem Solved

Paz Carmi, Yinnon Haviv and Enav Weinreb from Ben-Gurion University have solved the regular language problem I posted last month.

The problem came from Janos Simon based on a homework question in Kozen's book. Let L(A)={x | x^m is in A for some m}. The homework question asked to show that L(A) is regular if A is regular. The question Janos asked was how many states we need for a DFA for L(A) if A has n states. Carmi, Haviv and Weinreb show that an exponential number of states is required.

Not only did they solve the problem, they also sent me this nice write-up of the proof. I believe this is the first time someone has directly solved a problem I've given on this weblog. I owe each of them a beer (or non-alcoholic beverage of their choice).

Update 12/9: I received the following today from Markus Holzer.

It seems that I have missed your posting about the problem last month. The problem you have stated was already solved in June by Jeff Shallit and co-authors. They have given a lower bound on the DFA accepting root, by considering the (largest) syntactic monoid induced by two generators. The latter problem on syntactic monoid size is of its own interest and I was working on that for a while, therefore I know the result of Shallit et al on the root descriptional complexity. Maybe you also owe the beers to Shallit et al.

Wednesday, December 03, 2003

The Elsevier Dilemma

The Cornell University Library has announced it will drop a substantial number of their Elsevier subscriptions, part of a general problem Cornell and other libraries are facing with higher costs and different pricing models from commercial academic publishers.

I have wanted to write a post on this topic for a while but I find it difficult to truly understand the problems or the potential solutions. Elsevier does a nice job with their portal and their publishing, but because of consolidation and cheap distribution via the internet, they have changed their pricing model in ways that make it difficult for many libraries to afford all of the journals that they need.

This poses some moral questions: Should we avoid submitting our papers to Elsevier journals? Is it wrong for me to serve on the editorial board of the Elsevier-published Information and Computation? I just don't know.

Tuesday, December 02, 2003

The NSF and SIGACT News

First an update on NSF program solicitations: The Formal and Mathematical Foundations cluster has posted its solicitation, which includes computational complexity. The deadline is March 4. The program announcement for the Emerging Models and Technologies for Computation cluster, which includes quantum and biological computing, is still under development. The ITR solicitation has also been posted, with some major changes from previous years.

Donald Knuth's tribute to Robert Floyd highlights the December SIGACT News. Also reviews of a bunch of crypto books, a column on sublinear-time algorithms and the complexity theory column on "Post's Lattice with Applications to Complexity Theory."

In my mailbox yesterday were not one but five copies of SIGACT News shrink-wrapped together. Once I unwrapped them and looked at the labels, only the outer one belonged to me. There were two for other professors in my department, one for our library and one for the library of Loyola University Chicago, which is on the other side of the city. I'm not sure if it was a mistake or some attempt by the ACM to reduce mailing costs, but I hope this is a one-time occurrence.

Monday, December 01, 2003

Show Us Your Papers

I hope you all got your Complexity submissions in on time last week. Now let the rest of the world see your work. Make them into technical reports, put them on your home page, submit them to ECCC and/or CoRR (neither of which conflicts with submitting to Complexity). People won't judge you, okay maybe they will, but no one gets famous by keeping their work a secret.

And since I know you'll ask: I didn't have any Complexity submissions this year. It happens.

Wednesday, November 26, 2003

Critical Mass

In the left column of this weblog's homepage, I've added a new "weblogs" section to list the academic weblogs that I have been following. Not very long right now, so start up a relevant weblog and I'll add you to the list.

What is Critical Mass? This is a weblog written by U. Penn English Professor Erin O'Connor. Not much computer science but she covers a number of issues about academic freedom, an important issue to every academic. Take a look.

Don't forget that the 2004 Complexity Conference submissions are due today. Have a great Thanksgiving!

Monday, November 24, 2003

Theory and XML

XML (eXtensible Markup Language) has become quite a popular data format in recent years. XML roughly corresponds to a tree. For example,

<person><name>Harry</name><age>29</age></person>
<person><name>Jane</name><major>Computer Science</major></person>

represents a tree: the root has two children, each labeled "person". The first of these children has two children labeled "name" and "age", and the "name" node has a leaf child labeled with the phrase "Harry". For a larger example, see the RSS feed for my weblog.

XML was designed as a flexible way to present documents for later displaying. Since the XML format can be easily produced and parsed, XML also serves as a standardized method for transferring data between databases, far better than the old CSV (Comma-Separated Values) format.

Recently there has been some work on directly manipulating and querying XML data. To a theorist, this seems like a bad idea, particularly for larger databases. While XML completely represents the underlying tree, it is not a good implementation of that tree. Basic tree operations like parent and sibling are very expensive working directly on the XML. About the only thing one can do quickly with XML is depth-first search. Far better to "shred" the data into a better tree implementation like a DOM (Document Object Model) or a full-fledged database and do the work there, writing out new XML if needed.
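As a tiny illustration of shredding (my own sketch using Python's standard ElementTree; I've wrapped the two person records from the earlier example in a single root element, which XML requires):

```python
import xml.etree.ElementTree as ET

doc = """<people>
  <person><name>Harry</name><age>29</age></person>
  <person><name>Jane</name><major>Computer Science</major></person>
</people>"""

# Parse ("shred") the text once into an in-memory tree.
root = ET.fromstring(doc)

# Tree operations are now cheap pointer-chasing instead of text scans.
names = [person.findtext("name") for person in root.findall("person")]
print(names)  # ['Harry', 'Jane']

first_person = root[0]
print([child.tag for child in first_person])  # ['name', 'age']
```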

One issue though is when the XML file is on the order of 5-10 GB, a bit larger than what can be stored in the memory of a typical desktop machine. One can stream through the file rather quickly but cannot recreate the tree. This opens up some interesting theoretical questions:

1. Given a stream of data in XML format, how can one do efficient analysis and manipulations of the underlying tree? I suspect one would want to sometimes shred subtrees, but you cannot determine the size of a subtree until after it's been streamed. Perhaps some randomness or streaming the file multiple times might be helpful.
2. XML might not be the right model of a tree for this purpose. What is the best way to stream a tree or other data structure to allow an efficient implementation of the basic operations of the data structure? Perhaps some redundancy might be useful.

Friday, November 21, 2003

Rational Functions and Decision-Tree Complexity

I thought I should mention some of my favorite and most frustrating open questions over the years. Here's one of them:

Let f:{0,1}^n→{0,1}. Let h and g be n-variable degree-d polynomials over the reals. Suppose for all x in {0,1}^n, g(x)≠0 and f(x)=h(x)/g(x). Is there a constant k such that the decision-tree complexity of f is bounded by d^k?

The decision-tree (or query) complexity is the number of bits of x that need to be viewed to determine f(x). The queries to the bits of x can be adaptive. I'm particularly interested in the case where d is poly-logarithmic in n.

Nisan and Szegedy answer the question in the affirmative if g(x)=1. Their result holds even if f(x) is only approximated by h(x). However if we allow arbitrary g(x), h(x)/g(x) can closely approximate the OR function which requires looking at all of the bits. The case where we require exact equality of f(x) and h(x)/g(x) is the open question at hand.
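To make the OR claim concrete, here is the simplest such approximation I know (my own sketch, not necessarily the construction meant above): with s the number of ones in x, the degree-1 rational function s/(s+ε) is within ε of OR on every input, even though computing OR exactly requires looking at all n bits.

```python
from itertools import product

def or_approx(x, eps=1e-6):
    """Degree-1 rational approximation of OR: h(x)=s, g(x)=s+eps,
    where s is the number of ones in x."""
    s = sum(x)
    return s / (s + eps)

n = 10
worst_error = max(abs(or_approx(x) - (1 if any(x) else 0))
                  for x in product([0, 1], repeat=n))
print(worst_error <= 1e-6)  # True: every input is within eps of OR
```

On the all-zero input the approximation is exactly 0, and on any input with s ones the error is eps/(s+eps) <= eps.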

Wednesday, November 19, 2003

Midwest Theory Day

Over the years a number of localized theory workshops have developed which bring together researchers from around the area to mingle and learn about some of their research. For example BATS (Bay Area Theory Seminar-San Francisco area), CATS (Capital Area Theory Seminar-Washington DC), DIMACS Mixers (New Jersey), Netherlands Theoriedag and others including many I've never heard of.

In the middle of America we have the Midwest Theory Day. Held generally since 1980, it is a day of talks hosted in the fall in the Chicago area and in the spring somewhere else in the Midwest, such as the University of Wisconsin, Indiana, Illinois, Purdue or Notre Dame. The next Midwest Theory Day will be held December 13 at DePaul University, and the spring edition on April 24 at the University of Iowa.

So come to Chicago, or attend (or organize) your own local area theory seminar and meet your theory neighbors.

Monday, November 17, 2003

Editors Needed

Back in my science-fiction reading days, I particularly remember one editorial written in one of those anthology magazines about 1980: In the near future, you will be able to access, via your personal computer, any science fiction story right after it has been written. If you like a certain author, you can read other stories from that author, even if we didn't decide to put it in this magazine. In this future world, will you still need me, the editor? The answer is yes, for there will be way too much dreck out there for you to find the good stories within, and you will need people like me to point them out to you.

The future is now and though I haven't kept up with science fiction, the same issue applies to academic publications. Recent posts by Michael Nielsen and on Slashdot have asked: With nearly all new papers in physics and computer science easily accessible on the web, how do you find the ones worth reading?

Conferences have traditionally played this role in computer science. But, by definition, paper choices are decisions by committee and with the massive growth in the field, many good papers do not appear in the major conferences.

What we need are "editors"! You can help. Write a survey paper, or spend a page in your research papers discussing the important earlier results in a field. Maintain a web page pointing to papers you find interesting. Start a weblog saying what you find interesting--you don't have to post long or often, just to say, hey, this paper is worth looking at. This way people with similar interests to you can find out what at least you think is important. Only by many of us working together can we make the interesting papers stand out.

Friday, November 14, 2003

A Regular Problem

Here's a problem from Janos Simon. For a set A, let
L(A)={x | for some m≥0, x^m is in A}
Simon asked, as a homework problem, to show that if A is regular then L(A) is also regular. The standard solutions require an exponential increase in the number of states. Simon asks whether this is needed in general. If A is accepted by an n-state DFA, can one find a poly(n)-state DFA for L(A) or is an exponential state DFA for L(A) required?
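The standard construction behind the exponential upper bound tracks the transition function that x induces on A's states, giving up to n^n states. Here is a membership-test sketch of that idea (the example DFA, accepting strings that end in "01", is my own hypothetical choice):

```python
def in_root(x, delta, start, accepting, states):
    """Is x in L(A) = {x : x^m in A for some m >= 0}?  Build the
    transition function f that x induces on A's states; the orbit of
    the start state under f repeats within |states| steps, so it
    suffices to check m = 0, 1, ..., |states|."""
    if start in accepting:
        return True  # m = 0: the empty string is in A
    f = {q: q for q in states}
    for c in x:
        f = {q: delta[(f[q], c)] for q in states}
    q = start
    for _ in range(len(states)):
        q = f[q]  # state reached after reading one more copy of x
        if q in accepting:
            return True
    return False

# Example A: strings over {0,1} ending in "01" (a three-state DFA).
states = {"empty", "seen0", "seen01"}
delta = {("empty", "0"): "seen0",  ("empty", "1"): "empty",
         ("seen0", "0"): "seen0",  ("seen0", "1"): "seen01",
         ("seen01", "0"): "seen0", ("seen01", "1"): "empty"}
print(in_root("01", delta, "empty", {"seen01"}, states))  # True
print(in_root("10", delta, "empty", {"seen01"}, states))  # False
```

The dictionary f here plays the role of one state of the big DFA for L(A); the question is whether exponentially many such functions are really needed.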

As an aside the submission deadline for the Complexity Conference is right around the corner. Get your papers ready.

Wednesday, November 12, 2003

Opportunities at Chicago (Blatant Plug)

[I will do this just once. Feel free to use the comments link to mention your own institution.]

We are seeking theory graduate students and faculty (all ranks) in the Computer Science Department of the University of Chicago. The Toyota Technological Institute, located on the University of Chicago campus, is looking for both faculty and postdoctoral candidates.

The Theory Group has expanded over the past few years and will continue to grow. We have a long tradition of outstanding doctoral students and notable research accomplishments.

Tuesday, November 11, 2003

25 Years of Science Times

It happened right after I started high school in suburban New Jersey, the start of the Science Times section in Tuesday's New York Times. The Science Times not only helped get me excited about science but made me feel others could get excited over science as well. I've kept reading it off and on during these past 25 years. The Science Times has reported on a fair amount of research in complexity and theoretical computer science; for a time some joked that a result was not important until it appeared in the New York Times.

Today the New York Times celebrates the 25th Anniversary Issue of the Science Times. It features 25 questions such as Does Science Matter? and What Is the Most Important Problem in Math Today? (Hint: It's not P versus NP).

I'll end this post with a quote from the essay of Alan Lightman:
All of the scientists I've known have at least one quality in common: they do what they do because they love it, and because they cannot imagine doing anything else. In a sense, this is the real reason a scientist does science. Because the scientist must. Such a compulsion is both blessing and burden. A blessing because the creative life, in any endeavor, is a gift filled with beauty and not given to everyone, a burden because the call is unrelenting and can drown out the rest of life.

Sunday, November 09, 2003

CISE Reorganization

The Computer and Information Science and Engineering Directorate of the NSF has completed its reorganization. The CISE web site details the new structure.

CISE now has four divisions. Instead of each division having a large number of specific programs, each division contains a smaller number of clusters covering broader research areas. I'm happy to see "Computational Complexity" specifically mentioned in the Formal and Mathematical Foundations Cluster in the Division of Computing & Communication Foundations. However it shares that cluster with such diverse topics as "computational algorithms for high-end scientific and engineering applications" and "analysis of images, video, and multimedia information." Hopefully funding panels will meet in the more specific areas to avoid trying to compare proposals from vastly different areas of computer science.

Quantum and Biological Computing sit in a different CCF cluster, Emerging Models and Technologies for Computation. This shows NSF's hopes for these new technologies but may also give them a way to phase out these areas if the technologies don't show promise.

Program announcements for the CCF clusters are still under development. The ITR solicitation is still not expected until Thanksgiving. So if you plan a grant proposal this year, you'll still need to wait.

Friday, November 07, 2003

The Web and Research

I received the following email from David Molnar, a Berkeley graduate student. He asks some interesting questions about the use of the web in research. Instead of just answering them immediately, I thought I would share his email (with his permission) and let you think about it. Feel free to comment. I'll also address some of his points in future posts.

Molnar's Letter:

I wanted to follow up on your comment about theoretical computer science and e-mail/IM. Also, use of weblogs.

1. There's a piece of folklore about how IP=PSPACE was proved pursuant to some race between many researchers carried out over e-mail. I've once or twice found myself in e-mail races with other people on mailing lists when we're responding to something technical. Do you think the day will come when some major theorem is proved in an IM chat room? :)

2. Weblogs as place for ideas. A friend of mine asked me if I thought researchers in theory would be interested in putting up a web page (actually, we were discussing a wiki) about open problems. My response was pretty much like what you see in Oded Goldreich's FAQ -- "well that's nice, but good research problems are hard to find, and people seem to reserve them for themselves or their close colleagues."

The point: I find that I'm starting to use my weblog as a place to deposit ideas for problems that I just don't have time to pursue, or aren't in what I think is my main line of research. In the old days they would have gone into a notebook somewhere and no one would have seen them. Now maybe someone will pick them up -- and I think that's great (especially since I usually don't know how to progress). At the same time, I wonder how peoples' concerns about ownership of ideas will affect the way they use weblogs. What do you think?

Wednesday, November 05, 2003

The PCP Theorem

This week I started teaching the PCP theorem: Every language in NP has a proof that can be probabilistically checked with only O(log n) random coins and a constant number of queries. One graduate student asked me which came first, the PCP theorem or the connection to hardness of approximation. That seemed like a strange question; after all, why would so much effort be put into reducing parameters like the number of random coins or queries unless there was some application behind it? But then again, the literature now puts so much emphasis on these parameters that he probably thought they were always what people cared about most.

To put the record straight: Feige, Goldwasser, Lovász, Safra and Szegedy made the connection between the interactive proof literature and approximation. Arora and Safra showed the importance of reducing random coins and queries and proved a weaker version of the PCP theorem. The theorem in the above form was proved by Arora, Lund, Motwani, Sudan and Szegedy.

I have been teaching off the notes of Madhu Sudan from a short course he taught on the PCP theorem. The notes don't go through every detail, but they give one of the better expositions of what is still quite a complex proof.

Monday, November 03, 2003

What happened at NEC?

The NEC Research Institute (NECI) died just over a year ago. I didn't feel comfortable talking about it then so let me say a few words now.

I joined NECI in 1999, just after its tenth anniversary. When I joined, its mission and focus were basic research in computer science and physics. NECI gave me considerable time and resources to study fundamental questions in computational complexity. It was an exciting place to be.

Soon thereafter changes began to occur. NEC modified the mission of NECI to focus on producing technologies, with basic research secondary. Some researchers (though not us theorists) were encouraged to join "technology groups" to find practical applications of their research. But during this time the administrators always supported basic research and I never felt uncomfortable doing theory.

But then on November 1, 2002, NECI merged with NEC CCR&L, a more applied research lab in the same building to form NEC Laboratories America. The new mission makes no mention of basic research. The scientists in charge were replaced by engineering/management types. Many of the research scientists, particularly physicists, were let go.

My job was never in immediate danger but NEC was no longer the place for me and so I went on the job market; no one was surprised when I decided to leave.

A corporation like NEC needs to make decisions for the health of the company. I do not fault NEC for the decisions that it made, and they gave me a few great years. Still I mourn the NEC Research Institute, quite a special place during its thirteen-year run.

Friday, October 31, 2003

Lessons from the IM Experiment

Last week I started an experiment using instant messaging. I thank the many of you who sent me IMs, a great way for me to meet you, the readers of this weblog. I plan to keep trying IM for a while but I have learned a few lessons which seem obvious in retrospect.

Instant messaging can be a time sink. I love communicating with people, which is the main reason I keep this weblog going. However, like most academics, I have much going on and can't afford many lengthy discussions. I've also learned there is no clean way to end an IM conversation. So please feel free to IM me, but don't take it personally if I rudely keep the conversation short.

Just because the nice icon on the home page says I'm online doesn't mean I am at my computer and available to chat at the moment. Often I am and will; if not, I will eventually see your message and respond. If there really is something important you want to discuss with me via IM, we can set up a scheduled time via email. I often do this with phone calls, so why not IM too?

I've also discovered IM conversations can be recorded, posted on the web and could be used in a court of law. I need to be careful about what I say.

I have already had some interesting research conversations and ideas for weblog posts via IM. The last post came in part because of some IM questions about the Feigenbaum-Fortnow paper. Email became a powerful research tool when email use hit a critical mass among computer scientists sometime in the mid-late 80's. I believe IM will also follow that curve and I hope to keep ahead of it and perhaps nudge it a little bit.

Wednesday, October 29, 2003

Changing Strings but not Distributions

Let f be a function that maps Σ^n to Σ^n. Let U represent the uniform distribution on Σ^n and D be the distribution that one gets by applying f to a string drawn from U.

We wish to find an f that changes x but keeps the underlying distribution close to the same; in particular we want the following properties:
(1) Prob(f(x)≠x)≥2/3 when x is drawn from U.
(2) U and D should be statistically close, informally no algorithm making a polynomial number of samples will be able to distinguish, with high confidence, whether those samples came from D or U.

Achieving such an f is easy: consider the f that just flips the first bit of x. (1) holds all the time and U=D.
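This easy case can be checked by brute force; a minimal sketch (the toy length n=6 is an arbitrary choice):

```python
from itertools import product

n = 6  # arbitrary toy length

def flip_first(x):
    # f flips the first bit, so f(x) != x for every x
    return ('1' if x[0] == '0' else '0') + x[1:]

strings = [''.join(b) for b in product('01', repeat=n)]

# property (1) holds with probability 1, not just 2/3
assert all(flip_first(x) != x for x in strings)

# flipping a fixed bit permutes {0,1}^n, so D is exactly U
assert sorted(flip_first(x) for x in strings) == sorted(strings)
print("f changes every string and D = U exactly")
```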

Suppose we add a restriction to f:
(3) In the bits where x and f(x) differ, those bits are 1 in x and 0 in f(x). For example, f(011)=010 is allowed, but f(011)=111 is not.

An f fulfilling (1), (2) and (3) is impossible. (1) and (3) mean that f will reduce the number of ones in most of the strings, and taking, say, n^3 samples we will notice a statistical difference in the number of 1 bits depending on whether the samples were drawn from U or D.

Suppose we replaced (3) with a weaker restriction:
(3') In the first bit where x and f(x) differ, that bit is 1 in x and 0 in f(x). So f(110)=011 is allowed but f(001)=010 is not allowed.

Can an f fulfilling (1), (2) and (3') exist? Not so clear, but Peter Shor found a simple example: f(0^n)=0^n, and for every other x, f(x)=x-1 where x is viewed as a nonnegative integer written in binary. D is indistinguishable from U yet f changes nearly every string.
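Shor's example can be verified by brute force for small n (a sketch; n=8 is an arbitrary toy choice):

```python
from collections import Counter

n = 8
N = 2 ** n

def f(v):
    # Shor's example: fix 0^n, decrement every other string
    return v if v == 0 else v - 1

def bits(v):
    return format(v, '0{}b'.format(n))

# (1): f changes every string except 0^n
changed = sum(1 for v in range(N) if f(v) != v)
assert changed == N - 1

# (3'): the first (leftmost) bit where x and f(x) differ is 1 in x, 0 in f(x)
for v in range(1, N):
    x, y = bits(v), bits(f(v))
    i = next(i for i in range(n) if x[i] != y[i])
    assert x[i] == '1' and y[i] == '0'

# (2): the statistical distance between U and D = f(U) is only 2^-n
D = Counter(f(v) for v in range(N))
sd = sum(abs(D[v] - 1) for v in range(N)) / (2 * N)
assert sd == 1 / N
print("f changes", changed, "of", N, "strings; statistical distance =", sd)
```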

These questions are related to an old paper I had with Joan Feigenbaum which has gotten some recent attention because of a nice new FOCS paper by Bogdanov and Trevisan that builds on our paper. The proofs in these papers work partly because (1), (2) and (3) cannot all happen even for arbitrary distributions U. Both papers give a negative result for a nonadaptive case; the adaptive case corresponds to (1), (2) and (3'), and Shor's example shows that the proof techniques will not directly lead to a solution in the adaptive case, which remains open.

Monday, October 27, 2003

Quantum Leaping to Conclusions

A quantum computing graduate student sent me email over the weekend. He thought he had proven some surprising results about the class PP and was wondering if he was making some mistake. After some discussion, here was his reply:

Ok I get it. Somehow I jumped to the conclusion that P^PP was PP.
There is one more for your blog: A⊆B implies B⊆A^B but not A^B⊆B (duh!)

He goes on to say he made his quantum leap to conclusions because the quantum class BQP satisfies P^BQP=BQP, so he thought the same property must hold for all classes.

I present this because he suggested it for my weblog and as a public service for those who might make a similar mistake. Yes, in case you were wondering, for reasonable classes A (like A=P), B⊆A^B without needing to assume A⊆B.

Friday, October 24, 2003

When Good Theorems Have Bad Proofs

Here is one of my favorite examples of a bad proof for what turns out to be a correct theorem.

Theorem: If NP is in BPP then the whole polynomial-time hierarchy is in BPP.

Let's focus on simply showing Σ₂ is in BPP if NP is in BPP. The rest is a straightforward induction. Here is our first proof:

Σ₂ = NP^NP ⊆ NP^BPP ⊆ BPP^BPP = BPP.
Do you see the problem with this proof?

To get a correct proof (first due to Zachos) we need to use Arthur-Merlin games. Consider a Σ₂ language L as an ∃∀ expression. Since NP is in BPP, we can replace the ∀ with a probabilistic test. This gives us what is known as MA, or a Merlin-Arthur game, where the powerful Merlin sends a message that Arthur can probabilistically verify. A now-classic result shows that MA is contained in AM, where Arthur provides a random string to Merlin who must then provide a proof based on that string. Once again we apply the NP in BPP assumption to allow Arthur to simulate Merlin probabilistically, and now we have a BPP algorithm for L.

Wednesday, October 22, 2003

Talk to Me

How has the internet most affected the study of science? In one word, communication: the ability for scientists to discuss and share their research with each other quickly and cheaply. So I strive to find new ways to use the internet to improve communication. Starting this weblog is one such example. I thought I would try another: Instant Messaging.

Now many of you are thinking I am crazy, but for different reasons. Some of you out there have been using instant messaging for years and wonder how I could consider it a "new" technology. But many of you out there have barely figured out how to read your email attachments and have hardly even heard of IM.

On a trial basis, for my weblog readers, I will welcome your instant messages. Talk to me about this weblog, about complexity and computer science in general or about whatever you want. Maybe I'll start a trend and all computer scientists will IM each other. Maybe not but it's worth trying out.

I'm using Yahoo Instant Messaging; my Yahoo id is the imaginative "fortnow" (note: I do not read email sent to fortnow@yahoo.com). I put a button on the left column of the weblog home page that tells you when I am online and you can click to connect. I look forward to hearing from you.

Monday, October 20, 2003

What's happening at the NSF?

There is a big reorganization going on in the CISE directorate of the NSF. To understand what's happening, let's review the previous structure.

The National Science Foundation, like most government bureaucracies, has a tree-like structure. At the top is the office of the director (Rita Colwell). Below that are several directorates, including the Directorate for Computer and Information Science and Engineering (CISE) headed by Peter Freeman. By law, every organization within NSF must be not just "science" but "science and engineering," except for the Foundation itself.

Below CISE were several divisions, including Computer-Communications Research (C-CR) headed by Kamal Abdali. C-CR has several programs including the Theory program headed by Ding-Zhu Du.

Peter Freeman, who recently became head of CISE, has decided to reorganize the whole directorate. Exactly what it will become should be announced next week but there are some hints in this presentation. Change is always scary but I'm hopeful theory will survive. I'll give more details when I know them.

To overcome the tree structure of NSF, there are a number of cross-disciplinary programs. One such program, Information Technology Research (ITR), has produced several large, medium and small grants for a variety of projects, including many applications of theory. This is the last year of ITR solicitations and the calls have been well behind schedule, probably not unrelated to the CISE reorganization. This year's topic will be "ITR for National Priorities" with more details promised by Thanksgiving. Unconfirmed rumors have it that the program will be more focused, making only medium-sized grants.

Friday, October 17, 2003

TTI

There are two computer science departments on the University of Chicago campus. The one I belong to is a department in the physical sciences division of the University; the other is the Toyota Technological Institute at Chicago (TTI-C). What is TTI-C?

The Toyota Technological Institute, a university covering various engineering disciplines located in Nagoya City, Japan, was founded in 1981 with funds from the Toyota Motor Corporation as directed by the Toyoda family. They decided to start a computer science department and locate it in the States to have broader access to computer science faculty and students. For various reasons they settled on Chicago and set up an agreement with the University of Chicago, using space in the University of Chicago Press building. TTI-C has just officially started up and has already signed up a few strong faculty members including theorist Adam Kalai and Fields medalist Stephen Smale. TTI-C plans to increase its faculty size and start up a graduate program in the near future.

Although there will be some sharing of courses, and a few of our faculty (including myself) sit on a Local Academic Advisory Council for TTI-C, TTI-C will formally maintain itself as a separate institution from the University. Nevertheless, close collaboration between our department and TTI-C has already established an exciting research environment for our combined faculty and students.

Thursday, October 16, 2003

The Agony of Defeat

This is for my friends in Boston who suggested I do a sports post.

One of the great parts of my job is working with people from around the world. I was working with a graduate student, Luis Antunes, from Portugal when we found out that Portugal would play the US in the 2002 World Cup. We had various rounds of taunting back and forth with me fully knowing the US didn't stand a chance in that match. When the US did win, Luis tells me the whole country went into a deep depression. By contrast, for the most part people in the US didn't care.

I can now understand Portugal's pain as the city of Chicago has gone into a similar kind of quiet depression over the Cubs' failure to advance to the World Series. It's impressive what sports can do to the psyche of a city or a country.

Memo to my friends in Boston: Hope things go well for the Sox so your city doesn't end up feeling tomorrow like Chicago does today.

Wednesday, October 15, 2003

Tutorials on Machine Learning and Mixing via Markov Chains

Just prior to the FOCS conference, Avrim Blum gave a tutorial on machine learning and Dana Randall gave a tutorial on mixing. As the talks were computerized, even if you, like me, missed the conference, you can view Blum's slides here and Randall's slides here. Randall also has a companion paper that appeared in the proceedings.

Eli Upfal also gave a tutorial on Performance Analysis of Dynamic Network Processes which I haven't found online yet.

Monday, October 13, 2003

Columbus Day

Today is Columbus Day, a minor holiday in the states. There is a great saying about Columbus: He is not famous for being the first person to discover America, rather he is famous for being the last person to discover America.

The same principle holds for proving theorems. Think about it.

Friday, October 10, 2003

Advice for Students Going to a Conference

With FOCS starting this weekend, here is some advice for students attending a conference.

What is the purpose of a conference? Disseminating information is the obvious answer, but there are better ways these days to announce new results. No, the major reason for conferences is to bring a large segment of our community together in one place. Keeping that in mind,

• Meet People: Don't just hang out with students from your own department. Talk to other students, they will be your future colleagues. Don't be intimidated by senior people whose names you have seen on famous papers. Deep down they are just like you.
• Don't go to too many talks: You will burn out and not remember much of anything. And too much time in talks detracts from the main purpose of the conference. Pick and choose your talks based on your research interests, the abstracts in the proceedings and those given by someone with a reputation for giving good talks.
• Go to the business meeting: It will give you a flavor of the inner workings of the theory community. With luck there will be a lively discussion of some arcane issue. And don't forget the free beer.
Remember, networking is as important in academics as it is in business. But perhaps the most important piece of advice I can give: Have fun.

Wednesday, October 08, 2003

FOCS

Next week is the 44th FOCS Conference in Boston. FOCS, the Symposium on Foundations of Computer Science, is, with STOC, one of the two major American general theory conferences. Okay, usually American. STOC was in Crete in 2001 and FOCS will be in Rome next year, but the other 77 FOCS and STOC conferences were held in North America.

I won't be attending the FOCS conference this year, but will definitely be at the next STOC being held in Chicago. I'll likely ask a grad student to pick me up a FOCS proceedings. I'm not sure why; most of the papers are available on-line, the proceedings take up valuable bookshelf space and I've barely opened the proceedings of the last few conferences. But I have a complete set of STOC, FOCS and Complexity proceedings going back to 1986 and I'm a sucker for tradition.

Sunday, October 05, 2003

A New Genius

While on the topic of awards, we have a new "genius" in the theoretical computer science community. Computational geometer Erik Demaine just received a MacArthur Fellowship. Congrats!

Friday, October 03, 2003

Waiting for Nobel

The Nobel science prizes will be awarded next week: Medicine on Monday, Physics on Tuesday and Chemistry and Economics on Wednesday. Nothing against the Turing Award and Fields Medal but it is the Nobel that is a household name. Given the increasing connections between computer science and the other sciences, you never know who you might know who might win.

On the Nobel home page one can find some interesting articles on life as a scientist by some Nobel Laureates. Paul Samuelson ends his story with "Always I have been overpaid to do what has been pure fun," which nails the point that the most important requirement to being a successful researcher is to have fun doing it.

Wednesday, October 01, 2003

Happy Fiscal New Year

I have tried to keep politics out of this weblog with the exception of issues related to science, in particular science funding and immigration. To celebrate America's fiscal new year, let's talk about immigration.

Congress has declined to renew the higher annual cap on H1-B visas, rolling them back to 65,000 for the fiscal year starting today from 195,000 in 2000. H1-B's allow "employers to hire foreign workers with special skills they can't find among American job applicants," typically for high-tech jobs. But H1-B's are also used for visiting researchers at industrial research labs and some university positions. When the limit is reached, the government will no longer issue more visas until the start of the next fiscal year.

At NEC, we had postdocs who had to delay their start date until October for this reason, including in some cases those who wanted to start at the beginning of summer. With the limit dramatically decreased, if the job market starts perking up, we could hit the limit much earlier. This could make a real dent in international cooperation in science.

Monday, September 29, 2003

One-Way Functions

What is a one-way function? Intuitively, a function that is easy to compute but hard to invert. Taking this intuitive idea to a formal definition has yielded two quite different meanings, sometimes causing confusion.

The first directly tries to translate the intuition. A function f is one-way if

1. f is 1-1 (so an inverse is unique),
2. f is length increasing (so the output of the inverse function is not too large),
3. f is computable in polynomial time, and
4. there is no polynomial-time computable g such that for all x, g(f(x))=x.
This is a nice clean definition that fulfills the intuition but is not that useful for cryptography, since f could be easily invertible on all but a small number of inputs, or invertible by stronger adversaries. To handle these issues we have a different-looking definition.

A function f is r(n)-secure one-way if

1. There is a function l(n)≥n such that f maps strings of length n to strings of length l(n),
2. f is computable in polynomial time, and
3. for all probabilistic polynomial-time algorithms A, the probability that f(A(f(x)))=f(x) is at most r(n) where the probability is taken over x chosen uniformly from the strings of length n and the random coins used by A.
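The security experiment in condition 3 can be simulated for a toy f. Here is a sketch using truncated SHA-256 as an illustrative (entirely unproven) candidate, against the weakest possible adversary, a uniform guesser; the real definition quantifies over all probabilistic polynomial-time A:

```python
import hashlib
import random

n = 16  # toy input length in bits (real definitions take n to infinity)

def f(x: int) -> int:
    # illustrative candidate only: SHA-256 of the n-bit input,
    # truncated to n bits; nothing here is proven one-way
    h = hashlib.sha256(x.to_bytes(2, 'big')).digest()
    return int.from_bytes(h[:2], 'big')

def adversary(y: int) -> int:
    # the weakest possible A: ignore y and guess uniformly at random
    return random.randrange(2 ** n)

# estimate Pr[f(A(f(x))) = f(x)] over random x and A's coins
random.seed(0)
trials = 2000
wins = 0
for _ in range(trials):
    x = random.randrange(2 ** n)
    y = f(x)
    if f(adversary(y)) == y:
        wins += 1
rate = wins / trials
print("empirical inversion success rate:", rate)
```

For this guessing adversary the success probability per trial is roughly 2^-16, so the empirical rate is essentially zero; f would be r(n)-secure against this particular A for tiny r(n), but that says nothing about cleverer adversaries.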
There are many variations on both definitions and a considerable amount of theory devoted to each. Grollmann and Selman show that one-way functions of the first kind exist if and only if P ≠ UP. On the other hand Håstad, Impagliazzo, Levin and Luby show that from any one-way function of the second kind, one can create a pseudorandom generator.

At one point I tried using the terms "complexity-theoretic one-way functions" and "cryptographic one-way functions" to distinguish the two, but this only caused confusion. So we have to live with the fact that we have these two definitions with the same name and we'll have to just use context to figure out which definition is appropriate. If you give a talk or write a paper about one-way functions, it never hurts to say which version you are talking about.

Friday, September 26, 2003

A Speech and A Weblog

UCLA Professor Andrew Kahng gave this wonderful speech to the incoming computer science graduate students. While designed for UCLA, with minor changes it applies to every research institution. Every graduate student or anyone interested in graduate school should read it.

Michael Nielsen, coauthor of my favorite book on quantum computing, has his own weblog. Worth checking out.

Thursday, September 25, 2003

Proof, The Movie

We just got email that Proof, a movie adapted from the play, will be filmed in and around the University of Chicago campus over the next few weeks. I saw the play a couple of years ago in New York, a powerful story about an aging mathematician and his daughter. Given the cast, director and source material I would expect great things from the film version as well, and it certainly should best out the last movie I remember them filming here.

Wednesday, September 24, 2003

Turing, The Novel

Christos Papadimitriou has been exercising his creative talents. He has a new book Turing (A Novel about Computation) building a love story around some of Turing's lectures. He is now working on Logicomix, computer science in comic book form. I can imagine it now, our hero NP-Man and his trusty sidekick P-Girl battle the evil Execution Time.

Monday, September 22, 2003

The Buzz

There has been some buzz about the paper Resource Bounded Unprovability of Computational Lower Bounds by Tatsuaki Okamoto and Ryo Kashima. They claim to show that no polynomial-time machine can produce a proof of a super-polynomial-time lower bound for NP and as a consequence there is no proof that P≠NP in any reasonable proof system such as Peano Arithmetic. The first part I don't really understand but the latter would be a major breakthrough.

But don't get too excited. I have heard from multiple sources that Razborov has found serious problems in their paper, in particular that Theorem 22 is just false. So as one complexity theorist put it, "This is probably not the most important paper on your stack."

Friday, September 12, 2003

Creative Commons

In a conversation I had last week, a professor planned to use slides from lecture notes he found on the web in his own slides. He planned to give attribution to the original producers of the slides, but did he need to contact them beforehand for permission? Technically yes: material put on the web is copyrighted by default. But if he sent them email, most responses would be of the form, "Please don't waste my time!"

To deal with this issue, Creative Commons has a nice service giving an easy way to release some rights for online material. So I've added a link on the left column of the web log that basically allows you to modify and distribute the material in this weblog for noncommercial use with attribution. No need to ask me, not that anyone has.

On another note, I'm on vacation next week and off the internet. Back after that.

Thursday, September 11, 2003

Balanced NP Sets

Last week I posed the following question:

(1) Exhibit an NP-complete language L, such that for all lengths n≥1, L contains exactly half (2^(n-1)) of the strings of length n.

This question was posed by Ryan O'Donnell and solved by Boaz Barak. Here is a proof sketch.

By using standard padding and encoding tools, (1) is equivalent to

(2) There is an NP-complete language L and a polynomial-time computable function f such that for every n, there are exactly f(n) strings in L of length n.

First we show how to achieve (2) if we replace "strings" with "total witnesses." Consider a pair of formulas φ and ¬φ. If they have n variables, the number of satisfying assignments between the two totals exactly 2^n. We just create an encoding that puts φ and ¬φ at the same length. The total number of witnesses at that length is then 2^n times the number of formula pairs encoded at that length.

We now prove (2) by creating a language L that encodes the following at the same length for each φ

1. φ, where φ is satisfiable.
2. (φ,w) where w is a satisfying assignment for φ and there is another satisfying assignment u<w for φ.
You can check that the language is NP-complete and the total number of strings in L for each φ is just the number of satisfying assignments of φ.
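That count can be checked by brute force on a toy formula; a sketch (the actual encoding into strings of one length is abstracted away, and the formula x1∨x2 is an arbitrary example):

```python
from itertools import product

def strings_in_L_for(phi, n):
    # phi: a predicate on assignment tuples; brute-force all 2^n assignments
    sats = [w for w in product((0, 1), repeat=n) if phi(w)]
    items = []
    if sats:
        items.append('phi')                  # case 1: phi is satisfiable
    for w in sats:
        if any(u < w for u in sats):         # case 2: some assignment u < w
            items.append(('phi', w))
    return len(items), len(sats)

# toy formula x1 OR x2 over three variables: 6 satisfying assignments
phi = lambda w: w[0] == 1 or w[1] == 1
c, s = strings_in_L_for(phi, 3)
assert c == s == 6

# an unsatisfiable formula contributes nothing
assert strings_in_L_for(lambda w: False, 3) == (0, 0)
print("strings contributed:", c, "= #sat assignments:", s)
```

The lexicographically smallest satisfying assignment is counted once by case 1 and the remaining k-1 by case 2, giving exactly k strings for a formula with k satisfying assignments.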

Tuesday, September 09, 2003

Is P versus NP formally independent?

As promised back in March, the October 2003 BEATCS Complexity Column is on whether we can truly settle the P versus NP question. Scott Aaronson gives quite an interesting survey on this topic.

Monday, September 08, 2003

STOC 2004

I have received some requests for the call for papers for STOC 2004, which will be held in Chicago. Maybe it is because I gave the presentation for STOC 2004 at the 2003 business meeting, or because this weblog comes up when you google "STOC 2004", or just because I'm in Chicago now; in any case, I am not officially connected to STOC 2004.

Having said that, here is the tentative call for papers.

Thursday, September 04, 2003

Institute for Advanced Study

The Institute will be quite a complexity powerhouse this year. In addition to IAS fixtures Wigderson and Razborov, visiting are Russell Impagliazzo, Manindra Agrawal (with his students Kayal and Saxena at Princeton) and postdocs Boaz Barak, Subhash Khot, Ryan O'Donnell and Nate Segerlind. These are just the ones I met yesterday during a short visit, several weeks before their semester officially begins. We'll expect great things from them.

Here is a question we thought about yesterday, posed by Ryan O'Donnell and ultimately settled by Boaz Barak:

Exhibit an NP-complete language L, such that for all lengths n≥1, L contains exactly half (2^(n-1)) of the strings of length n.

Think about it. I'll post the proof next week.

Tuesday, September 02, 2003

Scheduling Research

A colleague of mine, who shall remain nameless, likes to schedule time for research: a set block of time during the day when he puts off all his todos and concentrates on science. Sounds good, but often his chair will stop by for some discussion or an impromptu meeting. The colleague will say, "Sorry, but I reserved this time for research," but that argument didn't fly; the chair said he could do research anytime. One day he said instead, "Sorry, I have a squash game" and the chair replied that they would talk at a future time. Welcome to the academic world, where research gets trumped by a meeting that itself can be trumped by a squash game.

Is scheduling time for research a good idea? It depends on your personality and your research style. If you find yourself with no time to think about an interesting problem because too much else is happening, then yes, it's best to schedule a few hours where you promise yourself you will do nothing but research. This means not only putting off class preparation but also ignoring your computer. Checking email and surfing the web are themselves great time sinks.

In my case, I find it difficult to just start thinking about research at a given time. So I use the rule that research trumps all and when inspiration hits me, or someone comes to my office with a research question, I drop everything I can to work on the problem. Okay, I can't skip a class for research but email, weblog posts, referee reports, etc., should never stand in the way of science.

Wednesday, August 27, 2003

Electronic Commerce

The call for papers for the 2004 ACM Conference on Electronic Commerce is now available. I'm posting this note as my duty as a program committee member to spread the word of the conference.

Why would an electronic commerce conference want me, a complexity theorist, as a PC member? Electronic commerce has many surprising connections to computational complexity. Consider complex auction situations where different buyers wish to purchase different items, with varying needs for combinations of these items. One needs to design such auctions so that the decisions made by the buyers, as well as determining the winner, are computationally efficient. This is in addition to the usual needs of auctions: generating revenue, being robust against players trying to cheat the system, and satisfying other notions of "fairness."

In a more philosophical sense, what is a large financial market but a massively parallel computation device that takes pieces of information and produces prices for securities? How can we model and analyze this process? Computational complexity should play a major role in understanding this model of computing and allow us to develop more efficient financial markets.

Monday, August 25, 2003

Hello Chicago

Here I am, my first day back on the campus of the University of Chicago. It's quiet here, Chicago is on the quarter system and classes don't start here until September 29.

I was on the road during the August 22 anniversary of my first post on this weblog. I hope you have enjoyed reading it this year as much as I have enjoyed writing it.

Wednesday, August 13, 2003

Goodbye New Jersey

My office is all packed up and ready to be shipped. On Friday we move out of our house. I've moved my web pages to Chicago. Burned my files to CD's. It is disconcerting to see all my life's work (papers, talks, programs, grant proposals, editorial and conference stuff, course material, etc.) fit so easily on one CD-ROM. Uncompressed.

Moving is always sad. I've made many good friends at NEC, in and out of theory. But with change comes the excitement of the new chapter of my life ahead of me.

With being on the road and settling in, posting on this site will be quite sketchy over the next several weeks. But don't worry, as one California gubernatorial candidate would put it, I'll be back.

Friday, August 08, 2003

A New-To-Me Pumping Lemma for Regular Languages

I have a gap in my knowledge of work in theory done between 1979 (the publication of Hopcroft and Ullman) and 1985 (when I started graduate school). So every now and then I see a new result from this time that I should have known years ago. Here is an example from the Winter 1982 SIGACT News, a variation of the regular language pumping lemma due to Donald Stanat and Stephen Weiss.

Theorem: If L is regular then there is a positive integer n such that for every string x of length at least n, there are strings u, v and w with v nonempty such that x=uvw and for all strings r and t and integers k≥0, rut is in L if and only if ruv^k t is in L.

What surprises me about this result is that w does not appear in the conclusion and that the initial r could put the finite automaton in any state before it gets to u. Here is a sketch of the proof.

Let s be the number of states of a finite automaton accepting L. Let y_i be the first i bits of x. For any initial state a, y_i maps it to some state b, so one can consider y_i as a function mapping states to states. There are at most s^s such functions, so if |x|≥s^s there are i and j, i<j, such that y_i and y_j represent the same function. We let u=x_1...x_i and v=x_{i+1}...x_j. The rest follows like the usual pumping lemma.
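The proof sketch is easy to carry out mechanically; a sketch on an assumed toy DFA (the even-number-of-1's automaton is an arbitrary example, and the construction works for any DFA):

```python
from itertools import product

# toy DFA over {0,1} accepting strings with an even number of 1's
states = (0, 1)
start, accepting = 0, {0}
delta = {(q, c): q ^ int(c) for q in states for c in '01'}

def run(q, y):
    # state reached from q after reading string y
    for c in y:
        q = delta[(q, c)]
    return q

def accepts(y):
    return run(start, y) in accepting

def prefix_function(y):
    # the state-to-state map induced by reading y
    return tuple(run(q, y) for q in states)

# find i < j whose prefixes induce the same function, as in the proof
x = '0110101'
seen = {}
for j in range(len(x) + 1):
    g = prefix_function(x[:j])
    if g in seen:
        i = seen[g]
        break
    seen[g] = j
u, v = x[:i], x[i:j]
assert v  # v is nonempty

# check the conclusion: rut is in L iff ruv^k t is in L
for r, t in product(['', '0', '1', '10'], repeat=2):
    base = accepts(r + u + t)
    for k in range(4):
        assert accepts(r + u + v * k + t) == base
print("can pump v =", repr(v), "out of x =", x)
```

Because u and uv induce the same state-to-state function, uv^k does too, which is exactly why the leading r can leave the automaton in any state without breaking the equivalence.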

Using a result of Jaffe, Stanat and Weiss show that this condition is not only necessary but also sufficient to characterize the regular languages.

Wednesday, August 06, 2003

Splitting Sets

Can every infinite set in P be partitioned into two infinite subsets, each also in P? Uwe Schöning answers this question in the affirmative in the Winter 1982 SIGACT News.

Let's generalize the question by saying that a set A splits a set B if both A∩B and B−A are infinite. Schöning really shows the more general result that any infinite recursive set can be split by a polynomial-time computable set.

Playing with this splitting notion yields lots of potential homework questions. Try these:

1. Show that every infinite regular language can be split by another regular language. Can every infinite context-free language be split by a regular language?
2. Show there is an infinite recursive set that cannot be split by any regular language.
3. Prove Schöning's result above: Every infinite recursive set can be split by a set in P.
4. For a real challenge, show that there is an infinite co-r.e. set that cannot be split by any r.e. set.
In recursion theory, sets that cannot be split by r.e. sets are called cohesive, and r.e. sets whose complements are cohesive are called maximal. For question 4 you are showing that maximal sets exist, a result first proven by Friedberg in 1958.
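To make the splitting definition concrete, here is a tiny sanity check in the spirit of question 1; it only illustrates what a split looks like on one assumed example (B the strings of 0's, A the even-length strings, both regular) and does not solve any of the exercises:

```python
# A splits B when both A∩B and (complement of A)∩B are infinite.
def in_B(w):
    # B = {0^i : i >= 0}, an infinite regular language
    return all(c == '0' for c in w)

def in_A(w):
    # A = strings of even length, also regular
    return len(w) % 2 == 0

# among the first 100 strings of B, each side of the split keeps growing
members = ['0' * i for i in range(100)]
assert all(in_B(w) for w in members)
in_both = sum(1 for w in members if in_A(w))
outside = sum(1 for w in members if not in_A(w))
print("A∩B:", in_both, " (complement of A)∩B:", outside)  # 50 and 50
```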

Monday, August 04, 2003

SIGACT News and The Cold War

Cleaning out my office I came across some old issues of SIGACT News that Bill Gear had given me when he cleaned out his office after his retirement. The Winter 1982 edition is quite interesting. I was a freshman in college that year, well before I was part of the theory community.

There are some interesting technical articles that I will get to in future posts. But the first two pages were letters to the editor that are chilling reminders of the cold war during that time.

On page two was the following short note from Witold Lipski, Jr. and Antoni Mazurkiewicz from the Polish Academy of Sciences.

We are very sorry to inform you that due to the situation in Poland we do not see any chance to organize our 1982 Conference on Mathematical Foundations of Computer Science.

MFCS started in 1972 as an annual conference rotating between Poland and Czechoslovakia, and now rotates between Poland, Slovakia and the Czech Republic. There were no conferences in 1982 or 1983 and the conference did not return to Poland until 1989.

Speaking of the Czechs, there was a much longer letter on page one from James Thatcher of IBM. Here are some excerpts.

On a recent trip to Europe, I visited Prague and had the pleasure of talking with Dr. Ivan M. Havel who is a friend and colleague of many years. Ivan Havel received his Ph.D. in CS from Berkeley in 1971. He joined the Institute for Applied Computer Technology in Prague in 1972 and then in 1974 became a member of the Czechoslovakian Academy of Sciences, in the Institute of Information Theory and Automation.

Ivan's brother, Vaclav Havel, an internationally known playwright, was imprisoned in 1979 for four and a half years for his activities in connection with the Charter 1977 movement.

In 1980, possibly related to his refusal to denounce his brother, Ivan Havel was removed from his position in the Academy of Sciences and was unemployed for several months. Last May, he and Vaclav's wife were arrested and charged with "subversion" for allegedly "collecting and distributing written material oriented against the socialist state and social establishment, with hostile intentions." After four days detention, they were released.

He is employed as a programmer-analyst by META, a home-worker program for the handicapped.

Ivan Havel remained a programmer until after the Velvet Revolution in 1989. After some political work in 1990, he became a docent (associate professor) at Charles University and director of the Center for Theoretical Study where he remains today.

His brother Vaclav went on to become president of the Czech Republic.

Friday, August 01, 2003

My Life in Email

When I move back to Chicago, I will go back to my old email address . I got to thinking about how my career can be described by my email addresses.

As an undergrad at Cornell, I spent several years working for computer services writing an email system in assembly language for the IBM 370. The system was scrapped shortly after I left for grad school at Berkeley. After a year at Berkeley, I followed my advisor, Michael Sipser, to MIT.

I had email addresses at Cornell and Berkeley but I have long since forgotten them. At MIT I wanted the userid "lance", but the name was taken by Lance Glasser, then an MIT professor. So my email became .

When I graduated and went to Chicago, I decided to stick with the userid "fortnow" for an email of . This bucked the trend at the time of having first names for email at Chicago so I had to have aliased to . When the university started system wide email I got though also works.

When I did a sabbatical in Amsterdam my email became or simply . When I moved to the NEC Research Institute my email became aliased to and when the NEC Research Institute became NEC Laboratories America I got my current email .

In addition to this, the ACM has created permanent email addresses, permanent as long as you remain an ACM member, and I did create an address though I never did give it out (until now). My brother and I now own the domain fortnow.com and I have what I call my permanent address, . I am also the default receiver for fortnow.com mail, which means that addresses like , or even will all go to me.

All of the email addresses in this post still work and forward to me. But I will stick to using two main email addresses, for work related email and for non-work emails.

I used javascript to generate the emails in this post to avoid adding even more to my heavy spam load. We'll see if it works or whether I start getting spam sent to .

Wednesday, July 30, 2003

Information Markets for Fighting Terrorism

A few months ago I had a post describing information markets, a system of buying and selling securities that pay off if a given future event happens. Based on the price of a security, one can get an estimate of the probability that that event will occur. Studies have shown that information markets are better predictors than polls or experts.

Information markets have taken a blow in the past few days. The US Department of Defense has cancelled a program that would have set up limited futures markets on securities based on terrorist activities. They bowed to pressure from senators who consider it morally wrong to bet on future terrorist attacks. I understand their concerns but computer scientists and economists have produced what could have been a powerful tool in controlling terrorism and it is quite a shame to see it discarded so easily.

David Pennock sent me some links on a more positive point of view from CNN, Fortune and Wired and a fun CNN piece on the Tradesports Poindexter future.

Update (8/1): A well-written New York Times column A Good Idea with Bad Press and a nicely argued opinion piece by David Pennock.

Monday, July 28, 2003

The Tour

I know this is not a sports weblog and I don't even like bicycling but anytime an American named Lance wins a major championship I can't let it go unnoticed.

Thursday, July 24, 2003

The Move

Way back when I was a graduate student, I moved from Berkeley to MIT. I put what few belongings I had into boxes and shipped them via UPS. My brother flew out and we drove across the country together. Those were the days.

Now making the move back to Chicago is not nearly so simple. We have houses to sell and buy. Getting our kids ready for a new school. Real estate agents, lawyers, mortgage and insurance people to deal with. Meanwhile there is academic work that needs to get done before the real move. Conference and grant deadlines don't move to accommodate my move.

So this weblog might get a little spotty until I get settled into Chicago, sometime in mid-September. I'll try to find some time for some posts during that time but don't expect too much. If you are having complexity weblog withdrawal check out the archives. Nice thing about complexity--old stuff doesn't (usually) get stale.

Monday, July 21, 2003

Juntas

Eldar Fischer gave a talk today on his paper on testing juntas. So what is a junta? According to the American Heritage Dictionary, a junta is "a group of military officers ruling a country after seizing power," named after such groups in Central and South America in the 80's. In this paper a junta refers to a function that depends on a constant number of variables.

Back when I was young (those 80's) we had a different name for a function that depends on a constant number of variables: NC0, a special case of NCk, the functions computable by constant fan-in circuits of depth O(logk n). If k = 0 we have constant fan-in and constant depth, which means each output depends on only a constant number of variables.
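For small n one can find a junta's relevant variables by sheer brute force; a quick sketch (the example function is my own — and of course the whole point of property testing is to decide this with far fewer queries than this exhaustive check uses):

```python
from itertools import product

def relevant_variables(f, n):
    """Brute force over all 2^n inputs: variable i is relevant if
    flipping it ever changes f. f is a k-junta iff at most k
    variables are relevant."""
    rel = set()
    for i in range(n):
        for x in product([0, 1], repeat=n):
            y = list(x)
            y[i] ^= 1                    # flip the i-th coordinate
            if f(x) != f(tuple(y)):
                rel.add(i)
                break
    return rel

f = lambda x: x[0] ^ x[3]   # a 2-junta on 5 variables
print(sorted(relevant_variables(f, 5)))   # → [0, 3]
```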

Digression: NC stands for "Nick's Class" named by Steve Cook in honor of Nick Pippenger. Pippenger repaid the favor with the class SC but enough of that for now.

Juntas are just one example I'm seeing of a trend of new names and definitions for concepts defined years ago. Makes me feel old. Of course back then we likely studied concepts we thought were new but might not have been. Just part of the great circle of math.

Wednesday, July 16, 2003

Citeseer

Perhaps the NEC project most valued by computer scientists is Citeseer developed by Steve Lawrence, with help from Lee Giles and Kurt Bollacker and others. Citeseer is a free web-based service that scans the internet for computer science technical reports, parses the citations and cross-links the papers. You can use Citeseer to see, based on citations, which papers are most relevant to a particular topic. You can sort papers or even computer scientists (not necessarily a good thing). Since the papers are cached you can also quickly get hold of old versions of many papers.

With the changes at NEC, I often get asked what will happen to Citeseer. Don't worry--Citeseer will soon have a new safe home and will continue to provide computer scientists with an easy way to track down papers.

Monday, July 14, 2003

Another rump session talk by Scott Aaronson showed that BQP/qpoly is contained in EXP/poly. In other words, everything efficiently quantumly computable with a polynomial amount of arbitrarily entangled quantum advice can be simulated in exponential time with a polynomial amount of classical advice.

Let me try to put this in context while avoiding quantum mechanics. Advice is a method for encoding a different program for each input length. We define the class P/poly as those languages computable in polynomial time with access to a polynomially-long advice string an where the string an depends only on the length of the input. P/poly is equivalent to those problems having nonuniform polynomial-size circuits.
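A toy sketch may help (the "hard" set of lengths below is made up — it stands in for a set that need not be computable at all): the advice depends only on the input's length, never on the input itself, which is exactly what makes every unary language sit inside P/poly.

```python
# One advice bit per input length. In the definition of P/poly the
# advice table need not be computable, only polynomially long.
HARD_LENGTHS = {2, 3, 5, 7, 11}   # hypothetical stand-in for a hard set

def advice(n):
    return 1 if n in HARD_LENGTHS else 0

def decide(x):
    """Decide x in L = {1^n : n in HARD_LENGTHS} in polynomial time,
    given only the advice bit for the length |x|."""
    n = len(x)
    return x == "1" * n and advice(n) == 1

print(decide("111"), decide("1111"))   # → True False
```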

Quantum advice is a bit more tricky, since it can be in a superposition of regular advice strings. Formally, quantum advice is an exponentially long vector of numbers βa where βa is the amplitude of advice string a. For simplicity let us assume those numbers are real and we'll also have the restriction that the sum of the squares of the amplitudes is one.

You can see there are far more ways to give quantum advice than classical advice. But the quantum machines are limited in how they can use the advice. Harry Buhrman asked whether one can give any limit at all to what one can do with quantum advice. Scott Aaronson gives an answer: No better than classical advice as long as you are allowed (classical) exponential time.

Ideally one would like that efficient quantum algorithms with quantum advice can be simulated with efficient quantum algorithms with classical advice. Still Aaronson's result shows that even with fully entangled advice one cannot get all the information out of it.

Friday, July 11, 2003

A Combinatorial Theorem Proven by Kolmogorov Complexity

During the rump session of complexity, Nikolai Vereshchagin presented a combinatorial theorem that he proved using Kolmogorov complexity. Let A be a finite subset of N×N where N is the set of natural numbers. Let m be the size of A, r be the number of nonempty rows of A and c the number of nonempty columns.

We say A is good if every nonempty row has m/r elements and every nonempty column has m/c elements of A. A rectangle has this property, as does a diagonal. We say A is k-good if every nonempty row has at most km/r elements and every nonempty column has at most km/c elements. Note that A is good exactly when it is 1-good.
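These definitions are straightforward to check by brute force; a small sketch (the example sets are my own):

```python
from collections import Counter

def is_k_good(A, k):
    """Check k-goodness of a finite set A of pairs (x, y): every nonempty
    row has at most k*m/r elements and every nonempty column at most
    k*m/c, where m = |A| and r, c count the nonempty rows and columns."""
    m = len(A)
    rows = Counter(x for x, _ in A)   # row index -> number of elements
    cols = Counter(y for _, y in A)   # column index -> number of elements
    return (all(cnt <= k * m / len(rows) for cnt in rows.values()) and
            all(cnt <= k * m / len(cols) for cnt in cols.values()))

rect = [(x, y) for x in range(2) for y in range(3)]   # a 2x3 rectangle
print(is_k_good(rect, 1))   # → True (rectangles are good)
ell = [(0, 0), (0, 1), (0, 2), (1, 0)]                # an L-shape
print(is_k_good(ell, 1))    # → False (row 0 is too heavy)
```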

Vereshchagin's Theorem: There is a constant c such that for all finite subsets B of N×N with n = log |B| there is a partition of B into at most nc sets each of which is nc-good.

Vereshchagin asks whether there is a purely combinatorial proof of this theorem. If you know of one let me know.

For those who know some Kolmogorov complexity, let me sketch the proof: We label each point (x,y) of B with the following five values: KB(x,y), KB(x), KB(y), KB(x|y) and KB(y|x). We partition the points into sets with the same labels. Standard counting arguments from Kolmogorov complexity show that each partition is nc-good for some c.

Wednesday, July 09, 2003

The Rump Session

One of the nice aspects of the Complexity Conference that I don't see at many other conferences is the rump session. Here anyone who wants to, mostly students, gets ten minutes to lay out their latest research.

This year we had one of the best rump sessions ever with five really neat results, listed below. Don't worry if you don't immediately understand the results, I will talk about some of them in more detail in future posts. All but Vereshchagin are students.

1. Nikolai Vereshchagin: A combinatorial theorem proven by Kolmogorov complexity with no known direct combinatorial proof.
2. Troy Lee: For every finite set A and x in A, CNDA(x) ≤ log |A| + O(log3|x|). CND is the nondeterministic version of Sipser's distinguishing complexity.
3. Scott Aaronson: Everything efficiently quantumly computable with a polynomial amount of arbitrarily entangled quantum advice can be simulated in exponential time with a polynomial amount of classical advice.
4. Kristoffer Hansen: Constant-width planar circuits compute exactly ACC0, constant depth circuits with a mod gate for some fixed composite.
5. Samik Sengupta: If co-NP has an Arthur-Merlin game with a polylogarithmic number of rounds then co-NP is in NP with quasipolynomial advice and the exponential hierarchy collapses at the third level.

Monday, July 07, 2003

Coming Home

Howdy from Aarhus, Denmark where I am attending the 18th Annual IEEE Conference on Computational Complexity. As readers of this weblog probably have figured out, this is, by far, my favorite conference. Complexity is smaller and more relaxed than the broader theory conferences like STOC and FOCS. I know most of the participants and have a genuine interest in most of the papers here. While elsewhere theorists try more and more to find connections to applied areas, here we are happy to focus on the fundamental power of efficient computation.

During the rest of the year I attend some conferences in broader areas or in areas that are not in my major focus. But coming to Complexity is coming home, and it always feels great to be back where I belong.

Thursday, July 03, 2003

Complexity Abstracts

In the early years of the complexity conference, some researchers complained that if they had new results or their papers were not accepted in the conference, their results would not be known. So we started up Complexity Abstracts, an electronically available document where anyone can publish a one-page abstract of their work. The abstracts were made available right before the conference.

Last year, the first abstracts editor, Bill Gasarch, resigned his job after 11 years of collecting the abstracts. Personally I thought that in this new internet age, with sites like ECCC, the Complexity Abstracts were no longer as valuable a resource. But at the business meeting of last year's conference the Abstracts were greatly supported and Steve Fenner volunteered to take on the job of editor.

Fenner's first collection has just been posted ahead of the Complexity Conference next week in Denmark.

Wednesday, July 02, 2003

For the Love of Math

A doctor, lawyer and mathematician were discussing whether it was better to have a wife or a girlfriend. The doctor said it was better to have a wife because it is medically safer to have a single partner. The lawyer said it was better to have a girlfriend to avoid the legal hassles of marriage. The mathematician said it was better to have both.

"Both?" said the doctor and the lawyer. "Yes," said the mathematician, "That way the wife thinks I'm with the girlfriend, the girlfriend thinks I'm with the wife and I can do some math."

I was reminded of that joke by the recent New York Times article Pure Math, Pure Joy and the accompanying slideshow. Those pictures look all too familiar.

The greatest lovers of math though are not the famous mathematicians at places like Berkeley and Harvard. Rather the mathematicians who take low-paying jobs with high teaching loads at less-strong colleges or move from visiting position to visiting position just to have some occasional time to do math. They have a dedication (or perhaps an addiction) I can never fully appreciate.

Monday, June 30, 2003

Expanders from Epsilon-Biased Sets

Informally, expander graphs are graphs in which, for any subset S of the vertices that is not too large, the set of vertices connected to S contains a large number of vertices outside of S. There are many constructions and applications for expander graphs, leading to entire courses on the subject.

The adjacency matrix A of a graph G of n vertices is an n×n matrix such that ai,j is 1 if there is an edge between vertices i and j and 0 otherwise. Noga Alon noticed that a graph that has a large gap between the first and second eigenvalue of the adjacency matrix will be a good expander.

We can use ε-biased sets to get expanders. Let S be an ε-biased set for Fm, for F the field of 2 elements. Consider the graph G consisting of 2m vertices labelled with the elements of Fm and an edge from x to y if y=x+s or x=y+s for some s in S. This kind of graph G is known as a Cayley graph.

By looking at the eigenvalues of the adjacency matrix A of G we can show G is an expander. The eigenvectors are just the vectors corresponding to the functions g in L described earlier. For any a in Fm we have

(Ag)(a) = Σs in S g(a+s) = g(a) Σs in S g(s)
since g(a+s) = g(a)g(s). Let g(S) = Σs in S g(s). We now have that
Ag = g(S) g.
So g is an eigenvector with eigenvalue g(S). If g is the constant one function then g(S)=|S|. Since S is an ε-biased set, |g(S)|≤ε|S| for every other g, so the second eigenvalue is much smaller than the largest eigenvalue and G must be an expander.
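As a numerical sanity check (the connection set below is a small hand-picked example, not a genuine ε-biased construction), one can build such a Cayley graph for small m and verify that its spectrum is exactly the multiset of values g(S):

```python
import numpy as np

def cayley_graph_f2(m, S):
    """Adjacency matrix of the Cayley graph on Fm (F the field of
    2 elements): vertices are the 2^m bit strings, x ~ y iff x+y in S."""
    n = 1 << m
    A = np.zeros((n, n))
    for x in range(n):
        for s in S:
            A[x, x ^ s] = 1   # addition in Fm is bitwise XOR
    return A

def character_sums(m, S):
    """The eigenvalues predicted above: for each g in L, here
    g_a(x) = (-1)^{<a,x>}, the eigenvalue is g(S) = sum of g(s)."""
    return sorted((sum((-1) ** bin(a & s).count("1") for s in S)
                   for a in range(1 << m)), reverse=True)

m = 4
S = [0b0001, 0b0010, 0b0100, 0b1000, 0b1111]
eigs = sorted(np.linalg.eigvalsh(cayley_graph_f2(m, S)), reverse=True)
assert np.allclose(eigs, character_sums(m, S))   # spectrum = {g(S)}
print(int(round(eigs[0])), int(round(eigs[1])))  # → 5 1
```

The largest eigenvalue is |S| = 5 (the constant function g) and every other eigenvalue has absolute value at most 3, which is the qualitative gap the argument above predicts.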

Wednesday, June 25, 2003

SIGACT

The June 2003 SIGACT News is out. Aduri Pavan wrote this month's Complexity Theory Column on "Comparison of Reductions and Completeness Notions".

As I have mentioned before in this weblog, I heartily encourage joining SIGACT, the ACM Special Interest Group on Algorithms and Computation Theory. You get the SIGACT News, discounts on conferences and as I discovered last night from home, you apparently get online access to the STOC proceedings. Not to mention supporting the theory community. All this for the low price of $18 ($9 for students).

What about the ACM itself? I have been an ACM member since graduate school since I feel it is important to support the main computer science organization. But for the additional $96 ($42 for students) there are no real significant benefits over joining SIGACT alone.

Tuesday, June 24, 2003

Epsilon-Biased Sets

ε-biased sets are an interesting concept that I have seen recently in a few papers but for which I never had a clear description. At FCRC Eli Ben-Sasson gave me a good explanation and I will try to recreate it here.

Let F be the field of 2 elements 0 and 1 with addition and multiplication done modulo 2. Fix a dimension m. Let L be the set of functions g mapping elements of Fm to {-1,1} with the property that g(x+y)=g(x)g(y). Here x+y represents addition done coordinate-wise modulo 2. One example of a g in L is g(x1,x2,x3)=(-1)x1 (-1)x3.

There is the trivial function g in L that always maps to 1. For every non-trivial g in L exactly half of the elements in Fm map to 1 and the other half to -1. If one picks a reasonably large subset S of Fm at random then with high probability, g will map about half the elements of S to 1 and the rest to -1. In other words, the absolute value of the expected value of g(x) for x uniformly chosen in S is at most some small value ε. If this is true we say S is ε-biased for g.

An ε-biased set is a set S such that for all nontrivial g in L, S is ε-biased for g. Formally this means that

|Σx in S g(x)| ≤ ε|S|.
Not only do reasonable size ε-biased sets exist but they can be found efficiently. Naor and Naor found the first efficiently constructible ε-biased sets of size polynomial in m and 1/ε.
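For small m one can compute the bias of a set directly from the definition; a brute-force sketch (the random set below is just a stand-in for an explicit Naor-Naor construction):

```python
import random

def bias(S, m):
    """Maximum bias of S over the nontrivial g in L, which for the
    field of 2 elements are exactly g_a(x) = (-1)^{<a,x>} for a != 0.
    S is ε-biased exactly when bias(S, m) <= ε."""
    def chi(a, x):
        return -1 if bin(a & x).count("1") % 2 else 1
    return max(abs(sum(chi(a, x) for x in S)) / len(S)
               for a in range(1, 1 << m))

m = 8
# All of Fm is 0-biased: every nontrivial g sums to exactly zero.
print(bias(list(range(1 << m)), m))   # → 0.0
# A random subset is ε-biased for a modest ε with high probability.
random.seed(0)
S = random.sample(range(1 << m), 64)
print(bias(S, m))   # some small value well below 1
```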

One can extend the notion of ε-biased sets to fields F of p elements for arbitrary prime p. L would now be the set of functions g mapping elements of Fm to the complex pth roots of unity, e2π(j/p)i for 0≤j≤p-1 again with the property that g(x+y)=g(x)g(y). Various constructions have created generalized ε-biased sets of size polynomial in m, 1/ε and log p.

For applications let me quote from the recent STOC paper by Ben-Sasson, Sudan, Vadhan and Wigderson that used ε-biased sets to get efficient low-degree tests and smaller probabilistically checkable proofs. You can get more information and references from that paper.

Since the introduction of explicit ε-biased sets, the set and diversity of applications of these objects grew quickly, establishing their fundamental role in theoretical computer science. The settings where ε-biased sets are used include: the direct derandomization of algorithms such as fast verification of matrix multiplication and communication protocols for equality; the construction of almost k-wise independent random variables, which in turn have many applications; inapproximability results for quadratic equations over GF(2); learning theory; explicit constructions of Ramsey graphs; and elementary constructions of Cayley expanders.

Monday, June 23, 2003

Whose hands are on the FOCS cover?

Sometimes I learn interesting aspects of the history of theory from this weblog. On an old post on conference covers, Pekka Orponen left a comment saying that Alvy Ray Smith originally designed the FOCS cover back in 1973, when the conference was SWAT: Switching and Automata Theory. Here is Smith's story about the cover where he states the hands are his.

Smith had three papers in SWAT before moving into computer graphics and co-founding companies such as Pixar. So next time you are looking up a FOCS paper take a look at the hands of a one-time theorist who made a difference.

Wednesday, June 18, 2003

Walter Savitch

After the FCRC meetings I attended were concluded, I headed up to UCSD for a celebration of Walter Savitch's sixtieth birthday and upcoming retirement. He gained his fame in complexity for Savitch's Theorem that shows "P=NP" for space: nondeterministic space s(n) can be simulated in deterministic space O(s2(n)), so PSPACE = NPSPACE.
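The heart of the theorem is a recursive midpoint search; a toy transcription (the graph and parameters are made up), where the recursion depth is O(log steps) and each level remembers only a few vertex names, which is where the squared-space bound comes from:

```python
def reach(adj, u, v, steps, vertices):
    """Savitch's recursion: is there a path from u to v of length at
    most steps? Guess a midpoint w and solve both halves recursively."""
    if steps <= 1:
        return u == v or (steps == 1 and v in adj.get(u, ()))
    half = steps // 2
    return any(reach(adj, u, w, half, vertices) and
               reach(adj, w, v, steps - half, vertices)
               for w in vertices)

# A toy directed path 0 -> 1 -> 2 -> 3.
adj = {0: [1], 1: [2], 2: [3], 3: []}
V = range(4)
print(reach(adj, 0, 3, 4, V))   # → True
print(reach(adj, 3, 0, 4, V))   # → False
```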

I learned quite a bit at the meeting. Walt Savitch was Steve Cook's first student, his only student while Cook was at Berkeley in his pre-Toronto pre-"SAT is NP-complete" days. Also as Cook said, Savitch is the only student he has had with a theorem named after him. That theorem made up a good part of Savitch's Ph.D. thesis. At the celebration Cook gave an overview on propositional proof systems.

After coming to UCSD, Savitch did some work on computational linguistics and one of the leaders of the field, Aravind Joshi, gave a talk on combining trees to keep the structure when parsing sentences.

Savitch is probably best known now in computer science for his textbooks in introductory programming that likely many of you have used.

Congrats Walt on a fine career and here's hoping retirement doesn't slow you down.

Tuesday, June 17, 2003

FOCS 2003

The list of accepted papers for FOCS 2003 has been posted. As usual, many (but not all) of the papers are available on the authors' web sites.

Monday, June 16, 2003

Boosting

As promised I added links to the papers in the post on the STOC business meeting. Let me say some more words on the winner of the Gödel prize.

Valiant developed the concept of PAC (Probably Approximately Correct) learning, roughly as follows: a learner sees a small number of labelled examples from a distribution and with high confidence will generate a hypothesis that with high probability will correctly label instances drawn from the same distribution.

A strong learner achieves accuracy close to 100%; a weak learner achieves accuracy only slightly better than 50%. Schapire, using a technique called boosting, showed how to convert a weak learner into a strong learner. This is a wonderful theoretical result but the algorithm had problems that made it difficult to implement.

In their Gödel prize winning paper, A decision-theoretic generalization of on-line learning and an application to boosting, Freund and Schapire develop the adaboost algorithm that solves many of these issues and has become a staple of the theoretical and practical machine learning community.
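Here is a stripped-down sketch of boosting with decision stumps on one-dimensional data; the dataset is made up, and the details follow the standard AdaBoost recipe rather than the paper's exact presentation:

```python
import math

def adaboost(X, y, rounds):
    """Minimal AdaBoost with decision stumps on 1-D data.
    X: floats, y: +/-1 labels. Returns (alpha, threshold, polarity)
    weak hypotheses h(x) = polarity * sign(x - threshold)."""
    n = len(X)
    w = [1.0 / n] * n            # distribution over the examples
    model = []
    for _ in range(rounds):
        best = None              # find the stump of least weighted error
        for t in sorted(set(X)):
            for pol in (1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if pol * (1 if xi > t else -1) != yi)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        model.append((alpha, t, pol))
        # reweight: mistakes get heavier, then renormalize
        w = [wi * math.exp(-alpha * yi * pol * (1 if xi > t else -1))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    s = sum(alpha * pol * (1 if x > t else -1) for alpha, t, pol in model)
    return 1 if s >= 0 else -1

X = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 1, 1, -1, -1, -1, 1, 1]   # no single stump labels this correctly
model = adaboost(X, y, rounds=5)
print([predict(model, x) for x in X])   # → [1, 1, 1, -1, -1, -1, 1, 1]
```

On this example the best single stump misclassifies two points, while the weighted vote of five stumps gets every training point right, which is the weak-to-strong conversion in miniature.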

Boosting has its own web site where you can find much more information about the algorithms and applications.

Saturday, June 14, 2003

Alonzo Church

Alonzo Church was born a hundred years ago today in Washington, DC. Church is best known for the λ-calculus, a simple method for expressing and applying functions that has the same computational power as Turing machines.

With Rosser in 1936, he showed that λ-expressions that reduce to an irreducible normal form have a unique normal form. In that same year he showed the impossibility of deciding whether such a normal form existed.

Church's thesis, which he stated as a definition: "An effectively calculable function of the positive integers is a λ-definable function of the positive integers."

Again in 1936, Kleene and Church showed that computing normal forms has the same power as the recursive functions and as Turing machines. And thus the Church-Turing thesis was born: Everything computable is computable by a Turing machine.

The λ-calculus also set the stage for many of the functional programming languages like lisp and scheme.
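To give a flavor of the λ-calculus, here are Church numerals transcribed into Python lambdas (a standard textbook exercise, not Church's own notation):

```python
# Church numerals: the number n is the function that applies f n times.
zero = lambda f: lambda x: x                      # λf.λx. x
succ = lambda n: lambda f: lambda x: f(n(f)(x))   # λn.λf.λx. f (n f x)
plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    """Decode a Church numeral by applying 'add one' n times to 0."""
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)
print(to_int(plus(two)(three)))   # → 5
```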

Alonzo Church passed away on August 11, 1995 in Ohio.

Friday, June 13, 2003

Thoughts on FCRC

I have mixed feelings about the Federated Computing Research Conference. It is a good idea to get many different areas of computer science together. I do get to see many people I haven't seen in years who went into non-theoretical areas of CS.

On the other hand 2200 participants made the place quite crowded and it seemed to take away from the informal atmosphere of most theory conferences. Since STOC and Electronic Commerce had nearly a complete overlap I jumped back and forth between talks never really feeling fully part of either conference.

For the first time the Complexity conference was not part of FCRC because 2003 is a Europe year for Complexity. In an informal poll I took of STOC people interested in complexity most liked having both conferences at the same place but would rather that happen in isolation, like last year in Montreal, rather than as part of the much larger FCRC meeting.

In what seems to be a trend in CS conferences, wireless internet was made available at the conference site. As you walked around you would pass many people sitting on chairs and on the ground hunched over their laptops disconnected from the conference and connected into another world. Seemed a bit depressing but I too found the net hard to resist--it is always tempting to simply open my laptop and connect, checking email and posting to this weblog.