Thursday, December 30, 2004

Complexity Year in Review

Theorem of the year goes to Omer Reingold, who showed Undirected Connectivity in Logarithmic Space, settling a long-standing open problem. Also of note are Ran Raz's lower bounds on multilinear formulas and a series of papers on extracting randomness from independent sources, which I mentioned earlier, as well as new improvements by Raz. We had many other great results throughout the year; check out the ECCC to sample many of them.

The National Science Foundation and computer science funding face a challenging future. We have seen the end of the ITR program that poured much-needed money into IT research. CISE is reorganizing (putting complexity theorists and numerical analysts into the same program, for example) as it struggles to meet the new reality of CS funding. To add to the burden, the NSF saw an actual decrease in its funding for FY 2005.

The decrease in foreign graduate students at US schools made news over the past year. The difficulty of obtaining visas since the terrorist attacks in 2001 certainly plays a role, but I give more weight to stronger local alternatives staffed by many of those foreign students we have educated over the last several decades.

We lost three prominent members of our community: Shimon Even, Carl Smith and Larry Stockmeyer.

Outside of complexity we had an election that divided this country (but not the scientists), the Red Sox winning the world series and an ongoing struggle in Iraq. Unfortunately the year ends with a terrible natural disaster and we all should support the efforts to help those affected recover from this tragedy.

Tuesday, December 28, 2004

Copyrights on Conference and Journal Papers

A guest post from Alan Selman

It is a common and acceptable practice to present a preliminary version of a paper at a conference and then to submit the full paper to a refereed journal. The Foreword of a recent STOC proceedings states, for example,

The submissions were not refereed, and many of these papers represent reports of continuing research. It is expected that most of them will appear in a more polished and complete form in scientific journals.
That is the ideal. In practice, many of us have been in the habit of putting complete versions of accepted papers in conference proceedings. This is not a wise strategy. Assuming that the publisher of the conference proceedings holds the copyright on the papers that appear in the proceedings, the paper that one submits to a refereed journal must be substantially different from the one that appeared in the conference proceedings in order to avoid copyright infringement. I asked my contact at Springer, the publisher of the journal Theory of Computing Systems, how different the journal version of a paper should be from the conference version, and was told to use 25% as an acceptable guideline. This is a reasonable response that should not be too difficult for theorists to manage; the difference can be made up with proofs and explanations that were not included in the conference version. I should add, however, that Lance Fortnow asked the same question of the IEEE, the copyright holder for papers that appear in the Conference on Computational Complexity, and the response was far more onerous. Lance's contact at IEEE called for 30% to 70% new material, with material from the original paper there only to support the new ideas. I would rather abide by Springer's recommendation than IEEE's. The essential point is that the journal version must be essentially different from the conference version.

Computation and the Socio-Economic Sciences

Fred Roberts, Rakesh Vohra and I are co-chairing the DIMACS Special Focus on Computation and the Socio-Economic Sciences, a series of workshops and other activities designed to bring together computer scientists and social scientists. The focus has already hosted some good workshops on topics like Electronic Voting and Auctions. Let me put in a plug for a couple of upcoming workshops.

At the end of January, Richard McLean, Daijiro Okada and I are organizing a workshop on Bounded Rationality at Rutgers. Bounded rationality looks at game theory questions, such as equilibria, with computationally bounded participants.

Immediately following the Bounded Rationality workshop is the workshop on Markets as Predictive Devices (Information Markets). The organizers Robin Hanson, John Ledyard and David Pennock have created a program with a set of speakers that spread the range from theory to practice in markets designed to estimate the probability of future events. The program ends with a panel discussion on the Policy Analysis Markets that came under fire a year and a half ago for its "terror futures".

In mid-April, Rakesh Vohra and I are organizing a workshop on Large-Scale Games to be held at Northwestern. This workshop will examine questions in game theory that deal with issues in large scenarios like the internet with many players, incomplete information, lack of common knowledge, asynchrony and related problems.

We have two other scheduled workshops in the focus, Polyhedral Combinatorics of Random Utility in June and Yield Management and Dynamic Pricing in August, and many more workshops under development.

Sunday, December 26, 2004

Tragedy in Asia

Chennai, the location of the recent FSTTCS conference reviewed in the last post, was one of the areas hit hard by the tsunami caused by today's earthquake. The globalization of science really makes these tragedies seem close, even to those of us on the other side of the planet.

Our condolences and prayers go to those in all areas affected by today's events.

Thursday, December 23, 2004

FSTTCS 2004 at Chennai

By Prahladh Harsha and Jaikumar Radhakrishnan

It was nice to be back in Madras after a long time and even nicer to meet friends from MIT and elsewhere during this year's Foundations of Software Technology and Theoretical Computer Science Conference. FSTTCS is the longest-running international conference on computer science in India, and is organized under the aegis of the Indian Association for Research in Computing Science (IARCS).

The 24th edition was held in Chennai between Dec 16 and Dec 18. The conference was preceded by two workshops, on Algorithms for dynamic data (Dec 13-14) and Logics for dynamic data (Dec 15). Here are a few statistics about this year's FSTTCS: 38 accepted papers (out of 178 submitted from 38 countries) and an attendance of about 175 (nearly 35% from outside India). The invited speakers were Pavel Pevzner (UCSD), Javier Esparza (Stuttgart), Piotr Indyk (MIT), John Reynolds (CMU) and Denis Therien (McGill). The proceedings of the conference were published by Springer in the LNCS series.

FSTTCS attracts papers in Logics & Semantics and Algorithms & Complexity. Biased by our interests, we attended only the presentations on Algorithms & Complexity. In our view, the highlights were:

  • M Fürer, S P Kasiviswanathan, An almost linear time approximation algorithm for the permanent of a random (0-1) matrix.
  • A K Ponnuswami, H Venkateswaran, Monotone Boolean circuits for bipartite perfect matching require exponential size.
  • L Rademacher, S Vempala, Testing geometric convexity. This paper demonstrates that testing whether a given set S in R^n is convex or far from convex (in the property-testing sense) can be done with a number of queries independent of the actual size of the set S, though exponential in the dimension of the space (i.e., n). The set S is presented to the tester by a membership oracle and a random oracle.
  • J Hitchcock, A Pavan, Hardness hypotheses, derandomization and circuit complexity. This paper shows that the NP-machine hypothesis is implied by both the Measure hypothesis and pseudo-NP hypothesis. These 3 hypotheses are some of the commonly used hypotheses in derandomization. Furthermore, the authors show that several of the derandomization and circuit lower-bound results that are known to follow from the latter two hypotheses also follow from the NP-machine hypothesis.
  • J Kempe, A Kitaev, O Regev, The complexity of the local Hamiltonian problem. The local Hamiltonian problem is a natural complete problem for the complexity class QMA, the quantum analog of NP. It was known that the 1-local Hamiltonian problem is in P while the k-local Hamiltonian problem is QMA-complete for k ≥ 3. This paper settles the case of the 2-local Hamiltonian problem and shows that it is also QMA-complete. Thus the k-local Hamiltonian problem is similar in spirit to MAX-k-SAT, which is NP-complete for k ≥ 2.
Some authors could not attend the conference. The paper Minimum weight pseudo-triangulations by J Gudmundsson and C Levcopolous was presented beautifully by Mark de Berg. The paper No, Coreset, No Cry by S Har-Peled was presented by Kasturi Varadarajan. The author did not get a visa in time from the Chicago consulate (there are some advantages to traveling on an Indian passport) and had sent a recording of his presentation. In the end, Kasturi did a wonderful job using only the slides, but one was left wondering what the video looked like.

Prof. Rani Siromoney, who turned 75 in July 2004, was honored in a special session at the conference. Prof. Siromoney is a senior and highly respected theoretical computer scientist in India who works in the areas of Formal Languages, Automata Theory, Machine Learning and DNA Computing. She has nurtured several generations of researchers through her teaching and collaborations. In her speech, she thanked her colleagues at the Madras Christian College for their support and encouragement, and mentioned, among other things, that she never faced any gender-based discrimination at that institution.

Tuesday, December 21, 2004

Favorite Theorems Recap

In 1994, I listed My Favorite Ten Complexity Theorems of the Past Decade in conjunction with an invited talk at that year's FST&TCS conference. Over this past year in this weblog I went back and picked my favorite ten complexity theorems of the decade since that first paper. The purpose of these endeavors is not just to have a top ten list but to use these great results to highlight several areas in computational complexity where we have seen significant progress in recent years. Ten areas do not do justice to complexity and we also have great results in proof complexity, Kolmogorov complexity, constructions of expanders and extractors, zero-knowledge proofs and structural complexity.

We'll do this again in ten.

Sunday, December 19, 2004


Coming in January to the American television network CBS: a series about an FBI agent who recruits his brother, a math genius, to help solve crimes. I don't hold much hope for realism in Numb3rs (anyone remember Q.E.D.?), but any show that portrays mathematicians in a good light helps shape public perception of our field.

I would like to know the "actual events" that inspired this series.

Friday, December 17, 2004


One of my Indian graduate students mentions the university "yookla" and gets confused when I talk about Texas (not UTA) or Illinois (as opposed to UIUC, which is too easily confused with UIC).

So for the foreigners out there, here is a quiz for the common names for some major American universities, which all native-born Americans know because of their sports teams.

I could go on forever. Sometimes the name depends on where you say it. In Illinois we have Northern, Southern, Western, Northeastern as well as Northwestern.

Thursday, December 16, 2004

Quality Matters

A commenter yesterday asked about a Crooked Timber post arguing that many strong departments place more emphasis on hiring students from other strong departments than on their actual publication record. The Crooked Timber post focuses on the social sciences; the hard sciences can and do put more focus on the research itself.

In computer science, and particularly in theoretical computer science, we can judge the quality of a student's research more objectively than we could a paper in, say, sociology. It does not matter where you got your Ph.D.: if you didn't have strong results as a graduate student, you won't find employment at a top university. And if one does have solid results from a lesser-known school, one can find a top academic job, though admittedly this happens less often.

I won't deny a strong correlation between where you got your Ph.D. and where you get employed: The best undergraduates choose the best graduate schools. At the strong schools you can usually find stronger faculty as potential advisors, stronger fellow students to work with and push you and you will have letter writers with stronger credentials when you do enter the job market. There is a social network among stronger schools that we cannot ignore. And some places, particularly those without experts in an area, will put too much emphasis on a student's department.

But we can better measure the quality and importance of a student's research and impact and that always plays the most critical role in whether one offers that person a position in a particular field.

Tuesday, December 14, 2004

Decryption by Clancy

Consider the following paragraphs from Tom Clancy's 1998 novel Rainbow Six.
The phone they spoke over was the Russian version of the American STU-3, the technology having been stolen about three years before by a team working for Directorate T of the First Chief Directorate. The internal microchips, which had been slavishly copied, scrambled the incoming and outgoing signals with a 128-bit encryption system whose key changed every hour, and changed further with the individual users whose personal codes were part of the insertable plastic keys they used. The STU system had defied the Russians' best efforts to crack it, even with exact knowledge of the internal workings of the system hardware, and they assumed that the Americans had the same problems—after all, for centuries Russia had produced the world's best mathematicians, and the best of them hadn't even come up with a theoretical model for cracking the scrambling system.
But the Americans had, with the revolutionary application of quantum theory to communications security, a decryption system so complex that only a handful of the "Directorate Z" people at the National Security Agency actually understood it. But they didn't have to. They had the world's most powerful supercomputers to do the real work. They were located in the basement of the sprawling NSA headquarters building, a dungeonlike area whose roof was held up with naked steel I-beams because it had been excavated for just this purpose. The star machine there was one made by the company gone bankrupt, the Super-Connector from Thinking Machines, Inc., of Cambridge, Massachusetts. The machine, custom built for NSA, had sat largely unused for six years, because nobody had come up with a way to program it efficiently, but the advent of quantum theory had changed that, too, and the monster machine was now cranking merrily away while its operators wondered who they could find to make the next generation of this complex machine.
I doubt anyone could turn a souped-up Connection Machine into a quantum computer. But one could imagine some smart NSA scientists finding a way to simulate a variation of Shor's quantum factoring algorithm on such a device, well enough to break moderately long RSA codes. Just thinking.

Monday, December 13, 2004

Favorite Theorems: Derandomizing Space

November Edition

We started this list of favorite theorems with derandomization of time classes. Now we end the list by looking at derandomizing space-bounded classes.

Over the last fifteen years we have seen several exciting results on removing randomness from space-bounded computation. Unlike the time-bounded case, these results require no hardness assumptions, but until very recently they did not achieve full derandomization.

The techniques of Savitch's Theorem give us that randomized logarithmic space sits in deterministic log^2 space. We will focus on results that reduce the space needed to simulate randomized algorithms, and the space needed for the most well-known randomized space algorithm, undirected s-t connectivity.
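The recursion behind Savitch's Theorem is easy to sketch in code. The toy Python version below (my own naming and graph representation, not taken from any of the papers discussed) asks whether v is reachable from u in at most 2^k steps by trying every vertex as a midpoint; the recursion depth is about log n and each frame remembers only a constant number of vertex names, which is where the log^2 n space bound comes from.

```python
# Savitch-style reachability: reach(u, v, k) asks whether v is reachable
# from u in at most 2^k steps, by guessing a midpoint w and recursing on
# two half-length subproblems. Recursion depth is k (about log2 n), and
# each frame stores O(log n) bits, giving the O(log^2 n) space bound.

def savitch_reach(adj, u, v, k):
    """True iff v is reachable from u in at most 2**k steps in graph adj."""
    if u == v:
        return True
    if k == 0:
        return v in adj[u]
    # Try every vertex as the midpoint of a path of length at most 2^k.
    return any(savitch_reach(adj, u, w, k - 1) and
               savitch_reach(adj, w, v, k - 1)
               for w in adj)

# Toy undirected graph: path 0-1-2-3, with vertex 4 isolated.
g = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: set()}
print(savitch_reach(g, 0, 3, 2))  # True: path 0-1-2-3 has length 3 <= 2^2
print(savitch_reach(g, 0, 4, 3))  # False: vertex 4 is unreachable
```

Of course a real log^2-space machine would not use Python's call stack, but the midpoint-guessing pattern is the same.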

Aleliunas, Karp, Lipton, Lovász and Rackoff showed that one can solve s-t connectivity in randomized logarithmic space by taking a random walk on the graph. Nisan, Szemerédi and Wigderson give a log^{3/2}-space algorithm for s-t connectivity. Saks and Zhou give the same bound for every problem in randomized log space.

BPHSPACE(S) ⊆ DSPACE(S^{3/2}) by Michael Saks and Shiyu Zhou

Saks and Zhou's result remains the best known bound for randomized logarithmic space, but we can do better for s-t connectivity. Armoni, Ta-Shma, Wigderson and Zhou improved the s-t connectivity bound to log^{4/3} space. And just a few weeks ago we had Reingold's result putting s-t connectivity in the optimal logarithmic space, currently a front runner for the next favorite theorems list in 2014.

Friday, December 10, 2004

Can We Fairly Teach and Grade?

Academic bloggers (like Jeff and Suresh) are up in arms about a column by Ailee Slater in the University of Oregon student paper.
We are currently paying a large amount of money to attend this University and receive an education. If I have paid to be taught something, shouldn't there be a repercussion for the teacher rather than, or at least as well as, the student when knowledge has not been taught?

Although teachers cannot be responsible for the self-failings of their students, it still seems unfair that they are allowed to judge how much a particular student is learning. I pay the teacher to teach me, and then I get slapped with the label of failure if the teacher deems that I haven't learned the correct information?

If students only paid for an education, then why would grades matter? Whether I give you an A or a D won't affect how much you actually learned over the quarter. But students also want a certificate of the quality of that education (a diploma and a transcript) to further their future careers. No legitimate university would give such certification just for tuition money; it needs to verify that the students have indeed learned. That's why we have a grading system.

But Ailee does have a good point: I do have a conflict grading the students based on what I tried to teach them. Most of us try our best to grade fairly but deep down we know if a hard-working student hasn't mastered the material perhaps we should have taught differently. We often tend to overcompensate for this bias leading to grade inflation.

Unfortunately, other than standardized exams, I see no other reasonable alternatives of evaluating students. So professors need to grade students as fairly as they can and students need to acknowledge that education is not just a right but also a responsibility even when they are paying for it.

Thursday, December 09, 2004

A Map of Complexity Classes

Iowa State graduate student Chad Brewbaker created a graphical map of the complexity classes in the Complexity Zoo. Who says you can't have fun with complexity?

Wednesday, December 08, 2004

The Other Nigerian Scam

Volunteer to organize a conference and I guarantee you will get an email like the following
I am a computer scientist from Nigeria. I am interested in attending your conference. I come from a very poor university and I am in need of funds and/or please write me a letter for visa.
They play to our vanity and our desire for diversity. Wow, Africans interested in computational complexity! Don't be fooled, in nearly every case they have no interest in computer science, they just want an easy way to come to Europe, the US or Canada.

There is that small chance a poor African student really is interested in complexity so I typically reply to such letters by

Thank you for your interest in our conference. Please send us your latest CV and a statement explaining your reasons for attending the meeting and we will be happy to consider your request.
Not once have I heard from the letter writer again.

Monday, December 06, 2004

The NSF Gets Some Attention

The National Science Foundation rarely gets mentioned in the popular press. After the recent budget cut, the NSF now gets some attention, including a New York Times article and editorial, Thomas Friedman's column, and an editorial in the San Jose Mercury News. Perhaps taking a cut this year will help the NSF in the long run, as Congress should no longer feel it can cut science funding without anyone noticing.

However, the US budget process takes the greater part of a year and we need to put pressure at many points, from the initial proposed budget from the White House to the various committees to the final vote in the fall. Complaining after the fact alone will not help the NSF but instead we need to make the case during each part of the process particularly when the final budget bills get approved starting in September.

Sunday, December 05, 2004

When Average Case is the Worst Case

Birthday Boy Rocco Servedio gave a talk at TTI on Friday on learning random functions over random inputs. John Langford asked about distributions that put more weight on simpler strings, those of lower Kolmogorov complexity. That reminded me of a nifty 1992 result of Ming Li and Paul Vitányi.

Fix your favorite programming language and let K(x) be the length of the smallest program generating string x. Define μ(x) = 2^{-K(x)}, known as the universal distribution for reasons I won't get into here. Notice how strings of smaller complexity get more weight.

Theorem (Li-Vitányi) Any algorithm using time T(n) on average over the universal distribution uses time O(T(n)) in the worst case.

Li and Vitányi prove their theorem by showing that if the set of worst-case inputs is small, those inputs have a short description and are thus more heavily weighted under μ. Read all the details in their paper.
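Since K(x) is uncomputable, running code can only illustrate the idea. In the sketch below I substitute the zlib-compressed length of a string for K(x), a crude computable upper bound that is purely an assumption for illustration, not part of the Li-Vitányi proof; it still shows how regular strings receive far more weight under a distribution of the form 2^{-K(x)}.

```python
# Toy illustration of the universal distribution mu(x) = 2^(-K(x)).
# K(x) is uncomputable, so we use the zlib-compressed length (in bits)
# as a crude, computable upper-bound stand-in -- an assumption made
# purely for illustration.
import zlib

def approx_K_bits(x: bytes) -> int:
    """Compressed length in bits: a computable stand-in for K(x)."""
    return 8 * len(zlib.compress(x, 9))

def approx_mu(x: bytes) -> float:
    """Weight of x under the (approximated) universal distribution."""
    return 2.0 ** (-approx_K_bits(x))

simple = b"0" * 1000                                  # very regular string
messy = bytes((i * 37 + 11) % 256 for i in range(1000))  # far less regular

# The regular string compresses to far fewer bits, so it gets more weight.
print(approx_K_bits(simple) < approx_K_bits(messy))  # True
```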

Thursday, December 02, 2004

Bad Timing

Texas student Vladimir Trifonov gives An O(log n log log n) Space Algorithm for Undirected Connectivity. Had it come out a few months earlier (assuming the proof is correct), it would have been an amazing result, greatly improving the previously best known O(log^{4/3} n) bound. Unfortunately for Trifonov, Omer Reingold announced his result giving the stronger O(log n) space algorithm a few weeks earlier. Trifonov's work was done independently of Reingold's.

Trifonov's proof uses techniques based on PRAM algorithms for connectivity completely different from Reingold's zig-zag methods.

Wednesday, December 01, 2004

Use Google to Search Your LaTeX Files

In the month of November 44.3% of the hits on this website came from windows machines. If you are in this minority and don't mind a small amount of hacking, read on.

GDS Plus allows Google Desktop Search to search text files. Use this GDSPlus.conf, where I've added the "tex" and "bib" extensions, and your LaTeX and BibTeX files will be indexed and searchable. A great benefit for those of us who can't keep track of which theorems and definitions we put in which papers.

Monday, November 29, 2004

Suggestions to Program Committees

A program committee member who asked me to review a paper pointed me to Oded Goldreich's Suggestions to Program Committees, one of his Essays and Opinions. In his suggestions he gives a reasonable interpretation to the usual ten point scoring system. A few minor quibbles: I would never score a paper a ten (which represents unattainable perfection) nor would I ever resign from a program committee, certainly not over a single paper.

The ends of the scale are not so important; a PC does not need to distinguish the excellent papers from the merely great, nor the merely bad from the truly lousy. Instead a PC must distinguish the pretty good papers from the other pretty good papers.

Later on Oded strongly supports using "ex-submission information" in evaluating submissions, even to the point of actively contacting authors for clarification during the review process. While PC members do not live in a vacuum and will be exposed to results they later have to review, actively contacting authors is patently unfair, as it favors authors with whom you have a working relationship. Authors who cannot give enough information in the submission for the PC to properly evaluate the work deserve to have their paper rejected.

Sunday, November 28, 2004

The State of CISE

While we have seen an increase in computer science funding over the past few decades, it has not kept up with the great increase in the number of researchers in our field, making it difficult for many computer scientists to find funding. Peter Freeman, who heads the Computer and Information Science and Engineering (CISE) directorate of the National Science Foundation, co-authored this piece in the current Computing Research News describing how CISE funding has changed over the past decade.

A companion piece written by the CISE division directors describes how CISE is adapting to the large number of proposals. The article emphasizes some reorganization and making connections to other funding centers in and outside the NSF. Until CISE sees a large budget increase, too many computer scientists will remain unfunded or underfunded, which will hamper CS research in the long term.

Wednesday, November 24, 2004


Yesterday a program committee member sent one of my submissions to one of my current students to review. Heh Heh. I'll make sure he gives an unbiased positive review.

A good excuse to talk about conflicts of interest. You should not review, edit or referee a paper if one of the authors is

  1. a relative and/or someone you are romantically involved with, or
  2. a member at your institution.
The second because departments use major conference publications for bragging rights and we have seen some abuse in the past.

The NSF has more stringent restrictions for reviewing proposals. If we used those restrictions for paper reviews, such as no close collaborators and no former students or advisors, there might not be anyone left to properly evaluate the paper. If restrictions (1) and (2) don't apply to you, you should state any potential unknown conflicts, but feel free to say whether the paper soars like an eagle or gobbles like a complete turkey.

Speaking of turkey–Have a great Thanksgiving everyone!

Monday, November 22, 2004

NSF Budget

Over the weekend the US Congress passed an omnibus spending bill for the fiscal year that started on October 1. The CRA has a roundup of the budget for the National Science Foundation and other funding agencies. Bottom line: NSF down 1.9% overall, with a 0.7% drop for "research and related activities."

It's going to be another tough year for grants.

Sunday, November 21, 2004

Letting Go

You just submitted your paper, working hard right up to the final deadline hour. Congratulations! Now what?

Forget about it. Oh you should distribute the paper: make it a technical report; email it to friends and family; put it on your web page and send it to an archive site. But don't keep working on it. All the tension you had getting the paper finished for the deadline needs a release. And no matter how much effort and time you continue to put into it, your paper will never be perfect.

Instead catch up on the todo's you've been ignoring. Say hi to the friends you have deserted while you bunkered down working. Take some time for you: catch a movie; eat a slow meal at a nice restaurant; take a walk.

Afterwards get back to research so you have another paper to write for the next deadline. You know you have successfully put that first paper out of your mind when you get caught off guard from the email from the PC chair letting you know your paper is

  • accepted: Great. Now you have to fix the paper up again for the proceedings deadline.
  • rejected: No problem. Now you have to fix the paper up again for the next conference deadline.
And it starts anew.

Friday, November 19, 2004

Zig-Zag Connectivity

The Zig-Zag Graph Product (Reingold-Vadhan-Wigderson) gave a new way to create expander graphs, graphs on n vertices with constant degree such that for some ε>0 and every subset of the vertices S with |S|≤n/2, |S∪N(S)|≥(1+ε)|S| where N(S) is the set of neighbors of vertices of S.
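For tiny graphs the expansion condition can be checked directly by brute force. The sketch below (my own code, exponential in n, so only for toy examples) computes the largest ε for which every set S with |S| ≤ n/2 satisfies |S ∪ N(S)| ≥ (1+ε)|S|, and confirms that the 3-dimensional hypercube, a weak but genuine expander, has ε > 0.

```python
# Brute-force check of the vertex-expansion condition in the definition:
# every S with |S| <= n/2 must satisfy |S ∪ N(S)| >= (1+eps)|S|.
# Exponential in n, so only usable on toy graphs.
from itertools import combinations

def vertex_expansion(adj):
    """Largest eps such that |S ∪ N(S)| >= (1+eps)|S| for all |S| <= n/2."""
    n = len(adj)
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for S in combinations(range(n), size):
            covered = set(S)            # S together with its neighbors N(S)
            for v in S:
                covered |= adj[v]
            best = min(best, len(covered) / size - 1)
    return best

# The 3-cube: vertices 0..7, edges between vertices differing in one bit.
cube = {v: {v ^ (1 << b) for b in range(3)} for v in range(8)}
print(vertex_expansion(cube) > 0)  # True: the cube is a (weak) expander
```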

The zig-zag expander construction was not as simple as previous constructions nor did it have as good an expansion property. It did have one thing the other constructions lacked: a simple and beautiful proof of expansion.

The zig-zag construction had another property, a compact representation of the expander from the original graph. Reingold used this property to convert an arbitrary undirected graph to an expander in logarithmic space, the basis of his new result giving a log-space algorithm for undirected connectivity.

Why did George Lucas wait so long between the third and fourth Star Wars movies? He wanted the technology of movie making to catch up to his vision for the movie. Computational Complexity can also tell this story. We hit a wall a decade ago in reducing the space needed to solve graph connectivity. We needed a new technology (zig-zag product) and someone (Reingold) realizing they could use this technology to solve the problem.

Thursday, November 18, 2004

Google Scholar

Google has just launched a new search engine for academic papers. Google has received permission from some publishers to allow searching through papers that would normally not be available for Google to crawl. Based on random checking, it seems ACM, for example, is allowing Google to see their papers but Elsevier is not.

Wednesday, November 17, 2004

How about Informaticians?

A dinner conversation with my daughters.

Annie (age 9): How many computer scientists are there in the world?
Me (Wildly Guessing): About 100,000.
Molly (age 6): I know there are at least two. You and our computer teacher at school. Someone asked him if he was a scientist and he said, "yes, I am a computer scientist."

I don't think Molly's computer teacher tried to mislead; he just did a play on words. But it was enough to fool a six-year old and I suspect many adults as well. Maybe we need a new name for people in our field.

Monday, November 15, 2004

The Polar Express

I went with my family to see the movie The Polar Express. This movie blurs the line between acting and animation. Tom Hanks played several characters, including a young boy, by wearing sensors on his face and body that recorded his expressions, which were then used for the computer-generated characters on the screen. I had trouble explaining to my children (and my wife) how Tom Hanks played the boy even though someone else did the voice, whereas in a more traditional computer-animated movie like The Incredibles, the actors doing the voices get the credit.

We saw The Polar Express on an IMAX screen in 3-D. Since the entire movie is stored on computers, converting the film to 3-D could be an automated process. I predicted to my unbelieving children that when they grow up they will not be able to distinguish between computer-generated films and those filmed live. Perhaps all movies then will be stored in some specific format, like papers in PDF today, that can be easily converted and optimized for whatever display device a person has.

Unless you have small children, I would recommend The Incredibles over The Polar Express. Technology does not win over substance. But I have seen the future and it is cool.

Friday, November 12, 2004

Does Google Make Us Lazy?

The search wars are brewing. Microsoft has started a beta test of its new search engine. Yahoo keeps improving its searching as well, and Google does not stand still, doubling the number of pages it indexes and launching its new desktop search (which is rather useless to me because it doesn't index LaTeX files). In addition, search engines now do more than just web searching; for example, you can directly use dictionaries, maps, time, a calculator and more with the right shortcuts on Google, Yahoo and Microsoft. What does it mean, though, when we need a manual to use a search engine?

All this searching power leads to the mistaken belief that the best way to find anything is by a direct search. For example many people look for a paper by Googling on the name of the paper. Usually that does lead to a version of the paper to download. But Google searches the deep recesses of the internet and often returns old versions of papers, even technical reports well after conference and journal versions have appeared. Google says "Don't be evil"; I say "Don't be lazy." If you have a reference to a paper in a particular conference or journal, search for that conference or journal and find the paper there if you have access. Or use a site like DBLP. Otherwise look on the author's web pages; good authors will keep versions of their papers up to date. If all this fails you can fall back on Googling the name of the paper. Working off the newest version of the paper will save you far more than the extra fifteen seconds you'll spend searching for it.

Wednesday, November 10, 2004

Undirected Connectivity in Log Space

Omer Reingold has shown Undirected s-t Connectivity in Deterministic Log-Space. This settles a long-standing open question for a problem previously known to be in randomized logarithmic space (Aleliunas-Karp-Lipton-Lovász-Rackoff) and deterministic log^(4/3) space (Armoni-Ta-Shma-Wigderson-Zhou). Wow!

In layman's terms, suppose you have a big maze, far too big to fit into your computer's memory. You can only ask questions of the form: given two locations A and B, are A and B directly connected? Reingold's result shows that you can determine whether there is a path from the entrance to the exit using only an amount of memory equivalent to storing a constant number of maze locations.
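Before Reingold's derandomization, the standard small-memory solution was the Aleliunas et al. random walk: wander randomly and, with high probability, a polynomial number of steps covers your whole connected component. Here is a toy Python sketch of that earlier randomized approach in the maze query model above; the function names and parameters are my own illustration, not from the paper.

```python
import random

def connected(n, adjacent, s, t, steps):
    """Randomized s-t connectivity in the spirit of Aleliunas et al.:
    walk randomly from s.  In an undirected graph on n vertices, a walk
    of polynomially many steps visits every vertex in s's component with
    high probability.  The walker only remembers its current location
    (plus scratch space for one adjacency query at a time)."""
    cur = s
    for _ in range(steps):
        if cur == t:
            return True
        nbrs = [v for v in range(n) if adjacent(cur, v)]
        if not nbrs:
            return False  # isolated vertex: nowhere to go
        cur = random.choice(nbrs)
    return cur == t

# A tiny "maze": a path 0-1-2-3, plus an isolated location 4.
edges = {(0, 1), (1, 2), (2, 3)}
adj = lambda a, b: (a, b) in edges or (b, a) in edges
print(connected(5, adj, 0, 3, steps=10000))  # True with overwhelming probability
print(connected(5, adj, 0, 4, steps=10000))  # False: location 4 is unreachable
```

Reingold's result removes the randomness entirely while keeping the logarithmic memory bound.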

Tuesday, November 09, 2004

How Did We Survive Before the Internet?

A grad student asked me how we managed before the internet, specifically in relation to submitting to conferences. I cannot completely answer that question as I started graduate school in 1985, a few years after the birth of the internet, when most computer scientists already reliably read email. But let me explain how it worked when I was a student.

In the second half of the 80's we generally still did not distribute papers electronically and the world wide web remained several years away. Nearly everyone in theory subscribed to the Theorynet mailing list (not so true today) and calls for papers were emailed to this list as well as sent out to all SIGACT members by postal mail. To submit to a conference we had to make ten copies of a paper and physically send them to the program committee chair, who then sorted the papers into ten stacks and sent one stack to each program committee member. A few months later the program committee would meet and choose the papers for the conference, sending out accept/reject letters through postal mail. Since MIT always had one or more PC members on the faculty we found out about our papers from them well before we got the official word.

Initially the STOC and FOCS PCs did not take deadlines seriously. For some reason around 1987 they decided to strictly enforce the deadline, first by postmark and then by receipt. Federal Express made considerable money from us theorists and I can still tell you which Fed Ex offices in Boston and Chicago remained open the latest. One year an MIT faculty member hired a same-day service to send the papers to the PC chair, giving us at MIT an extra night to work on our papers. Not many of us showed up for classes the next day.

Electronic submissions, starting in the early 90's, leveled the playing field and made the process slightly more sane. The STOC call still has the line

Authors who cannot submit electronically must send 21 printed copies (double-sided preferred) of an extended abstract, together with a cover letter, to:
When I served as the Complexity PC chair in 1999 exactly one person sent me their submission by Federal Express. "Am I going to have to send a copy of this paper to each of my PC members?" I thought. Instead I emailed the author and luckily he sent me a PostScript file so I could manage the PC electronically. We have since removed the non-electronic submission instructions from the Complexity call.

Monday, November 08, 2004

Favorite Theorems: Quantum Lower Bounds

October Edition

Alice and Bob have subsets of {1,…,n}. How much communication do Alice and Bob need to determine whether their subsets have an empty intersection? In classical communication complexity, even with probabilistic players, Alice and Bob need Ω(n) bits of communication to solve this Set Disjointness Problem.

What if we allow communication with quantum bits? A clever protocol by Buhrman, Cleve and Wigderson based on Grover's quantum search algorithm shows how Alice and Bob can solve Set Disjointness communicating with (up to log factors) n^(1/2) quantum bits. Could Alice and Bob do better? The best known lower bounds were only about log n until Razborov showed the Buhrman-Cleve-Wigderson protocol was essentially optimal.

Quantum Communication Complexity of Symmetric Predicates by Alexander Razborov.
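For intuition about what is being counted, here is the trivial classical protocol: Alice simply ships her entire set. This small Python sketch (the names are mine, purely illustrative) shows that n+1 bits always suffice, which is exactly the cost the Ω(n) lower bound says you essentially cannot beat classically.

```python
def disjoint_trivial(alice, bob, n):
    """Trivial one-way protocol for Set Disjointness over {0,...,n-1}:
    Alice sends her characteristic vector (n bits), Bob replies with one
    bit.  Total cost n+1 bits -- essentially the best possible
    classically, by the Omega(n) lower bound."""
    message = [1 if i in alice else 0 for i in range(n)]  # Alice -> Bob
    disjoint = not any(message[i] and i in bob for i in range(n))
    return disjoint, len(message) + 1  # (answer, bits communicated)

print(disjoint_trivial({1, 3}, {0, 2}, 4))  # (True, 5): the sets are disjoint
```

The quantum protocol beats this by never shipping the whole vector, querying it Grover-style instead.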

In another important paper on quantum lower bounds, Andris Ambainis developed the adversary method for proving lower bounds on quantum query complexity. This method gave researchers a new tool for proving stronger and sometimes tight bounds on the number of queries a quantum algorithm needs to make to its input to solve a problem like inverting a permutation. Ambainis' work formed the basis for many other papers applying and improving the techniques, including work recently presented at Dagstuhl.

Friday, November 05, 2004

Public Referee Reports

From Paquet via Nielsen comes an idea of publicly posting referee reports. Let's kill this idea quickly.

A review of the usual referee process: The editor-in-chief of a journal receives a submission and assigns a member of the editorial board to act as editor for that paper. The editor will contact potential referees and ask them to review the paper. The referees read the paper and write a report on whether to recommend the paper for publication and give suggested changes. They send this report to the editor who relays this information to the authors without revealing the identities of the referees. The authors will comment on the referee reports and revise the paper. The paper often goes back to the referees and the process repeats until the editor is happy with the paper or decides the paper is not worthy of the journal. The editor then sends his recommendation back to the editor-in-chief.

An argument for posting the reports goes as follows: If we make the reports public (though still anonymous) referees will feel a greater responsibility to make their report thorough and accurate.

The process of refereeing requires considerable back and forth conversation between the three parties: the authors, the editor and the referees. Posting the original reports will give a misleading view of the process and will cause referees to act far too cautiously about pointing out problems in a paper.

We have an editor responsible for making sure that he has enough quality reports to properly make a decision about publication. Using the public as an editor will often lead to some very difficult and messy situations.

Keep in mind the distinction between the referee process and reviews of published papers. We should encourage publicly available reviews of research papers whether in Mathematical Reviews or informal settings like weblogs. These reviews help us find the gems among the sea of research papers out there.

Wednesday, November 03, 2004

Life Goes On

Our department, like most other academic departments, felt a strong sense of depression today. Just remember there are still theorems to prove, papers to write, talks to give, students to advise, meetings to attend and classes to teach. Life goes on and so must we.

Memo to the future 2008 STOC PC Chair: Don't even think of scheduling the STOC deadline immediately following the elections again. There is only so much stress one can take in a week.

Tuesday, November 02, 2004

Vote Complexity

Complexity theorists have an important choice to make this week. That's right, I'm talking about whether to send your papers to STOC or Complexity. So let me put on my hat as chair of the Complexity conference committee and tell you why you should be thinking about our conference.

With the great growth in theory the STOC conference can now accept only a small number of papers in computational complexity theory. For the same reason we have seen many strong papers appear in Complexity over the past several years, from a quite broad range of complexity topics. Don't take my word for it; look at our last three conferences.

With your help we can add even more strength to the Complexity conference and continue to make it the primary venue for exposition of results in computational complexity. So submit your papers and I look forward to seeing you in San Jose.

And after you submit your paper to either conference, I highly recommend also sending the paper to ECCC. You will get quick publication and more exposure than any single conference.

I'm Lance Fortnow and I approve this message.

Sunday, October 31, 2004

Election Day

A true story on Election Day 2000: An Israeli postdoc in the US came to work and said "I have watched the conventions and seen the debates. I have studied the platforms and as much news analysis as I could get hold of. After serious consideration I decided that, if I were allowed to vote, I would vote for Bush." An American computer scientist walked in soon thereafter and said "I woke up this morning and decided to vote for Nader." Draw your own moral.

With the US elections on Tuesday and politics on everyone's mind, let's open the comments for anyone who has anything they want to say about the presidential race. Get it off your chest. I only ask you to be civil. And don't forget to vote.

Friday, October 29, 2004

The Co-Author Conundrum

In theoretical computer science we traditionally list the co-authors of our papers alphabetically. Done this way for "fairness," it leads to a binary notion of authorship: either you are an equal author of a paper or you are off the paper. There is no middle ground.

In our publish or perish society, authoring papers helps you succeed, in getting hired, promoted and receiving grants and awards. So choosing who is an author of a paper, particularly important papers, can be an important and sometimes messy decision complicated by the fact that the authors have to do the choosing.

An author should have made significant contributions to a paper. But how do we define significant? A person who produces key ideas in the proof of a main result certainly becomes an author. A person who simply writes up the proof should not. But what about the person who works out the messy but straightforward details in a proof? What about the person who poses the questions but has no role in the proof? These are tricky situations that one needs to handle on a case-by-case basis.

An advisor should hold him or herself to a higher standard. A good advisor guides the research for a student and should not become a co-author unless the advisor has contributed the majority of the important ideas in the proofs. Likewise we hold students to a slightly lower standard to get them involved in research and the exposition of their work.

Computer scientists tend to add co-authors generously. While seemingly nice, this makes it difficult to judge the role authors have played in a paper, and sometimes makes who you know or where you are more important than what you know.

Tuesday, October 26, 2004

Nothing Like a Deadline to Ruin Your Day

Some upcoming deadlines: STOC (11/4), Complexity (11/18), Electronic Commerce (12/7), the new NSF program Theoretical Foundations (1/5), and ICALP (2/13). Feel free to comment if I've missed something.

Since computer science takes its conferences more seriously than the journals and most conferences have hard deadlines, we have become a deadline-driven field. Most authors submit their papers on the deadline day, if not in the last hour or two. Papers get written to meet a deadline which has both good and bad aspects: good in that papers get written and bad in that they get written quickly and often incompletely.

Sometimes conference organizers see a lack of submissions (forgetting that most papers come on deadline day) and extend the deadline by a few days or a week. I've often heard people complain about losing their weekends if a deadline moves from Friday to Monday. Why? You could still submit on Friday. People feel their papers are never complete and that they need to keep fixing them up until the last possible second, even though these last-minute changes will not likely affect acceptance.

One person I knew once turned down an opportunity to attend a workshop because of a grant deadline on the same week. This was six months beforehand. A little planning is all that's needed to submit the grant one week early but some in our field cannot pull this off even months ahead of time.

Remember that deadlines are upper bounds, no shame in submitting early. And it's not the end of the world if you miss a deadline; there's always another conference with another deadline right around the corner.

Monday, October 25, 2004

What Would the Martians Think?

In Bill Gasarch's post last week, he discusses what makes a problem natural. You used to hear the argument that a complexity class was natural if it contained an interesting problem not known to be contained in any smaller class. But then would SPP be natural simply because it contains graph isomorphism? On the other hand I find BPP a natural way to define probabilistic computation even though it fails this test. Does a class go from natural to unnatural if a new algorithm for a problem is found?

I prefer to use the Martian rule. Suppose we find a Martian civilization at about the same level of scientific progress that we have. If they have a concept the same as or similar to one of ours, then that concept would be natural, having developed through multiple independent sources.

Of course we don't have a Martian civilization to compare ourselves with so we have to use our imagination. I firmly believe the Martians would have developed the P versus NP question (or something very similar, assuming they haven't already solved it), making the question very natural. On the other hand I suspect the Martians might not have developed a class like LWPP. Other classes like UP are less clear; I guess it depends on whether the Martians like their solutions unique.

Applying the Martian rule to Gasarch's post, WS1S is probably more natural than regular expressions with squaring that equal Σ*. At least my Martians would say that.

Saturday, October 23, 2004

Favorite Theorems: Learning Circuits

September Edition

Computational Complexity and Computational Learning share many aspects and goals. We both analyze and compare different models of computation, either to understand what they can compute or to find the appropriate model to fit some data. It is no surprise that many of the tools developed in one area have uses in the other. This month's paper exemplifies the connection; it uses tools from complexity to get a nice learning theory result which in turn has very nice implications in complexity theory.

Oracles and Queries that are Sufficient for Exact Learning
Nader Bshouty, Richard Cleve, Ricard Gavaldà, Sampath Kannan and Christino Tamon

The main result shows how to probabilistically learn circuits with equivalence queries and an NP oracle. An equivalence query, given a circuit C for a language L we are trying to learn, either answers that C is correct or returns an input on which C fails. The proof uses a clever divide-and-conquer argument built upon Jerrum, Valiant and Vazirani's technique for random sampling with an NP oracle.
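The Bshouty et al. algorithm itself needs the NP oracle, but the equivalence-query model is easy to illustrate on a much simpler class. Here is a toy Python sketch (my own example, not their algorithm) that exactly learns monotone conjunctions from equivalence queries alone, using the classic trick that every counterexample must be a positive example of the target.

```python
from itertools import product

def learn_conjunction(n, eq_query):
    """Exactly learn a monotone conjunction from equivalence queries.
    Start with the AND of all n variables; every counterexample must be
    positive for the target, so drop each variable that is 0 in it.
    At most n+1 queries are needed."""
    hypothesis = set(range(n))

    def h(x):
        return all(x[i] for i in hypothesis)

    while True:
        counterexample = eq_query(h)
        if counterexample is None:  # teacher says: hypothesis is correct
            return hypothesis
        # Keep only variables set to 1 in the positive counterexample.
        hypothesis &= {i for i in range(n) if counterexample[i]}

# Hypothetical target: x0 AND x2 over four variables.
target = {0, 2}

def eq_query(h):
    """Toy teacher: brute-force search for a disagreement (fine for tiny n)."""
    for x in product([0, 1], repeat=4):
        if h(x) != all(x[i] for i in target):
            return x
    return None

print(sorted(learn_conjunction(4, eq_query)))  # [0, 2]
```

Bshouty et al. replace the brute-force teacher with the equivalence oracle and use the NP oracle to do the heavy lifting for circuits.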

Köbler and Watanabe observe that if SAT has small circuits we can answer equivalence queries to SAT with an NP oracle. Applying Bshouty et al. they show that if SAT has polynomial-size circuits then the polynomial-time hierarchy collapses to ZPP^NP.

Sengupta noticed that older results give the stronger consequence that PH collapses to S2^p if SAT has small circuits. This strengthens Köbler and Watanabe because of Jin-Yi Cai's result showing that S2^p is contained in ZPP^NP. Cai's paper uses many techniques similar to Bshouty et al. Shaltiel and Umans have a recent result giving weaker assumptions for derandomizing S2^p by derandomizing random sampling.

If SAT does not have small circuits, the Bshouty et al. algorithm produces a small set of inputs that gives a co-NP proof of this fact. Pavan, Sengupta and I applied this fact to the two queries problem.

Thursday, October 21, 2004

Go Sox!

Saturday Evening, October 25, 1986: I huddled with about a dozen of my fellow MIT graduate students (and a couple of faculty) watching game six of the baseball World Series in a Toronto hotel room right before the start of FOCS. The Boston Red Sox led by two runs with two out and none on in the bottom of the tenth against the New York Mets. One more out and the Sox would win their first championship since 1918.

The Red Sox didn't win the series that year and failed to return until this year. After an amazing comeback against their rivals, the New York Yankees, the Red Sox will host the first game of the World Series on Saturday against the St. Louis Cardinals.

By far baseball is the favorite team sport among American computer scientists (at least those who care about sports at all). Why? Maybe because it's a discrete game with a small state space. At Fenway Park (Boston's home field) they use lights to give the number of balls, strikes and outs in unary notation. The game has many nice mathematical properties, and not just the myriad of statistics. For example, it is a theorem of baseball that at any point in a half inning the number of batters is equal to the sum of the number of outs, the number of runs scored and the number of men on base. Proof by induction.
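That induction can be checked mechanically. Here is a small Python simulation (a toy model of a half inning, my own construction) that asserts the invariant after every plate appearance: each batter ends up in exactly one of the three categories, and runners only ever move from "on base" to "scored."

```python
import random

def half_inning_invariant(trials=1000):
    """Check the 'theorem of baseball': at every point in a half inning,
    batters = outs + runs + runners on base.  Toy model: each batter
    makes an out or hits for 1-4 bases, and all runners advance the
    same number of bases."""
    for _ in range(trials):
        outs, runs, runners, batters = 0, 0, [], 0  # runners hold base numbers
        while outs < 3:
            batters += 1
            bases = random.choice([0, 0, 1, 1, 2, 3, 4])  # 0 means an out
            if bases == 0:
                outs += 1
            else:
                runners = [r + bases for r in runners] + [bases]
                runs += sum(1 for r in runners if r >= 4)  # crossed home
                runners = [r for r in runners if r < 4]
            assert batters == outs + runs + len(runners)
    return True

print(half_inning_invariant())  # prints True: the invariant never fails
```

The assertion is the induction step: each plate appearance raises the batter count by one and the right-hand side by exactly one, however many runners score.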

The real reasons I love baseball are less tangible. Both a team sport and a one-on-one contest between pitcher and batter. A strategic game dealing with balancing probabilities. Suspense on every pitch. And much more.

By far the plurality of baseball fans in our field seem to root for the Red Sox. Probably because most of us spent at least part of our academic career in the Boston area and Boston takes its baseball far more seriously than any other city. In full disclosure, my favorite team is the Chicago White Sox but I root for the Red Sox in their absence.

Nothing beats attending a baseball game live, especially in Fenway. Alas I never managed to attend a World Series game though I've come very close.

October 14, 1992: The Pittsburgh Pirates won the National League East and the World Series was scheduled to open during FOCS in Pittsburgh. I wrote for and got tickets to the first game if Pittsburgh made the series. In the NLCS Atlanta scored three runs in the bottom of the ninth of game 7, meaning Atlanta and not Pittsburgh would host the series. When Cabrera hit the single scoring those final two runs, I sat staring at the TV and cried.

Wednesday, October 20, 2004

Conversations with Bill

A guest post from William Gasarch

Why is it hard for us to explain to the layperson what we do? The following true story is telling. I will label the characters MOM and BILL.

MOM: What kind of thing do you work on?

BILL: (thinking: got to give an easy example)
Let's say you have n, er 1000 numbers. You want to find the — (cut off)

MOM: Where did they come from?

BILL: Oh. Let's say you have 50 numbers, the populations of all of the states of America, and you want to find — (cut off)

MOM: Did you include the District of Columbia?

BILL: No.

MOM: Why not?

BILL: Because it's not a state. But for the example it doesn't matter because — (cut off)

MOM: But they should be a state. They have been oppressed too long and if they had their own state then — (cut off)

BILL: But none of that is relevant to the problem of finding the Max of a set of numbers.

MOM: But the problem of Statehood for DC is a more important problem.

BILL: Okay, let's say you have 51 numbers, the populations of the 50 states and the District of Columbia.

MOM: What about Guam?

BILL: I should have majored in Government and Politics…

To the person on the street the very definition of a problem is … problematic. The abstraction that we do without blinking an eye requires a conceptual leap that is hard, or at least unfamiliar, for most people.

Even people IN computer science may have a hard time understanding what we are talking about. Here is another real-life story between two characters who I will call BILL and DARLING. DARLING has a Master's degree in Computer Science with an emphasis on Software Engineering.

DARLING: Bill, can you give me an example of a set that is provably NOT in P.

BILL: Well, I could say HALT but you want a computable set, right?

DARLING: Right.

BILL: And I could say that I could construct such sets by diagonalization, but you want a natural set, right?

DARLING: Right.

BILL: And I could say the set of true statements in the language WS1S, but you want a natural set.

DARLING: What is WS1S?

BILL: Weak monadic second-order logic with one successor. But I think you can agree that if you don't know what it is then it's not natural.

DARLING: Right. So, is there some set that is natural and decidable that is provably not in P?

BILL: AH, yes, the set of regular expressions with squaring that equal Σ* is EXPSPACE complete and hence provably not in P.

DARLING: Why is that problem natural?

BILL: Good Question! A problem is natural if it was being worked on before someone classified it.

DARLING: Okay. What is the first paper this problem appeared in?

BILL: It was in THE EQUIVALENCE PROBLEM FOR REGULAR EXPRESSIONS WITH SQUARING REQUIRES EXPONENTIAL SPACE by Meyer and Stockmeyer, from FOCS 1972. Oh. I guess that proves that it's NOT natural.

This story raises the question: what is natural? Do we need someone to have worked on a problem beforehand to make it natural? Is it good enough that they should have worked on it? Or that they could have? Logic has the same situation with the Paris-Harrington theorem, a result from Ramsey theory that is not provable in Peano Arithmetic, but the first time it was proven was in the same paper that proved it was not provable in PA.

Incidentally, there are more natural problems that are not in P. Some games on n by n boards are EXPSPACE-complete or EXPTIME-complete and hence not in P. That would have been a better answer, though it would not have made as good a story.

Tuesday, October 19, 2004

More FOCS News

Fair and balanced coverage from Adam Klivans

To answer one of Lance's previous posts, the Internet is definitely harming conferences: almost everyone who stayed up until 5 AM to watch the Red Sox beat the Yankees in 14 innings on MLB.TV did not make it to James Lee's talk at 8:30 AM on embeddings (in fact I think I'm the only one who did make it here). Krauthgamer, Lee, Mendel, and Naor presented a powerful new method for constructing embeddings of finite metric spaces, called measured descent, which, among other things, implies optimal volume-respecting embeddings (in the sense of Feige).

I checked the registration numbers and indeed only 172 people have officially registered for the conference—that's 100 fewer than the registration at STOC in Chicago.

Yesterday I mentioned the winners of the best paper award. I should also mention the best student paper award winners: Lap Chi Lau's An Approximate Max-Steiner-Tree-Packing Min-Steiner-Cut Theorem shared the prize with Marcin Mucha and Piotr Sankowski's Maximum Matchings via Gaussian Elimination. Lau's paper gives the first constant-factor approximation for finding a maximum collection of edge-disjoint trees, each connecting a given set of terminal vertices in an undirected multigraph. Mucha and Sankowski give a nice method for finding maximum matchings in general graphs in time O(n^ω) where ω is the matrix multiplication exponent. Lovász showed how to test whether a graph has a perfect matching using matrix multiplication, and Mucha and Sankowski extend this to actually recover the matching.
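Lovász's test is simple enough to sketch. The Tutte matrix of G puts an indeterminate x_ij in position (i,j) and -x_ij in (j,i) for each edge (zero elsewhere); it is nonsingular as a matrix of indeterminates exactly when G has a perfect matching. Substituting random values modulo a large prime and computing one determinant gives a fast randomized test. The Python below is my own illustration of that test, not Mucha-Sankowski's algorithm, which goes further and recovers the matching itself.

```python
import random

def has_perfect_matching(n, edges, p=2_147_483_647, trials=5):
    """Lovász's randomized test: substitute random values mod the prime p
    into the Tutte matrix and test whether the determinant is nonzero.
    A nonzero determinant certifies a perfect matching; if none exists,
    the determinant is identically zero, so the test never errs that way."""
    for _ in range(trials):
        T = [[0] * n for _ in range(n)]
        for i, j in edges:
            x = random.randrange(1, p)
            T[i][j], T[j][i] = x, (-x) % p  # skew-symmetric substitution
        if det_mod(T, p) != 0:
            return True
    return False  # (with small error probability) no perfect matching

def det_mod(a, p):
    """Determinant mod prime p by Gaussian elimination with pivoting."""
    a = [row[:] for row in a]
    n, d = len(a), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c] % p), None)
        if piv is None:
            return 0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d  # row swap flips the sign
        d = d * a[c][c] % p
        inv = pow(a[c][c], p - 2, p)  # modular inverse via Fermat
        for r in range(c + 1, n):
            f = a[r][c] * inv % p
            for k in range(c, n):
                a[r][k] = (a[r][k] - f * a[c][k]) % p
    return d % p

# A 4-cycle has a perfect matching; a path on 3 vertices cannot.
print(has_perfect_matching(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # True (w.h.p.)
print(has_perfect_matching(3, [(0, 1), (1, 2)]))  # False
```

Since the determinant can be computed with fast matrix multiplication, this already gives the O(n^ω) decision version; the new work matches that bound for finding the matching.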

Valiant's talk on Holographic Algorithms was well attended: he described a new, quantum-inspired method for constructing polynomial-time algorithms for certain counting problems. The algorithms are classical (no quantum required) and give the first efficient solutions for problems such as PL-Node-Bipartition: find the cardinality of a smallest subset of vertices V' of a max-degree-3 planar graph such that deletion of V' (and its incident edges) causes the graph to become bipartite. At the end he gave a simple criterion for proving P = P^#P via these techniques!

Monday, October 18, 2004


Adam Klivans reports from Rome.

Rome is the host city for this year's FOCS conference. While everyone enjoys visiting one of the world's great capitals, attendance at the sessions can occasionally suffer, and the sessions this year do seem noticeably smaller. Another explanation could be the high cost of traveling to and staying in Rome. On the plus side, I get to see many European theorists whom I had known by name only.

For those who did make the trek to the southern tip of the Villa Borghese, the first day featured the presentation of the two results which won best paper: Subhash Khot's Hardness for Approximating the Shortest Vector in Lattices and Applebaum, Ishai, and Kushilevitz's Cryptography in NC^0. Subhash was an author of two other impressive hardness results in the same session: Ruling Out PTAS for Graph Min-Bisection, Densest Subgraph and Bipartite Clique (the title is self-explanatory) and Optimal Inapproximability Results for Max-Cut and Other 2-variable CSPs? (with Kindler, Mossel, and O'Donnell), which gives evidence that the Max-Cut approximation algorithm of Goemans-Williamson is the best possible.

The cryptography session featured the above Cryptography in NC^0 paper, which Lance mentioned in an earlier post, as well as an intriguing result due to Salil Vadhan, An Unconditional Study of Computational Zero Knowledge, showing how to establish important properties of computational zero-knowledge proofs without assuming the existence of one-way functions.

The controversial topic of what to do with the special issue of FOCS continued at last night's business meeting. It appears as though Elsevier will lose another opportunity to publish a special issue of STOC/FOCS, as a vote last night indicated a strong desire to give SICOMP the responsibility instead (a similar thing occurred at STOC this year).

Saturday, October 16, 2004

Some Dagstuhl Presentations

At Dagstuhl Manindra Agrawal presented recent work of his students Neeraj Kayal and Nitin Saxena (the trio that showed a polynomial-time algorithm for primality testing) on rings given by a matrix describing the actions on the base elements. They show a randomized reduction from graph isomorphism to ring isomorphism and from factoring to #RI, counting the number of ring isomorphisms. They also show a polynomial-time algorithm for determining if there are any non-trivial automorphisms of a ring and that #RI is computable with an oracle for AM∩co-AM. Agrawal conjectured that #RI is computable in polynomial time, a conjecture that would imply factoring and graph isomorphism have efficient algorithms.

We also saw a series of presentations by Andris Ambainis, Robert Špalek and Mario Szegedy. Ambainis described his improved method for showing lower bounds for quantum algorithms that provably beats the degree method. Špalek talked about his work with Szegedy showing that Ambainis's techniques, as well as different tools developed by Zhang, by Laplante and Magniez, and by Barnum, Saks and Szegedy, all give the same bounds. Szegedy, in his presentation, called this measure the div complexity and showed that the size of a Boolean formula computing a function f is at least the square of the div complexity of f.

Wednesday, October 13, 2004

Is the Internet Harming Dagstuhl?

Dagstuhl was designed as a place to bring a small group of researchers to an isolated environment where they could give some talks, discuss research and otherwise socialize among themselves free from other distractions. There were no televisions, though a radio was bought to hear news during the 1991 Gulf War. We could get two-day-old news from America via the Herald Tribune. While they had computer rooms, in the early days we had no world wide web and email was far less used. Instead we had rooms for coffee, rooms for beer and wine, rooms for billiards and music and rooms just to hang out. Everyone stayed on premises and we had no phones in rooms, just a couple of communal phones to call home.

Now that Dagstuhl has expanded, rooms not only have phones but there is WiFi throughout. We can answer email, read news and write weblog posts (as I am doing now) from the comfort of our own isolated desks. We're watching baseball games and the debates over the internet. But worse than being connected, the rest of the world knows we're connected. I find myself having to take time to fix problem sets for my class and deal with departmental issues, as do many of my other colleagues here.

The internet has greatly helped science by bringing us closer together, but it also prevents us from being disconnected, losing many of the advantages of these workshops. A sign here proclaims "Are you here for computer networking or human networking?" Something to remember next time you go to a conference.

Tuesday, October 12, 2004

Back in Dagstuhl

I'm back in Dagstuhl for the workshop on Algebraic Methods in Computational Complexity. The roof looks great. I have attended Dagstuhl workshops for over twelve years now, starting with the workshop on Structure and Complexity Theory in 1992. I have seen Dagstuhl expand and evolve over these years and this is the first time I feel that Dagstuhl has achieved its completed state. I love coming here; Dagstuhl has a contained environment in a pretty but boring part of Germany where we complexity theorists give seminars, eat and drink together and talk science and other stuff. Politics and baseball seem to dominate the discussions this week.

A group of German software engineering professors share Dagstuhl with us this week. They are meeting to discuss future directions of German software engineering research and to find ways to increase student enrollment. The drop in students desiring a computer science degree is not just an American phenomenon.

Monday, October 11, 2004

A Celebration of Women in Computing

A report from Varsha Dani.

The Grace Hopper Celebration of Women in Computing was held in Chicago on October 6-9. This was the fifth such conference since its inception in 1994. This year there were over 800 attendees from all over the country.

This conference is a forum for discussion of issues faced by women and a showcase for achievements of women in the fields of computing and technology. There were a number of talks on social issues, some technical presentations by young investigators, and a few invited technical talks. There were also a number of social and networking events hosted by IBM, Microsoft, Google and others.

Among the invited talks, there were three I particularly enjoyed.

Jessica Hodgins of CMU talked about connections between ideas in robotics and computer graphics and animation, especially simulation of human movements.

Cynthia Dwork of Microsoft Research spoke about the problem of publishing (transformed) data from public databases (such as census data) so as to maintain a balance between the utility of the published information and the protection of the privacy of individuals represented by the data. Her approach to privacy is influenced by ideas from cryptography.

Daniela Rus of MIT spoke about self-reconfiguring robots. These robots are distributed systems, consisting of a number of identical modules which can dynamically adapt the way that they are connected to each other to best fit the task at hand.

Thursday, October 07, 2004

NP-Completeness for a 9-Year Old

My nine-year-old daughter had a homework problem with the following diagram.

She had no problem solving questions like: Beginning and ending at the entrance, describe the shortest route you can take that will allow you to see 4 different kinds of animals.

"You're doing computer science," I said.

"I don't see any computers," she responded.

"Computer science is problem solving like finding the shortest path."

"Then computer science is pretty easy."

"OK, is there a way to visit every animal exactly once?"

She found this question much more challenging.
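The two questions differ enormously in computational difficulty: a shortest route can be found quickly with breadth-first search, while the only known general approaches to the visit-every-animal-exactly-once question (a Hamiltonian path) essentially try all orderings. Here is a minimal sketch of both, using a made-up zoo map as the graph (the map and all names are invented for illustration):

```python
from collections import deque
from itertools import permutations

def shortest_path_length(adj, start, goal):
    """Breadth-first search: the easy question, solvable in polynomial time."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None  # goal unreachable

def has_hamiltonian_path(adj):
    """Try every ordering of the animals: the hard (NP-complete) question."""
    nodes = list(adj)
    return any(all(b in adj[a] for a, b in zip(path, path[1:]))
               for path in permutations(nodes))

# A made-up zoo map: which exhibits are connected by walkways.
zoo = {
    "entrance": {"lions", "seals"},
    "lions": {"entrance", "monkeys"},
    "seals": {"entrance", "monkeys"},
    "monkeys": {"lions", "seals"},
}
print(shortest_path_length(zoo, "entrance", "monkeys"))  # 2
print(has_hamiltonian_path(zoo))  # True
```

The brute-force search works fine on a homework-sized zoo but blows up exponentially as the map grows, which is exactly what makes the second question hard.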

Groups versus Departments

In the US the terms Assistant Professor, Associate Professor and Professor represent different stages in one's career but they all play a similar role in research and advising students. An assistant professor is nobody's assistant.

The names get their meaning from a structure you still see in many other countries (Germany is a good example). There you have research groups, where a lead professor has nearly complete control over hiring and the budget. The equivalents of assistant and associate professors are the temporary and permanent faculty members of those groups.

How does this affect graduate studies? In Germany a grad student joins a group and works within that group. In the United States a student joins a department usually without a specific advisor in mind and often not initially committing to a specific subfield of computer science.

So to those who send me and other American computer scientists requests to join our groups, the US system doesn't work that way. Instead go to the departmental web page and follow the appropriate links to find information on how to apply to that department. If you have a specific researcher that you want to work with, use the personal statement to say this and your reasons for it.

Trust me, we read the applications carefully and choose Ph.D. candidates as best we can. It doesn't help to send personal requests; I just point to our web page and delete the email.

Wednesday, October 06, 2004

Holy Trefoils, Math Fans!

Can one use a comic book and a toy to teach a complicated subfield of mathematics? Why knot?

Tuesday, October 05, 2004

Larry Stockmeyer Commemorations

Ron Fagin asked me to announce two public commemorations of Larry Stockmeyer and his work.

The first will be held at the IBM Almaden Research Center on Monday, October 25, 2004.

The second will be held in conjunction with STOC '05 in Baltimore on May 21-22, 2005.

Please join the community in honoring the memory of one of the great complexity theorists.

Sunday, October 03, 2004

Are we too nice?

Steve Smale talked about his experiences in the Economics Theory Workshop at Chicago, particularly the aggressive questioning. I didn't attend his talk but I did go to a few of the econ theory seminars years ago, and they form an interesting contrast to CS theory talks, which draw a few, usually technical, questions followed by polite applause.

The econ theory seminar took place in a medium-size conference room with a long table. Graduate students sat in chairs along the walls. The speaker was at one end of the table and econ professors, usually including multiple Nobel prize winners, sat around the rest of the table. A copy of the paper was sent with the talk announcement, and almost from the first slide the faculty aggressively attacked the speaker with questions about the validity of the model or the importance of the results. (Remember, this was the theory seminar; imagine the questions at the political economics seminar.) At the end of the seminar time, the talk ended and everyone left the room. No applause.

I don't recommend that we follow this model in theoretical computer science. However, we usually go to the other extreme and (outside of crypto) rarely ask negative questions in a seminar. Typically the only negative feedback we get in our field comes from anonymous referees and reviewers. If we were forced to defend our research in an interactive setting, we would develop a better understanding of the importance of our own models and results.

Thursday, September 30, 2004

The Specialization of Computer Science

I heard the following from a senior economist recently.
A researcher at the beginning of his career has to please others. In order to receive a Ph.D., get a job and eventually earn tenure, he has to produce not only good research but research of value to others. The researcher might think that once he gets tenure he can do the research he wants, but by that time he has become so specialized it is impossible to change direction.
Looking at economics can give us a glimpse into the near future of computer science as a discipline. Though economics as a field has been around for a very long time, only since the 1950's did economics develop as a rigorous mathematical discipline. This gives economics about a 15-year head start on computer science.

When I started grad school in the mid-80's, I could follow nearly every talk at the major theory conferences, STOC and FOCS, though not every computer science talk, something that might have been possible 10-15 years earlier. These days I can understand the importance of the main results of maybe half of the talks and have actual interest in only a small fraction of those. The growth of specialty conferences such as Complexity, SODA (algorithms), SoCG (computational geometry), Crypto, RANDOM/APPROX, COLT (learning theory), LICS (logic in CS), QIP (quantum) and so on has only increased this divide. We get less and less crossbreeding between these subfields of theory.

On the other hand, some researchers still can and do (with some effort) change research areas in theory and general computer science even after tenure. But in 15 years will we look like economics, where a late change of field is difficult to impossible, and soon after that like mathematics, where it often takes years of graduate school just to understand the major open questions in an area?

Tuesday, September 28, 2004

Computer Science's Newest Genius

Congratulations to Daphne Koller, this year's MacArthur Fellow in computer science. Koller uses a strong mathematical approach to decision making and learning.

Monday, September 27, 2004

Republicans and Democrats on Science Research

From a comment on my last post.
I think most computer scientists, even conservatives vote Democrat for one reason. Democrats fund the NSF, and the NSF gives us fat paychecks.
From discussions I have had with other computer scientists, I don't find science funding a major factor in their voting decisions. On top of that, the premise doesn't hold water. I went and computed the average yearly increase in the NSF budget during the tenures of the last several presidents.
  • Carter, 7.9%
  • Reagan, 11.0%
  • Bush Sr., 10.6%
  • Clinton, 7.6%
  • Bush Jr., 9.1%
The Democratic and Republican platforms have similar goals in scientific research though the Republican platform goes into more detail. From the Democratic Platform:
We will invest in the technologies of the future, from renewable energy to nanotechnology to biomedicine, and will work to make permanent the research and development tax credit. We will achieve universal access to broadband services, which could add $500 billion to our economy, generate 1.2 million jobs, and transform the way we learn and work. And we will put science ahead of ideology in research and policymaking.
The Republican Platform takes two pages to give the same ideas (except for that last sentence). Here is the section on Research and Development.
America's economy is undergoing a fundamental transition from one based primarily on manufacturing to one based on innovation, services, and ideas. Two-thirds of America's economic growth in the 1990s resulted from the introduction of new technology and 60 percent of the new jobs of the 21st century require post-secondary education, yet only one-third of America's workforce has achieved that level.

In order to maintain America's global leadership, Republicans have provided unprecedented support for federal research and development to help spur innovation. Federal R&D funding is up 44 percent from 2001 to $132 billion in 2005, which includes a 26 percent increase in support for basic research. The President has doubled the budget for the National Institutes of Health and increased the National Science Foundation budget by 30 percent. President Bush and the Republican Party also support making the R&D tax credit permanent.

The rapid pace of technological development demands that we remain on the leading edge of innovation and science. Republicans are committed to providing the investment and incentives needed to foster next generation technologies. The 21st Century Nanotechnology Research and Development Act, passed by a Republican Congress and signed by President Bush, increased funding for nanotechnology research. In addition, the President has dedicated $1.7 billion over five years to develop hydrogen fuel cells and related next-generation energy technologies. The President's support for NASA and vision for space exploration will also enhance scientific development and technological breakthroughs.

In short, the parties do not differ much on future research investment. Both platforms also push science education. The Republicans have had the better historical record on science funding, but Bush has come under fire for ignoring science in policymaking. Better not to worry about science and use other factors in your choice of president.

Sunday, September 26, 2004

The Blue State of Science

I usually avoid politics in this weblog but I cannot totally ignore the US presidential election happening slightly more than a month from now. But nothing I say would make much difference; nearly all computer scientists, and I expect most scientists, will vote for Kerry (or would vote for Kerry if they were US citizens). It's not based on a single issue. If we had never gone to war in Iraq, the economy were booming, Osama Bin Laden were behind bars and Bush supported a woman's right to choose, you would all still vote for Kerry.

What makes scientists so liberal? Why doesn't a field like computer science draw from a wide political spectrum? Does this liberal attitude stifle real political debate in scientific departments or even worse discourage those with more conservative views from entering our field?

Thursday, September 23, 2004

NP-Completeness is Illuminated

Another literary reference to a hard combinatorial problem. Jonathan Safran Foer describes plans for a wedding reception in his disjointed novel Everything Is Illuminated.
The hardwood floors were covered in white canvas, and tables were set in a line stretching from the master bedroom to the kitchen, each feathered with precisely positioned name cards, whose placement had been agonized over for weeks. (Avra cannot sit next to Zosha, but should be near Yoske and Libby, but not if it means seating Libby near Anshel, or Anshel near Avra, or Avra anywhere near the centerpieces, because he's terribly allergic and will die. And by all means keep the Uprighthers and Slouchers on opposite sides of the table.)

Tuesday, September 21, 2004

The Beauty of the Magic Number

Baseball has a large number of mathematical nuggets, but since my childhood I have always liked the simplicity of the magic number. In a division race, the magic number is the minimum combined total of wins by the first-place team and losses by the second-place team that guarantees the first-place team wins the division.

Let's do an example. As I write this the New York Yankees have 94 wins and the Boston Red Sox have 60 losses. The easiest way to compute the magic number comes from working backwards from the definition. There are 162 games in a season, so the Yankees' magic number is 162+1-(94+60) = 9. Any combination of nine Yankees wins and Red Sox losses and the Yankees win the American League East. The "+1" comes from the fact that in a tie the Yankees would still need to win a one-game playoff to win the division.

What can the magic number teach us about complexity? Consider the RIOT Baseball Project at Berkeley. Not satisfied with the magic number, the project computes the First Place Clinch Number as the "Number of additional games, if won, guarantees a first-place finish." To compute this number one has to look not only at the current standings but also at the schedule of remaining games between the teams.

My main issue with the clinch number relates to complexity. Not only is it more complicated to compute; updating it after a game sometimes requires recomputing it from scratch. The magic number has a simple update function, counting down like a rocket launch. The Yankees win, and the magic number drops by one. The Red Sox lose, and it drops by one. If the Yankees beat the Red Sox, both events happen, so it drops by two. And once the magic number hits zero you pop the champagne. That's the beauty of the magic number.
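The whole calculation fits in a few lines of Python (a toy sketch; the function and variable names are mine, not from any baseball source):

```python
def magic_number(games_in_season, leader_wins, runner_up_losses):
    """Minimum combined leader wins + runner-up losses that clinch the
    division; the +1 accounts for winning a one-game tiebreaker playoff."""
    return games_in_season + 1 - (leader_wins + runner_up_losses)

def update(magic, leader_won, runner_up_lost):
    """The countdown update: a leader win or a runner-up loss each drop
    the number by one; a head-to-head leader win counts as both."""
    return magic - int(leader_won) - int(runner_up_lost)

# The example from the post: 162-game season, Yankees at 94 wins,
# Red Sox at 60 losses.
m = magic_number(162, 94, 60)
print(m)                      # 9
print(update(m, True, True))  # 7: Yankees beat the Red Sox head-to-head
```

The constant-time `update` is the point of the contrast with the clinch number, which has no such simple incremental rule.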

Sunday, September 19, 2004

Quantum in the North; Random in the South

A couple of interesting workshops are happening this week, both studying information: not the usual classical notion but quantum and random information.

At Banff in the Canadian Rockies we have the BIRS Workshop on Quantum Computation and Information Theory. This workshop will examine the properties of quantum information and bring that study together with the work of people on quantum algorithms and complexity.

Meanwhile in Córdoba, Argentina, on the edge of the Sierra Chica mountain, is the Conference on Logic, Computability and Randomness 2004. Much of this conference focuses on random sets: not sets drawn from a probability distribution but single sets that "look" random. Rather surprisingly, whether we define random sets by passing statistical tests, foiling betting strategies or using Kolmogorov complexity, these notions often lead to equivalent definitions of randomness.

While circumstances keep me in Chicago I say to the participants of both meetings: Enjoy the mountains and save some theorems for me.

Friday, September 17, 2004

NSF News

Arden Bement, who has been serving as the acting director of the National Science Foundation since Rita Colwell stepped down in February, has been nominated for the permanent director position.

The US Senate continues to work on the appropriation bills that set funding levels for the next US fiscal year, which starts October 1. The CRA has a summary of the various bills affecting research funding in computer science (see also this note from AIP). Most worrisome (as I've mentioned before) is the funding in the VA-HUD bill that covers the NSF: the House committee would cut the budget by 2%. The CRA has an Advocacy Alert; there is still time to contact your senators.

Wednesday, September 15, 2004

Email's Curse of Success

Computer scientists have communicated via email since the mid-1980's. Back then email worked quite well: you would send a message and usually get a quick response. We avoided telephone tag, no longer trying to reach each other by phone when we needed a quick answer. Research ideas spread quickly; the world became smaller. Even within a department communication became paperless, as one can get a message out to everyone far more quickly electronically.

So what went wrong? Email remains the primary means of communication among computer scientists. But send a message today and I've learned to expect to wait, on average, a couple of days for a response, if I get one at all.

Spam is the obvious culprit. Spam clogs our inboxes, and even worse, many of us don't carefully go through our spam folders, so some legitimate mail goes unread. Spam has also made some computer scientists reluctant to share their email addresses online. But spam is not the only issue.

Email has become the communication of choice in the rest of the world as well. Besides messages from other scientists, I get email about my daughter's soccer team, announcements of upcoming concerts, warnings from the local police departments, a morning summary of the New York Times, financial information, utility bills and much more. All legitimate and usually useful email but it takes longer to work through it and slows down the time to respond to other scientists. Not to mention the many other web distractions such as news and weblogs (So stop reading this blog and respond to my emails. You know who you are.)

I can't rely on older technologies: since computer scientists expect email, they check their phone messages and postal mail even less often. I can't rely on newer technology: computer scientists are surprisingly slow to adopt new tools (like mail attachments), and it will be years before instant messaging becomes common in the scientific community.

Oddly enough, in our highly connected society it becomes harder and harder to get someone's attention. So what am I doing? Slowly collecting the cell phone numbers of other computer scientists. Want mine? Send me an email.

Tuesday, September 14, 2004

Is the AP Test to Blame for Shifting CS Enrollments?

It is no secret that undergraduate enrollments in computer science have been dropping over the past five years. Students follow the money, and with the perception of a weaker job market in computer-oriented careers, fewer students are willing to study computer science.

Why does computer science follow the job market so closely? We don't see such swings in physics or history, but they are common in engineering disciplines. Are undergraduates viewing computer science more as engineering than science? And why?

One theory I recently heard puts the blame on the Advanced Placement (AP) Computer Science Exam given to high school students. The reasoning goes as follows: The AP exam has a strong emphasis on the Java programming language and so high school teachers, teaching to the exam, focus most of their course on syntax and coding of Java. This gives the impression to students that computer science = programming.

I don't agree with this assessment. I looked at some sample CS AP tests. The tests, particularly the AB exam, require some knowledge of data structures and simple algorithms. Nothing deep, but enough that students should realize that computer science is more than just programming.

There was a surge of interest in computer science when I started college in the early 1980's (with the advent of personal computers) before an AP test in Computer Science even existed. Also I've heard of declines in enrollments outside the US where they don't use the AP tests.

But in the end we shouldn't be that worried about shifting enrollments. Advances in computer technology have helped drive computer science from a virtually non-existent discipline forty years ago to one that many universities now consider one of their most important departments. Better to have enrollments that swing up and down with the state of the computer industry than one that stagnates at the low end.

Sunday, September 12, 2004

Favorite Theorems: List Decoding

August Edition

In coding theory, one typically maps a string to a codeword such that, even with a small amount of error in the codeword, one can still recover the original string. What if the amount of error is too great to allow unique decoding? In the 1950s Peter Elias suggested the idea of list decoding: coming up with a short list of possibilities, one of which is correct.
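A toy way to see the distinction (this is just a repetition code for illustration, nothing like the Reed-Solomon algorithms discussed below): once the number of flipped bits reaches half the codeword length, majority vote fails, but we can still output a short list guaranteed to contain the right answer.

```python
def encode(bit, n=5):
    """Repetition code: send one bit n times."""
    return [bit] * n

def unique_decode(word):
    """Majority vote; fails (returns None) when errors reach half the length."""
    ones = sum(word)
    if 2 * ones == len(word):
        return None
    return 1 if 2 * ones > len(word) else 0

def list_decode(word, max_errors):
    """Return every message whose codeword lies within max_errors of word."""
    return [b for b in (0, 1)
            if sum(x != b for x in word) <= max_errors]

received = [1, 1, 0, 0]            # two of four symbols corrupted, either way
print(unique_decode(received))     # None: no unique answer exists
print(list_decode(received, 2))    # [0, 1]: a short list containing the truth
```

Sudan's achievement was doing this for a far richer code, with a polynomial-size list, in polynomial time.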

Madhu Sudan showed that list decoding can be achieved in scenarios where one cannot do unique decoding.

Decoding of Reed-Solomon Codes Beyond the Error-Correction Bound by Madhu Sudan

In this paper Sudan gives a polynomial-time list-decoding algorithm that handles more errors in the code than unique decoding can. Later Guruswami and Sudan gave a polynomial-time algorithm that handles what is believed to be the best possible amount of error.

List-decodable codes have had applications to many areas including pseudo-random generators, extractors, hard-core predicates, probabilistically-checkable proofs and a neat result by Sivakumar on the implications of SAT being membership-comparable.

We've seen many other important papers in coding theory from computer scientists over the last decade. Besides the work on list decoding I should also mention Spielman's breakthrough result showing linear time encodable and decodable codes building on the initial work of using expander graphs for codes developed by Sipser and Spielman.

For much more on coding see the surveys by Sudan on List Decoding and Guruswami on Error-correcting codes and Expander Graphs.

Thursday, September 09, 2004

Which Affiliation?

Dieter van Melkebeek and I had a mild disagreement over an issue in writing a paper and he suggested I discuss it on my weblog. So here goes.

We're working on a journal version of a paper where we did the research back when we were both in New Jersey (Dieter is now on the Wisconsin faculty). Dieter listed our current institutions on the paper. I think that we should list our institutional affiliations when we performed the research, or more precisely, the places paying our salaries at that time. I don't feel that strongly about the issue so I am letting Dieter have his way, especially since he did the vast majority of the writing. I did ask that we have footnotes explaining our affiliations at the time of the research. (As a side note, footnotes should also contain other support information such as grants, where one was physically located when the research was done if different, and contact information, though these days one need only know how to spell my name correctly when using Google.)

So I'll open this up to my readers. Which affiliation should one use? Or are we just arguing over an issue that nobody really cares about?