Sunday, October 31, 2004

Election Day

A true story on Election Day 2000: An Israeli postdoc in the US came to work and said "I have watched the conventions and seen the debates. I have studied the platforms and as much news analysis as I could get hold of. After serious consideration I decided that, if I were allowed to vote, I would vote for Bush." An American computer scientist walked in soon thereafter and said "I woke up this morning and decided to vote for Nader." Draw your own moral.

With the US elections on Tuesday and politics on everyone's mind, let's open the comments for anyone who has anything they want to say about the presidential race. Get it off your chest. I only ask you to be civil. And don't forget to vote.

Friday, October 29, 2004

The Co-Author Conundrum

In theoretical computer science we traditionally list the co-authors of our papers alphabetically. Done this way for "fairness" it leads to a binary notion of author. Either you are an equal author of a paper or you are off the paper. There is no middle ground.

In our publish or perish society, authoring papers helps you succeed, in getting hired, promoted and receiving grants and awards. So choosing who is an author of a paper, particularly important papers, can be an important and sometimes messy decision complicated by the fact that the authors have to do the choosing.

An author should have made significant contributions to a paper. But how do we define significant? A person who produces key ideas in the proof of a main result certainly becomes an author. A person who simply writes up the proof should not be. But what about the person who works out the messy but straightforward details in a proof? What about the person who poses the questions but has no role in the proof? Tricky situations that one needs to handle on a case-by-case basis.

An advisor should hold him or herself to a higher standard. A good advisor guides the research for a student and should not become a co-author unless the advisor had made the majority of the important ideas in the proofs. Likewise we hold students to a slightly lower standard to get them involved in research and exposition of their work.

Computer scientists tend to add co-authors generously. While seemingly nice, this makes it difficult to judge the role authors have played in a paper, and sometimes makes who you know or where you are more important than what you know.

Tuesday, October 26, 2004

Nothing Like a Deadline to Ruin Your Day

Some upcoming deadlines: STOC (11/4), Complexity (11/18), Electronic Commerce (12/7), the new NSF program Theoretical Foundations (1/5), and ICALP (2/13). Feel free to comment if I've missed something.

Since computer science takes its conferences more seriously than the journals and most conferences have hard deadlines, we have become a deadline-driven field. Most authors submit their papers on the deadline day, if not in the last hour or two. Papers get written to meet a deadline which has both good and bad aspects: good in that papers get written and bad in that they get written quickly and often incompletely.

Sometimes conference organizers see a lack of submissions (forgeting that most papers come on deadline day) and extend the deadline by a few days or a week. I've often heard people complain about losing their weekends if a deadline moves from Friday to Monday. Why? You could still submit on Friday. People feel their papers are never complete and they need to keep fixing it up until the last possible second even though these last minute changes will not likely affect acceptance.

One person I knew once turned down an opportunity to attend a workshop because of a grant deadline on the same week. This was six months beforehand. A little planning is all that's needed to submit the grant one week early but some in our field cannot pull this off even months ahead of time.

Remember that deadlines are upper bounds, no shame in submitting early. And it's not the end of the world if you miss a deadline; there's always another conference with another deadline right around the corner.

Monday, October 25, 2004

What Would the Martians Think?

In Bill Gasarch's post last week, he discusses what makes a problem natural. You used to hear the argument that a complexity class was natural if it contained an interesting problem not known to be contained in any smaller class. But then would SPP be natural simply because it contains graph isomorphism? On the other hand I find BPP a natural way to define probabilistic computation even though it fails this test. Does a class go from natural to unnatural if a new algorithm for a problem is found?

I prefer to use the Martian rule. Suppose we find a Martian civilization at about the same level of scientific progress that we have. If they have a concept the same or similar to one of ours than that concept would be natural, having developed through multiple independent sources.

Of course we don't have a Martian civilization to compare ourselves with so we have to use our imagination. I firmly believe the Martians would have developed the P versus NP question (or something very similar, assuming they haven't already solved it) making the question very natural. On the other hand I suspect the Martians might not have developed a class like LWPP. Other classes like UP are less clear, I guess it depends whether the Martians like their solutions unique.

Applying the Martian rule to Gasarch's post, WS1S is probably more natural than regular languages without squaring that equal Σ*. At least my Martians would say that.

Saturday, October 23, 2004

Favorite Theorems: Learning Circuits

September Edition

Computation Complexity and Computational Learning share many aspects and goals. We both analyze and compare different models of computation either to understand what they can compute or how to find the appropriate model to fit some data. No surprise that many of the tools developed in one area have use in the other. This month's paper exemplifies the connection; it uses tools in complexity to get a nice learning theory result which in turn has very nice implications in complexity theory.

Oracles and Queries that are Sufficient for Exact Learning
Nader Bshouty, Richard Cleve, Ricard Gavaldà, Sampath Kannan and Christino Tamon

The main result shows how to learn circuits probabilistically with equivalence queries and an NP oracle. An equivalence query given a circuit C trying to learn a language L, either says C is correct or gives an input where it fails. The proof uses a clever divide and conquer argument built upon Jerrum, Valiant and Vazirani's technique for random sampling with an NP oracle.

Kobler and Watanabe observe that if SAT has small circuits we can answer equivalence queries to SAT with an NP oracle. Applying Bshouty et. al. they show that if SAT has polynomial-size circuits than the polynomial-time hierarchy collapses to ZPPNP.

Sengupta noticed that old results give a consequence of PH in S2p if SAT has small circuits. This strengthens Kobler and Watanabe because of Jin-Yi Cai's result showing that S2p is contained in ZPPNP. Cai's paper uses many techniques similar to Bshouty et. al. Shaltiel and Umans have a recent result giving weaker assumptions for derandomizing S2p by derandomizing random sampling.

If SAT does not small circuits, the Bshouty et. al. algorithm produces a small set of inputs that give a co-NP proof of this fact. Pavan, Sengupta and myself applied this fact to the two queries problem.

Thursday, October 21, 2004

Go Sox!

Saturday Evening, October 25, 1986: I huddled with about a dozen of my fellow MIT graduate students (and a couple of faculty) watching game six of the baseball World Series in a Toronto hotel room right before the start of FOCS. The Boston Red Sox led by two runs with two out and none on in the bottom of the tenth against the New York Mets. One more out and the Sox would win their first championship since 1986.

The Red Sox didn't win the series that year and failed to return until this year. After an amazing comeback against their rivals, the New York Yankees, the Red Sox will host the first game of the World Series on Saturday against the St. Louis Cardinals.

By far baseball is the favorite team sport among American computer scientists (at least of those that care about sports at all). Why? Mabye because it's a discrete game with a small state space. At Fenway Park (Boston's home field) they use lights to give the number of ball, strikes and outs in unary notation. The game has many nice mathematical properties and not just the myriad of statistics. For example, it is a theorem of baseball that at any point in a half inning the number of batters is equal to the sum of the number of outs, the number of runs scored and the number of men on base. Proof by induction.

The real reasons I love baseball are less tangible. Both a team sport and a one-on-one contest between pitcher and batter. A strategic game dealing with balancing probabilities. Suspense on every pitch. And much more.

By far the plurality of baseball fans in our field seem to root for the Red Sox. Probably because most of us spent at least part of our academic career in the Boston area and Boston takes its baseball far more seriously than any other city. In full disclosure, my favorite team is the Chicago White Sox but I root for the Red Sox in their absence.

Nothing beats attending baseball game live, especially in Fenway. Alas I never managed to attend a world series game though I've come very close.

October 14, 1992: The Pittsburgh Pirates won the National League East and the World Series was scheduled to open during FOCS in Pittsburgh. I wrote for and got tickets to the first game if Pittsburgh made the series. In the NLCS Atlanta scored three runs in the bottom of the ninth of game 7, meaning Atlanta and not Pittsburg would host the series. When Cabrera hit the single scoring those final two runs, I sat staring at the TV and cried.

Wednesday, October 20, 2004

Conversations with Bill

A guest post from William Gasarch

Why is it hard for us to explain to the layperson what we do? The following true story is telling. I will label the characters MOM and BILL.

MOM: What kind of thing do you work on?

BILL: (thinking: got to give an easy example)
Lets say you have n, er 1000 numbers. You want to find the — (cut off)

MOM: Where did they come from?

BILL: Oh. Lets say you have 50 numbers, the populations of all of the states of America, and you want to find — (cut off)

MOM: Did you include the District of Columbia?

BILL: No.

MOM: Why not?

BILL: Because its not a state. But for the example it doesn't matter because — (cut off)

MOM: But they should be a state. They have been oppressed to long and if they had their own state then — (cut off)

BILL: But none of that is relevant to the problem of finding the Max of a set of numbers.

MOM: But the problem of Statehood for DC is a more important problem.

BILL: Okay, lets say you have 51 numbers, the populations of the 50 states and the District of Columbia.

MOM: What about Guam?

BILL: I should have majored in Government and Politics…

To the person on the street the very definition of a problem is … problematic. Abstraction that we do without blinking an eye requires a conceptual leap that is hard or at least unfamiliar to most people.

Even people IN computer science may have a hard time understand what we are talking about. Here is another real life story between two characters who I will call BILL and DARLING. DARLING has a Masters Degree in computer Science with an emphasis on Software Engineering.

DARLING: Bill, can you give me an example of a set that is provably NOT in P.

BILL: Well, I could say HALT but you want a computable set, right?

DARLING: Right.

BILL: And I could say that I could construct such sets by diagonalization, but you want a natural set, right?

DARLING: Right.

BILL: And I could say that the set of true statements in the language WS1S, but you want a natural set.

DARLING: What is WS1S?

BILL: Weak Monadic Second order with one successor, but I think you agree that if you don't know what it is then it's not natural.

DARLING: Right. So, is there some set that is natural and decidable that is provably not in P?

BILL: AH, yes, the set of regular expressions with squaring that equal Σ* is EXPSPACE complete and hence provably not in P.

DARLING: Why is that problem natural?

BILL: Good Question! A problem is natural if it was being worked on before someone classified it.

DARLING: Okay. What is the first paper this problem appeared in?

BILL: It was in THE EQUIVALENCE PROBLEM FOR REGULAR EXPRESSIONS WITH SQUARING REQUIRES EXPONENTIAL SPACE by Meyer and Stockmeyer, From FOCS 1972. Oh. I guess that proves that its NOT natural.

This story raise the question—what is natural? Do we need that someone worked on a problem beforehand to make it natural? Is it good enough that they should have worked on it? Or that they could have? Logic has the same situation with the Paris-Harrington results of a result from Ramsey Theory that is not in Peano Arithmetic, but the first time it was proven was in the same paper that proved it was not provable in PA.

Incidentally, there are more natural problems that are not in P. Some games on n by n boards are EXPSPACE or EXPTIME complete and hence not in P. Would have been a better answer, though it would not have made as good a story.

Tuesday, October 19, 2004

More FOCS News

Fair and balanced coverage from Adam Klivans

To answer one of Lance's previous posts, the Internet is definitely harming conferences: most everyone who stayed up until 5 AM to watch the Red Sox beat the Yankees in 14 innings on MLB.TV has not made it to James Lee's talk at 8:30 AM on embeddings (in fact I think I'm the only one who did make it here). Krauthgamer, Lee, Mendel, and Naor presented a powerful new method for constructing embeddings of finite metric spaces called measured descent which, among other things, implies optimal volume-respecting embeddings (in the sense of Feige).

I checked the registration numbers and indeed only 172 people have officially registered for the conference—that's 100 less than the registration at STOC in Chicago.

Yesterday I mentioned the winners of the best paper award. I should also mention the best student paper award winners: Lap Chi Lau's An Approximate Max-Steiner-Tree-Packing Min-Steiner-Cut Theorem shared the prize with Marcin Mucha and Piotr Sankowski's Maximum Matchings via Gaussian Elimination. Lau's paper gives the first constant factor approximation to the problem of finding a large collection of edge-disjoint trees that connect an undirected multigraph. Mucha and Sankowski give a nice method for finding maximum matchings in general graphs in time O(nω) where ω is the matrix multiplication exponent. Lovász showed how to test for a matching in a graph using matrix multiplication, and Mucha and Sankowski extend this and actually recover the matching.

Valiant's talk on Holographic Algorithms was well attended: he described a new, quantum-inspired method for constructing polynomial-time algorithms for certain counting problems. The algorithms are classical (no quantum required) and give the first efficient solutions for problems such as PL-Node-Bipartition: find the cardinality of a smallest subset of vertices V' of a max-degree 3, planar graph such that deletion of V' (and its incident edges) causes the graph to become bipartite. At the end he gave a simple criterion for proving P = P#P via these techniques!

Monday, October 18, 2004

FOCS News

Adam Klivans reports from Rome.

Rome is the host city for this year's FOCS conference. While everyone enjoys visiting one of the world's great capitals, attendance at the sessions can occasionally suffer, and the sessions this year do seem noticeably smaller. Another explanation could be the the high cost of traveling to and staying in Rome. On the plus side, I get to see many European theorists of whom I had known in name only.

For those who did make the trek to the southern tip of the Villa Borghese, the first day featured the presentation of the two results which won best paper: Subhash Khot's Hardness for Approximating the Shortest Vector in Lattices and Applebaum, Ishai, and Kushilevitz's Cryptography in NC0. Subhash was an author of two other impressive hardness results in the same session: Ruling Out PTAS for Graph Min-Bisection, Densest Subgraph and Bipartite Clique (the title is self-explanatory) and Optimal Inapproximability Results for Max-Cut and Other 2-variable CSPs? (with Kindler, Mossel, and O'Donnell) which gives evidence that the Max-Cut approximation algorithm of Goemans-Williamson is the best possible.

The cryptography session featured the above Cryptography in NC0 paper which Lance mentioned in an earlier post as well as an intriguing result due to Salil Vadhan, An Unconditional Study of Computational Zero Knowledge showing how to establish important properties of computational zero knowledge proofs without assuming the existence of a one-way function.

The controversial topic of what to do with the special issue of FOCS continued at last night's business meeting. It appears as though Elsevier will lose another opportunity to publish a special issue of STOC/FOCS, as a vote last night indicated a strong desire to give SICOMP the responsibility instead (a similar thing occurred at STOC this year).

Saturday, October 16, 2004

Some Dagstuhl Presentations

At Dagstuhl Manindra Agrawal presented recent work of his students Neeraj Kayal and Nitin Saxena (the trio that showed a polynomial-time algorithm for primality testing) on rings given by a matrix describing the actions on the base elements. They show a randomized reduction from graph isomorphism to ring isomorphism and from factoring to #RI, counting the number of ring isomorphisms. They also show a polynomial-time algorithm for determining if there are any non-trivial automorphisms of a ring and that #RI is computable with an oracle for AM∩co-AM. Agrawal conjectured that #RI is computable in polynomial time, a conjecture that would imply factoring and graph isomorphism have efficient algorithms.

We also saw a series of presentations by Andris Ambainis, Robert Špalek and Mario Szegedy. Ambainis described his improved method for showing lower bounds for quantum algorithms that provably beats the degree method. Špalek talked about his work with Szegedy showing that Ambainis techniques as well as different tools developed by Zhang, Laplante and Magniez, and Barnum, Saks and Szegedy all gave the same bounds. Szegedy, in his presentation, called this measure the div complexity and showed that the size of a Boolean formula computing a function f is at least the square of the div complexity of f.

Wednesday, October 13, 2004

Is the Internet Harming Dagstuhl?

Dagstuhl was designed as a place to bring a small group of researchers to an isolated environment where they could give some talks, discuss research and otherwise socialize among themselves free from other distractions. No televisions though a radio bought to hear news during the 1991 Gulf War. We could get two-day old news from America via the Herald Tribune. While they had computer rooms, in the early days we had no world wide web and email was far less used. Instead we had rooms for coffee, rooms for beer and wine, rooms for billiards and music and rooms just to hang out. Everyone stayed on premises and we had no phones in rooms, just a couple communal phones to call home.

Although Dagstuhl has expanded, rooms not only have phones but WiFi throughout. We can answer email, read news, write weblog posts (as I am doing now) from the comfort of our own isolated desks. We're watching baseball games and the debate over the internet. But worse than being connected, the rest of the world knows we're connected. I find myself having to take time to fix problem sets for my class and deal with departmental issues as do many of my other colleagues here.

The internet has greatly helped science by bringing us closer together but also prevents us from being disconnected losing many of the advantages of these workshops. A sign here proclaims "Are you here for computer networking or human networking?" Something to remember next time you go to a conference.

Tuesday, October 12, 2004

Back in Dagstuhl

I'm back in Dagstuhl for the workshop on Algebraic Methods in Computational Complexity. The roof looks great. I have attended several Dagstuhl workshops for over twelve years now since the workshop on Structure and Complexity Theory in 1992. I have seen Dagstuhl expand and evolve over these years and this is the first time I feel that Dagstuhl has achieved its completed state. I love coming here; Dagstuhl has a contained environment in a pretty but boring part of Germany where we complexity theorists give seminars, eat and drink together and talk science and other stuff. Politics and baseball seem to dominate the discussions this week.

A group of German software engineering professors share Dagstuhl with us this week. They are meeting to discuss future directions of German software engineering research and to find ways to increase student enrollment. The drop in students desiring a computer science degree is not just an American phenomenon.

Monday, October 11, 2004

A Celebration of Women in Computing

A report from Varsha Dani.

The Grace Hopper Celebration of Women in Computing was held in Chicago on October 6-9. This was the fifth such conference, since its inception in 1994. This year there were over 800 attendees from all over the country.

This conference is a forum for discussion of issues faced by women and a showcase for achievements of women in the fields of computing and technology. There were a number of talks on social issues, some technical presentations by young investigators, and a few invited technical talks. There were also a number of social and networking events hosted by IBM, Microsoft, Google and others.

Among the invited talks, there were three I particularly enjoyed.

Jessica Hodgins of CMU talked about connections between ideas in robotics and computer graphics and animation, especially simulation of human movements.

Cynthia Dwork of Microsoft Research spoke about the problem of publishing (transformed) data from public databases (such as census data) so as to maintain a balance between the utility of the published information and the protection of the privacy of individuals represented by the data. Her approach to privacy is influenced by ideas from cryptography.

Daniela Rus of MIT spoke about self-reconfiguring robots. These robots are distributed systems, consisting of a number of identical modules which can dynamically adapt the way that they are connected to each other to best fit the task at hand.

Thursday, October 07, 2004

NP-Completeness for a 9-Year Old

My nine-year old daughter had a homework problem with the following diagram.

She had no problem solving questions like: Beginning and ending at the entrance, describe the shortest route you can take that will allow you to see 4 different kinds of animals.

"You're doing computer science," I said.

"I don't see any computers," she responded.

"Computer science is problem solving like finding the shortest path."

"Then computer science is pretty easy."

"OK, is there a way to visit every animal exactly once?"

She found this question much more challenging.

Groups versus Departments

In the US the terms Assistant Professor, Associate Professor and Professor represent different stages in one's career but they all play a similar role in research and advising students. An assistant professor is nobody's assistant.

The names get their meaning from a structure you still see in many other countries (Germany is a good example). Here you have research groups, where a lead professor has nearly complete control of hiring and the budget. The equivalents of assistant and associate reflect the temporary and permanent faculty members of those groups.

How does this affect graduate studies? In Germany a grad student joins a group and works within that group. In the United States a student joins a department usually without a specific advisor in mind and often not initially committing to a specific subfield of computer science.

So to those who send me and other American computer scientists requests to join our groups, the US system doesn't work that way. Instead go to the departmental web page and follow the appropriate links to find information on how to apply to that department. If you have a specific researcher that you want to work with, use the personal statement to say this and your reasons for it.

Trust me, we read the applications carefully and choose Ph.D. candidates as best as we can. It just doesn't help to send personal requests, I just point to our web page and trash the email.

Wednesday, October 06, 2004

Holy Trefoils, Math Fans!

Can one use a comic book and a toy to teach a complicated subfield of mathematics? Why knot?

Tuesday, October 05, 2004

Larry Stockmeyer Commemorations

Ron Fagin asked me to announce two public commemorations of Larry Stockmeyer and his work.

The first will be held at the IBM Almaden Research Center on Monday, October 25, 2004.

The second will be held in conjunction with STOC '05 in Baltimore on May 21-22, 2005.

Please join the community in honoring the memory of one of the great complexity theorists.

Sunday, October 03, 2004

Are we too nice?

Steve Smale talked about his experiences in the Economics Theory Workshop at Chicago, particularly the aggressive questioning. I didn't attend his talk but I did go to a few of the econ theory seminars years ago and it forms an interesting contrast to the CS theory talks which have few usually technical questions followed by polite applause.

The econ theory seminar took place in a medium-size conference room with a long table. Graduate students sat in chairs along the walls. The speaker was at one end of the table and econ professors, usually including multiple Nobel prize winners, around the rest of the table. A copy of the paper was sent with the talk announcement and almost from the first slide the faculty aggressively attacked the speaker with questions about the validity of the model or the importance of the results. (Remember this was the theory seminar, imagine the questions at the political economics seminar). At the end of the seminar time, the talk ended and everyone left the room. No applause.

I don't recommend that we follow this model in theoretical computer science. However we usually go to the other extreme and (outside of crypto) rarely ask negative questions in a seminar. Typically the only negative feedback we get in our field is from anonymous referees and reviewers. If we were forced to defend our research in an interactive setting, we would establish a better understanding of the importance of the models and results of our own research.