With the US elections on Tuesday and politics on everyone's mind, let's open the comments for anyone who has anything they want to say about the presidential race. Get it off your chest. I only ask you to be civil. And don't forget to vote.
Sunday, October 31, 2004
Friday, October 29, 2004
In our publish or perish society, authoring papers helps you succeed, in getting hired, promoted and receiving grants and awards. So choosing who is an author of a paper, particularly important papers, can be an important and sometimes messy decision complicated by the fact that the authors have to do the choosing.
An author should have made significant contributions to a paper. But how do we define significant? A person who produces key ideas in the proof of a main result certainly becomes an author. A person who simply writes up the proof should not be. But what about the person who works out the messy but straightforward details in a proof? What about the person who poses the questions but has no role in the proof? Tricky situations that one needs to handle on a case-by-case basis.
An advisor should hold him or herself to a higher standard. A good advisor guides the research for a student and should not become a co-author unless the advisor had made the majority of the important ideas in the proofs. Likewise we hold students to a slightly lower standard to get them involved in research and exposition of their work.
Computer scientists tend to add co-authors generously. While seemingly nice, this makes it difficult to judge the role authors have played in a paper, and sometimes makes who you know or where you are more important than what you know.
Tuesday, October 26, 2004
Since computer science takes its conferences more seriously than the journals and most conferences have hard deadlines, we have become a deadline-driven field. Most authors submit their papers on the deadline day, if not in the last hour or two. Papers get written to meet a deadline which has both good and bad aspects: good in that papers get written and bad in that they get written quickly and often incompletely.
Sometimes conference organizers see a lack of submissions (forgeting that most papers come on deadline day) and extend the deadline by a few days or a week. I've often heard people complain about losing their weekends if a deadline moves from Friday to Monday. Why? You could still submit on Friday. People feel their papers are never complete and they need to keep fixing it up until the last possible second even though these last minute changes will not likely affect acceptance.
One person I knew once turned down an opportunity to attend a workshop because of a grant deadline on the same week. This was six months beforehand. A little planning is all that's needed to submit the grant one week early but some in our field cannot pull this off even months ahead of time.
Remember that deadlines are upper bounds, no shame in submitting early. And it's not the end of the world if you miss a deadline; there's always another conference with another deadline right around the corner.
Monday, October 25, 2004
I prefer to use the Martian rule. Suppose we find a Martian civilization at about the same level of scientific progress that we have. If they have a concept the same or similar to one of ours than that concept would be natural, having developed through multiple independent sources.
Of course we don't have a Martian civilization to compare ourselves with so we have to use our imagination. I firmly believe the Martians would have developed the P versus NP question (or something very similar, assuming they haven't already solved it) making the question very natural. On the other hand I suspect the Martians might not have developed a class like LWPP. Other classes like UP are less clear, I guess it depends whether the Martians like their solutions unique.
Applying the Martian rule to Gasarch's post, WS1S is probably more natural than regular languages without squaring that equal Σ*. At least my Martians would say that.
Saturday, October 23, 2004
Computation Complexity and Computational Learning share many aspects and goals. We both analyze and compare different models of computation either to understand what they can compute or how to find the appropriate model to fit some data. No surprise that many of the tools developed in one area have use in the other. This month's paper exemplifies the connection; it uses tools in complexity to get a nice learning theory result which in turn has very nice implications in complexity theory.
Nader Bshouty, Richard Cleve, Ricard Gavaldà, Sampath Kannan and Christino Tamon
The main result shows how to learn circuits probabilistically with equivalence queries and an NP oracle. An equivalence query given a circuit C trying to learn a language L, either says C is correct or gives an input where it fails. The proof uses a clever divide and conquer argument built upon Jerrum, Valiant and Vazirani's technique for random sampling with an NP oracle.
Kobler and Watanabe observe that if SAT has small circuits we can answer equivalence queries to SAT with an NP oracle. Applying Bshouty et. al. they show that if SAT has polynomial-size circuits than the polynomial-time hierarchy collapses to ZPPNP.
Sengupta noticed that old results give a consequence of PH in S2p if SAT has small circuits. This strengthens Kobler and Watanabe because of Jin-Yi Cai's result showing that S2p is contained in ZPPNP. Cai's paper uses many techniques similar to Bshouty et. al. Shaltiel and Umans have a recent result giving weaker assumptions for derandomizing S2p by derandomizing random sampling.
If SAT does not small circuits, the Bshouty et. al. algorithm produces a small set of inputs that give a co-NP proof of this fact. Pavan, Sengupta and myself applied this fact to the two queries problem.
Thursday, October 21, 2004
The Red Sox didn't win the series that year and failed to return until this year. After an amazing comeback against their rivals, the New York Yankees, the Red Sox will host the first game of the World Series on Saturday against the St. Louis Cardinals.
By far baseball is the favorite team sport among American computer scientists (at least of those that care about sports at all). Why? Mabye because it's a discrete game with a small state space. At Fenway Park (Boston's home field) they use lights to give the number of ball, strikes and outs in unary notation. The game has many nice mathematical properties and not just the myriad of statistics. For example, it is a theorem of baseball that at any point in a half inning the number of batters is equal to the sum of the number of outs, the number of runs scored and the number of men on base. Proof by induction.
The real reasons I love baseball are less tangible. Both a team sport and a one-on-one contest between pitcher and batter. A strategic game dealing with balancing probabilities. Suspense on every pitch. And much more.
By far the plurality of baseball fans in our field seem to root for the Red Sox. Probably because most of us spent at least part of our academic career in the Boston area and Boston takes its baseball far more seriously than any other city. In full disclosure, my favorite team is the Chicago White Sox but I root for the Red Sox in their absence.
Nothing beats attending baseball game live, especially in Fenway. Alas I never managed to attend a world series game though I've come very close.
October 14, 1992: The Pittsburgh Pirates won the National League East and the World Series was scheduled to open during FOCS in Pittsburgh. I wrote for and got tickets to the first game if Pittsburgh made the series. In the NLCS Atlanta scored three runs in the bottom of the ninth of game 7, meaning Atlanta and not Pittsburg would host the series. When Cabrera hit the single scoring those final two runs, I sat staring at the TV and cried.
Wednesday, October 20, 2004
Why is it hard for us to explain to the layperson what we do? The following true story is telling. I will label the characters MOM and BILL.
MOM: What kind of thing do you work on?
BILL: (thinking: got to give an easy example)
Lets say you have n, er 1000 numbers. You want to find the — (cut off)
MOM: Where did they come from?
BILL: Oh. Lets say you have 50 numbers, the populations of all of the states of America, and you want to find — (cut off)
MOM: Did you include the District of Columbia?
MOM: Why not?
BILL: Because its not a state. But for the example it doesn't matter because — (cut off)
MOM: But they should be a state. They have been oppressed to long and if they had their own state then — (cut off)
BILL: But none of that is relevant to the problem of finding the Max of a set of numbers.
MOM: But the problem of Statehood for DC is a more important problem.
BILL: Okay, lets say you have 51 numbers, the populations of the 50 states and the District of Columbia.
MOM: What about Guam?
BILL: I should have majored in Government and Politics…
To the person on the street the very definition of a problem is … problematic. Abstraction that we do without blinking an eye requires a conceptual leap that is hard or at least unfamiliar to most people.
Even people IN computer science may have a hard time understand what we are talking about. Here is another real life story between two characters who I will call BILL and DARLING. DARLING has a Masters Degree in computer Science with an emphasis on Software Engineering.
DARLING: Bill, can you give me an example of a set that is provably NOT in P.
BILL: Well, I could say HALT but you want a computable set, right?
BILL: And I could say that I could construct such sets by diagonalization, but you want a natural set, right?
BILL: And I could say that the set of true statements in the language WS1S, but you want a natural set.
DARLING: What is WS1S?
BILL: Weak Monadic Second order with one successor, but I think you agree that if you don't know what it is then it's not natural.
DARLING: Right. So, is there some set that is natural and decidable that is provably not in P?
BILL: AH, yes, the set of regular expressions with squaring that equal Σ* is EXPSPACE complete and hence provably not in P.
DARLING: Why is that problem natural?
BILL: Good Question! A problem is natural if it was being worked on before someone classified it.
DARLING: Okay. What is the first paper this problem appeared in?
BILL: It was in THE EQUIVALENCE PROBLEM FOR REGULAR EXPRESSIONS WITH SQUARING REQUIRES EXPONENTIAL SPACE by Meyer and Stockmeyer, From FOCS 1972. Oh. I guess that proves that its NOT natural.
This story raise the question—what is natural? Do we need that someone worked on a problem beforehand to make it natural? Is it good enough that they should have worked on it? Or that they could have? Logic has the same situation with the Paris-Harrington results of a result from Ramsey Theory that is not in Peano Arithmetic, but the first time it was proven was in the same paper that proved it was not provable in PA.
Incidentally, there are more natural problems that are not in P. Some games on n by n boards are EXPSPACE or EXPTIME complete and hence not in P. Would have been a better answer, though it would not have made as good a story.
Tuesday, October 19, 2004
To answer one of Lance's previous posts, the Internet is definitely harming conferences: most everyone who stayed up until 5 AM to watch the Red Sox beat the Yankees in 14 innings on MLB.TV has not made it to James Lee's talk at 8:30 AM on embeddings (in fact I think I'm the only one who did make it here). Krauthgamer, Lee, Mendel, and Naor presented a powerful new method for constructing embeddings of finite metric spaces called measured descent which, among other things, implies optimal volume-respecting embeddings (in the sense of Feige).
I checked the registration numbers and indeed only 172 people have officially registered for the conference—that's 100 less than the registration at STOC in Chicago.
Yesterday I mentioned the winners of the best paper award. I should also mention the best student paper award winners: Lap Chi Lau's An Approximate Max-Steiner-Tree-Packing Min-Steiner-Cut Theorem shared the prize with Marcin Mucha and Piotr Sankowski's Maximum Matchings via Gaussian Elimination. Lau's paper gives the first constant factor approximation to the problem of finding a large collection of edge-disjoint trees that connect an undirected multigraph. Mucha and Sankowski give a nice method for finding maximum matchings in general graphs in time O(nω) where ω is the matrix multiplication exponent. Lovász showed how to test for a matching in a graph using matrix multiplication, and Mucha and Sankowski extend this and actually recover the matching.
Valiant's talk on Holographic Algorithms was well attended: he described a new, quantum-inspired method for constructing polynomial-time algorithms for certain counting problems. The algorithms are classical (no quantum required) and give the first efficient solutions for problems such as PL-Node-Bipartition: find the cardinality of a smallest subset of vertices V' of a max-degree 3, planar graph such that deletion of V' (and its incident edges) causes the graph to become bipartite. At the end he gave a simple criterion for proving P = P#P via these techniques!
Monday, October 18, 2004
Rome is the host city for this year's FOCS conference. While everyone enjoys visiting one of the world's great capitals, attendance at the sessions can occasionally suffer, and the sessions this year do seem noticeably smaller. Another explanation could be the the high cost of traveling to and staying in Rome. On the plus side, I get to see many European theorists of whom I had known in name only.
For those who did make the trek to the southern tip of the Villa Borghese, the first day featured the presentation of the two results which won best paper: Subhash Khot's Hardness for Approximating the Shortest Vector in Lattices and Applebaum, Ishai, and Kushilevitz's Cryptography in NC0. Subhash was an author of two other impressive hardness results in the same session: Ruling Out PTAS for Graph Min-Bisection, Densest Subgraph and Bipartite Clique (the title is self-explanatory) and Optimal Inapproximability Results for Max-Cut and Other 2-variable CSPs? (with Kindler, Mossel, and O'Donnell) which gives evidence that the Max-Cut approximation algorithm of Goemans-Williamson is the best possible.
The cryptography session featured the above Cryptography in NC0 paper which Lance mentioned in an earlier post as well as an intriguing result due to Salil Vadhan, An Unconditional Study of Computational Zero Knowledge showing how to establish important properties of computational zero knowledge proofs without assuming the existence of a one-way function.
The controversial topic of what to do with the special issue of FOCS continued at last night's business meeting. It appears as though Elsevier will lose another opportunity to publish a special issue of STOC/FOCS, as a vote last night indicated a strong desire to give SICOMP the responsibility instead (a similar thing occurred at STOC this year).
Saturday, October 16, 2004
We also saw a series of presentations by Andris Ambainis, Robert Špalek and Mario Szegedy. Ambainis described his improved method for showing lower bounds for quantum algorithms that provably beats the degree method. Špalek talked about his work with Szegedy showing that Ambainis techniques as well as different tools developed by Zhang, Laplante and Magniez, and Barnum, Saks and Szegedy all gave the same bounds. Szegedy, in his presentation, called this measure the div complexity and showed that the size of a Boolean formula computing a function f is at least the square of the div complexity of f.
Wednesday, October 13, 2004
Although Dagstuhl has expanded, rooms not only have phones but WiFi throughout. We can answer email, read news, write weblog posts (as I am doing now) from the comfort of our own isolated desks. We're watching baseball games and the debate over the internet. But worse than being connected, the rest of the world knows we're connected. I find myself having to take time to fix problem sets for my class and deal with departmental issues as do many of my other colleagues here.
The internet has greatly helped science by bringing us closer together but also prevents us from being disconnected losing many of the advantages of these workshops. A sign here proclaims "Are you here for computer networking or human networking?" Something to remember next time you go to a conference.
Tuesday, October 12, 2004
A group of German software engineering professors share Dagstuhl with us this week. They are meeting to discuss future directions of German software engineering research and to find ways to increase student enrollment. The drop in students desiring a computer science degree is not just an American phenomenon.
Monday, October 11, 2004
The Grace Hopper Celebration of Women in Computing was held in Chicago on October 6-9. This was the fifth such conference, since its inception in 1994. This year there were over 800 attendees from all over the country.
This conference is a forum for discussion of issues faced by women and a showcase for achievements of women in the fields of computing and technology. There were a number of talks on social issues, some technical presentations by young investigators, and a few invited technical talks. There were also a number of social and networking events hosted by IBM, Microsoft, Google and others.
Among the invited talks, there were three I particularly enjoyed.
Jessica Hodgins of CMU talked about connections between ideas in robotics and computer graphics and animation, especially simulation of human movements.
Cynthia Dwork of Microsoft Research spoke about the problem of publishing (transformed) data from public databases (such as census data) so as to maintain a balance between the utility of the published information and the protection of the privacy of individuals represented by the data. Her approach to privacy is influenced by ideas from cryptography.
Daniela Rus of MIT spoke about self-reconfiguring robots. These robots are distributed systems, consisting of a number of identical modules which can dynamically adapt the way that they are connected to each other to best fit the task at hand.
Thursday, October 07, 2004
She had no problem solving questions like: Beginning and ending at the entrance, describe the shortest route you can take that will allow you to see 4 different kinds of animals.
"You're doing computer science," I said.
"I don't see any computers," she responded.
"Computer science is problem solving like finding the shortest path."
"Then computer science is pretty easy."
"OK, is there a way to visit every animal exactly once?"
She found this question much more challenging.
The names get their meaning from a structure you still see in many other countries (Germany is a good example). Here you have research groups, where a lead professor has nearly complete control of hiring and the budget. The equivalents of assistant and associate reflect the temporary and permanent faculty members of those groups.
How does this affect graduate studies? In Germany a grad student joins a group and works within that group. In the United States a student joins a department usually without a specific advisor in mind and often not initially committing to a specific subfield of computer science.
So to those who send me and other American computer scientists requests to join our groups, the US system doesn't work that way. Instead go to the departmental web page and follow the appropriate links to find information on how to apply to that department. If you have a specific researcher that you want to work with, use the personal statement to say this and your reasons for it.
Trust me, we read the applications carefully and choose Ph.D. candidates as best as we can. It just doesn't help to send personal requests, I just point to our web page and trash the email.
Wednesday, October 06, 2004
Tuesday, October 05, 2004
The first will be held at the IBM Almaden Research Center on Monday, October 25, 2004.
Please join the community in honoring the memory of one of the great complexity theorists.
Sunday, October 03, 2004
The econ theory seminar took place in a medium-size conference room with a long table. Graduate students sat in chairs along the walls. The speaker was at one end of the table and econ professors, usually including multiple Nobel prize winners, around the rest of the table. A copy of the paper was sent with the talk announcement and almost from the first slide the faculty aggressively attacked the speaker with questions about the validity of the model or the importance of the results. (Remember this was the theory seminar, imagine the questions at the political economics seminar). At the end of the seminar time, the talk ended and everyone left the room. No applause.
I don't recommend that we follow this model in theoretical computer science. However we usually go to the other extreme and (outside of crypto) rarely ask negative questions in a seminar. Typically the only negative feedback we get in our field is from anonymous referees and reviewers. If we were forced to defend our research in an interactive setting, we would establish a better understanding of the importance of the models and results of our own research.