Wednesday, September 29, 2010

NRC Rankings

The NRC "rankings" of graduate programs were released yesterday. I put up a Google spreadsheet of the CS rankings. Phds.org will also generate rankings using various criteria.

You don't get an actual ranking from the NRC, but rather two ranking ranges: a statistics-based R-ranking ("R" for regression) and a reputation-based S-ranking ("S" for survey). For example, Northwestern CS has a 90% chance of being ranked between 42nd and 73rd (R-ranking) and between 26th and 69th (S-ranking).

Even with these wide ranges there are a number of problems with the CS rankings: no citation information is used, the data is five years old, and much of the information is inaccurate (as the Pontiff complains). A simple sanity check on any ranking of CS departments would list the top four as some permutation of MIT, Berkeley, Stanford and Carnegie-Mellon. Congrats to Princeton for breaking this check.

Why no citation information? The NRC originally used ISI data, which didn't cover most of the conference proceedings that computer scientists consider their main publication venues. After some discussions with the CRA, the NRC acknowledged the problem and decided to ignore CS publication data entirely, citing a lack of time to find a different approach.

Yesterday the CRA released a statement on the reliability of the CS rankings.
CRA is pleased that the NRC acknowledges there are errors in the data used to evaluate computer science departments and that, in the words of NRC Study Director Charlotte Kuh, “There’s lots more we need to look at for computer science before we really get it right.”
Schools are taking the opportunity to promote their rankings. Northwestern Engineering boasts about how well they did, with some nice charts. We're certainly not the only ones.

Did the NRC fulfill their main mission of giving a valuable alternative to the US News rankings? US News uses a "wisdom of crowds" approach by simply surveying department and graduate program chairs. The NRC tried a scientific approach, which led to years of delays, many complaints about methodology and accuracy, and no true ranking at all. After all the measures we can feed into a formula, the one thing that draws students and faculty to a department is its reputation. Complain as you will about US News, they do try to capture that one statistic.

Meanwhile what do I say to prospective graduate students who cite the low Northwestern CS numbers? I could mention the problems in the methodology, point to the CRA statement, or say the numbers are based on data that predates most of the theory group. More likely I will fall back on that trite but true statement: you shouldn't choose a graduate program for its numbers, but for its people.

Tuesday, September 28, 2010

Review of Lipton's Book-Blog/Review of Goldreich and Arora-Barak/New Book Review Column/Do you want to review?

  1. My review of Lipton's new blog-book is here. It will appear in my SIGACT NEWS book review column at some later time.
  2. Daniel Apon's joint review of Computational Complexity: A Conceptual Perspective by Oded Goldreich and Computational Complexity: A Modern Approach by Sanjeev Arora and Boaz Barak is here. It will also appear in my SIGACT NEWS book review column at some later time.
  3. My latest SIGACT NEWS book review column (along with all of my other SIGACT NEWS book review columns) can be obtained here. (Ignore the list of books I want reviewed there; see the next item.)
  4. HERE is a link to the list of books I want reviewed. If you see a book you want, then email me at gasarch@cs.umd.edu with the name of the book and your postal address. We will then work out the details over email: when the review is due, and whether I or the publisher sends you the book. Before volunteering you should read my advice for reviewers. Here is the LaTeX template.

Monday, September 27, 2010

The Theory Social Network

Derrick Stolee tweeted that the videos from the 2010 Complexity Conference are now on-line. If you want to attend a conference live, don't forget that early hotel and conference registration for FOCS ends Thursday. Hope to see you there.

On a related topic, I once wondered if we could have a "virtual complexity coffeehouse" where researchers could drop in electronically to talk research and other stuff. I even played with Second Life in the hope that one could create the coffeehouse there, but found that environment wanting.

The new Q&A site (http://cstheory.stackexchange.com) comes reasonably close; the format seems to generate real discussion about various problems and real interest in the community. I know it's working because I've already had to tell one of my students that he spends too much time there. Now, if they also had some discussions on controversial issues, advice on living the theory life, and dispensed a good latte, we'd be all set.

Thursday, September 23, 2010

Theorems that are less interesting because they are more interesting

(This post was inspired by Joe Kruskal's passing. Kruskal's tree theorem, that trees under the minor ordering form a well quasi order, relates to example 2 below.)

There are some theorems that seem less interesting once you generalize them. I am not sure they are less interesting, but they seem that way. Here are some.
  1. In 1978 I told my number theory professor that there is a polynomial in many variables with integer coefficients such that the positive values in its image are exactly the primes. He thought that was interesting! Maybe it uses algebraic number theory! Or cyclotomic fields! Or some deep number theory! I then told him that for any r.e. set A there is a polynomial in many variables with integer coefficients such that the positive values in its image are exactly A. You prove this by coding Turing Machines into polynomials. No hard number theory needed; it uses some easy number theory and many tricks. He was much less interested. (The result comes from the machinery used to resolve Hilbert's Tenth Problem. Jones came up with the actual polynomial for the primes; see here.)
  2. It is easy to show that if L is regular (CFL, c.e.) then SUBSEQ(L) is regular (CFL, c.e.). What about if L is decidable? The good news: if L is decidable then SUBSEQ(L) is decidable. That sounds surprising and interesting. It is not. Actually, if L is any language whatsoever then SUBSEQ(L) is regular. (High level proof: subsequence is a well quasi ordering, hence any downwardly closed subset of Σ*, such as SUBSEQ(L), has a finite obstruction set. This finite obstruction set can be used to give a DFA for SUBSEQ(L); see the small sketch after this list.) What I really want to see is a proof that L decidable implies SUBSEQ(L) decidable that comes from recursion theory. I don't have one. I'm not even sure what that would mean. Show that SUBSEQ(L) is both c.e. and co-c.e.? (For more information see this post of mine.)
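To make the obstruction-set idea concrete, here is a small Python sketch (my own toy illustration, not part of the original argument). It assumes the finite obstruction set is already known, which is exactly the non-effective part; given the obstruction set, building the DFA is routine.

    # Sketch: given the (assumed known) finite obstruction set of SUBSEQ(L),
    # build a DFA for SUBSEQ(L).  A state records, for each obstruction, the
    # length of its longest prefix matched as a subsequence so far; accept
    # iff no obstruction has been fully matched.

    def build_dfa(obstructions, alphabet):
        start = tuple(0 for _ in obstructions)
        states, delta, stack = {start}, {}, [start]
        while stack:
            state = stack.pop()
            for ch in alphabet:
                nxt = tuple(p + 1 if p < len(w) and w[p] == ch else p
                            for p, w in zip(state, obstructions))
                delta[(state, ch)] = nxt
                if nxt not in states:
                    states.add(nxt)
                    stack.append(nxt)
        accepting = {s for s in states
                     if all(p < len(w) for p, w in zip(s, obstructions))}
        return start, delta, accepting

    def accepts(dfa, x):
        start, delta, accepting = dfa
        state = start
        for ch in x:
            state = delta[(state, ch)]
        return state in accepting

    # Toy example: L = { 0^n 1^n }.  Then SUBSEQ(L) = 0*1*, and the (finite!)
    # obstruction set is just {"10"}.
    dfa = build_dfa(["10"], "01")
    print(accepts(dfa, "000111"))   # True:  "10" is not a subsequence of 000111
    print(accepts(dfa, "0101"))     # False: "10" is a subsequence of 0101

The wqo magic is entirely in knowing that a finite obstruction set exists; the sketch only shows that once you have it, the DFA (and hence regularity, and hence decidability) comes for free.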
If you know theorems that are less interesting because they are more interesting, then please comment.

Tuesday, September 21, 2010

The World in Its Own Terms

The New York Times Magazine last Sunday focused on technology in education. Lots of good reads, but what caught my eye was an article by Microsoft Research's Jaron Lanier, basically arguing that computational thinking ruins your enjoyment of life.
A career in computer science makes you see the world in its terms. You start to see money as a form of information display instead of as a store of value. Money flows are the computational output of a lot of people planning, promising, evaluating, hedging and scheming, and those behaviors start to look like a set of algorithms. You start to see the weather as a computer processing bits tweaked by the sun, and gravity as a cosmic calculation that keeps events in time and space consistent.
This way of seeing is becoming ever more common as people have experiences with computers. While it has its glorious moments, the computational perspective can at times be uniquely unromantic.
Nothing kills music for me as much as having some algorithm calculate what music I will want to hear. That seems to miss the whole point. Inventing your musical taste is the point, isn’t it? Bringing computers into the middle of that is like paying someone to program a robot to have sex on your behalf so you don’t have to.
And yet it seems we benefit from shining an objectifying digital light to disinfect our funky, lying selves once in a while. It’s heartless to have music chosen by digital algorithms. But at least there are fewer people held hostage to the tastes of bad radio D.J.’s than there once were. The trick is being ambidextrous, holding one hand to the heart while counting on the digits of the other.
Lanier is mostly upset that a computer can predict his likes and dislikes so easily. Sorry Jaron, you are just not as complex as you thought you were.

Monday, September 20, 2010

Joseph B Kruskal passed away (Guest post by Clyde P Kruskal)

(Guest post from Clyde P Kruskal.)

Yesterday (September 19, 2010) my uncle, Joseph B. Kruskal, passed away. I just wanted to say a few personal words. Maybe, later, someone will discuss his work. Although he is famous in computer science for Kruskal's Algorithm and Kruskal's Tree Theorem, he was not primarily a computer scientist; he was also a statistician and psychometrician.

I recall as a child how happy I was when he came to visit, usually after giving a talk somewhere. I also remember that once, when our extended family had a get together on the Long Island Sound, he spent all day taking the children one-by-one out sailing.

I always knew the three Kruskal brothers were well-known mathematicians. But it still surprised me the first time I actually saw my uncle's work referenced in a book. I was in college working at a summer job, learning some graph theory, which at the time I knew nothing about. As I turned the page of the book, there was Kruskal's algorithm! I had never heard of it until then.

Now I have the pleasure of teaching the algorithm in my classes. I used to think that when I got old enough I would be able to fool students into thinking that it was actually my own, but it never quite worked out that way.

I did use his theorem on Totally Unimodular Matrices, joint with Alan Hoffman, in a paper I wrote with Marc Snir. When I told Joe we were referencing the theorem, he told me that he rarely collaborated on research, but this was an exception, which he very much enjoyed. When looking up the reference just now, I found an interesting description of their collaboration, which confirmed how I recalled it.

There was a period of time when I used to visit Bell Labs, where Joe worked. I would stay with him and my Aunt Rachel, who were wonderful hosts. I used to enjoy our wide ranging dinner conversations, and I learned so much about words, politics, statistics, my family, etc.

I apologize if this is a bit disjointed. It makes me feel better having written it on this very sad day.

Thursday, September 16, 2010

Should You Mentor High School Students?

I have mentored many high school students on projects (21 in the past, 6 right now). Is this a good use of one's time? I note my experiences and advice below; if you have different experiences and advice, feel free to share.
  1. Unlike with PhD students, you don't have to worry about the job market, support money, or whether they prove something original.
  2. I have not gotten papers out of it. Even very good students lack a certain maturity for paper writing. (There have been exceptions.)
  3. It is good for society. (The most important problem facing society today, that I can do something about, is that not enough high school students know Ramsey Theory.)
  4. Explaining material to them has helped me sharpen my own understanding.
  5. Most of the students I have advised have been pretty good mathematically (they already know discrete math and induction and how to prove things). Some have been good at coding which also comes in handy. Most have come from Blair High School's Magnet Program. (They have a very strong physics department which specializes in Magnetism.)
  6. Having students do a project where they can code some stuff is good in that SOMETHING will come of it. Also, this may be something you wouldn't normally do so it may help you.
  7. The standards of what is original are different on this level. The project does not have to really be original in the sense we would mean it. If they work out something that you already know the answer to and write it up that's fine. It may be more accurate to call it a research experience.
  8. They mostly did projects that DID NOT require a lot of background. Even if they are very good, they are unlikely to have a lot of background. This is why many of them have worked in Ramsey Theory. You may be saying but Bill, you like Ramsey Theory anyway. True, but one of the reasons I got interested in it was to help mentor high school students. It's a chicken-and-egg thing. (This metaphor may die soon, as scientists now think the chicken came first.)
  9. The student of mine who has gone the furthest in math is Jacob Lurie, who is now a full professor at Harvard (in Math). He didn't need much help; however, I did help him learn some recursion theory (we went through all of Soare's book). His project was on Recursive Surreal Numbers.
  10. Many of my students enter various contests. For every thousand dollars they win, I get a free lunch. (If Jacob wins the Fields Medal I'll get 15 free lunches!)
  11. Recruitment. Some of my mentorees have come to UMCP, though that is not really on my mind when I agree to be a mentor.
  12. Why have I mentored so many? I have never sought them out--- they find me since I have mentored people in the past. Also, if a HS student comes around and wants a mentor they are often pointed to me.
  13. Should you mentor high school students? It is unlikely to help you with your research program (there are exceptions). But if you do mentor high school students then (1) make sure it's not a big time sink, and (2) use it to learn or relearn results you've forgotten (recently I was forced to relearn the Erdos-Szekeres Theorem).
  14. So, what have they worked on? Including this year's gang and their tentative projects, here is the breakdown:
    1. 11 worked on Ramsey Theory (3 of these worked together).
    2. 3 worked on Duplicator-Spoiler Games (together).
    3. 3 worked on Empirical Algorithms (2 of these worked together).
    4. 3 worked on Crypto.
    5. 2 analyzed some Simple Games.
    6. 1 worked on Recursive Surreal Numbers.
    7. 1 worked on Graph Isomorphism.
    8. 1 worked on Monotone Circuits.
    9. 1 worked on the Prob Method.

Tuesday, September 14, 2010

How Can You Spend $150 Million?

Suppose you happen to have $150 million burning in your pocket that you want to use to help the theoretical computer science community ($150 million endowed will bring in about $6 million/year). How would you use it?
  • Create an Institute for Theory of Computing?
  • Endow a few theory conferences?
  • Create 150 millennium prizes?
  • Build a fancy new CS building?
  • Endow 25-30 professorships or 75-100 postdocs? (See the rough arithmetic after this list.)
  • Hire lots of Fields medalists at ridiculous salaries, but only if they spend all their time working on the P vs NP question?
  • Something else?
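Rough arithmetic behind a couple of those options (back-of-the-envelope only; the 4% payout rate and the salary ranges are my own ballpark assumptions, not anything official):

    0.04 \times \$150\text{M} \approx \$6\text{M/year}
    \$6\text{M/year} \div (\$200\text{--}240\text{K per chair}) \approx 25\text{--}30 \text{ professorships}
    \$6\text{M/year} \div (\$60\text{--}80\text{K per postdoc}) \approx 75\text{--}100 \text{ postdocs}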
This is all hypothetical; I don't know of anyone, outside of the Simons Foundation, who has $150 million ready to spend.

$150 million doesn't go as far as it used to. In 1992, Henry Rowan gave $100 million to Glassboro State College in New Jersey, now called Rowan University. How much would it cost to get your name on a university today (say at a large private Midwestern university with a geographically confusing name)?

Monday, September 13, 2010

FOCS Travel Support (Guest Post by Paul Beame)

The program for FOCS 2010 has an excellent set of selected papers and, in addition, has tutorials on Saturday by Ketan Mulmuley, Mihai Patrascu, and Tim Roughgarden, a memorial for Avner Magen, and an invited talk by recently announced Nevanlinna Prize winner Dan Spielman.


Through two different sources, NSF and IEEE TCMF, there is substantial support for student and postdoc travel and registration for FOCS 2010.  

Travel awards ($15K from NSF): for students and some postdocs at US institutions.
  • One travel award per institution
  • Up to $750 per attendee
Student registration awards (>= $3K from IEEE TCMF)
  • Apply by September 23
  • Notification by September 28
  • No requirement for US institution or limit per institution
See the Travel-Support page for how to apply.   

The hotel reservation deadline and the early registration deadline for FOCS 2010 are both September 30.

Friday, September 10, 2010

A Mathematical Urban Legend

I have heard the following story many times in many versions:

A PhD student in Math at PRESTIGIOUS SCHOOL was defending his thesis, which was in REALLY ABSTRACT AREA, and BIG SHOT IN THAT AREA was in town and decided to go to the defense. At the end of the defense BIG SHOT said: The object you are studying is NOTION OF TRIVIAL FOR THAT OBJECT.
I've heard this with PRESTIGIOUS SCHOOL being Harvard, MIT, or Princeton. I've heard it with REALLY ABSTRACT AREA being Category theory, Algebraic Geometry, Topology, or Logic. I've heard it with NOTION OF TRIVIAL being the empty set, any finite set, a 1-point space, or an inconsistent system of axioms. I have not heard all possible combinations. For example, if the area was algebraic geometry, it is unlikely that the notion of triviality be an inconsistent system of axioms, unless the teller got his legends wrong.

Is this a pure-math urban legend, or was there ever a case like this? I suspect that it is pure fiction. You may be thinking: Math is so abstract that it could have happened. True enough, but do not confuse what could have happened with what did happen.

Like most urban legends the story persists because, while it may be false, the point it is making--- math is so abstract it is possible to lose track of what you are talking about--- has some truth to it.

If this story is true, or even mostly true, and you KNOW this, then please leave a comment to that effect.

Wednesday, September 08, 2010

The Rubik's Cube Conjecture PROVEN! (Do we care?)

In 1994 Fermat's Last Theorem was proven! In 2003 Poincare's Conjecture was proven! In 2010 the Rubik's Cube Conjecture was proven! That is, it was shown that every configuration of Rubik's Cube can be solved in at most 20 moves (it was already known that there are starting configurations that require 20). Here is the pointer: Rubik's Cube has been solved. (Jeff Erickson also had a blog entry on Rubik's cube lately: here.)

(ADDED LATER. Clarification: From any starting position one can get to the solution in at most 20 moves. There exists a starting position where every solution takes at least 20 moves.)

We all have a sense that the Rubik Cube result is not as important as the other two. What makes a math problem important?
  1. Would Martians work on it? I suspect Martians would work on FLT and Poincare but not Rubik's cube. One of the first things I would ask Space Aliens is what Math they have done.
  2. Does it connect to other parts of mathematics? This leads to a seemingly circular definition of importance: a topic in math is important if it connects to other topics that are important. FLT and Poincare's Conjecture are connected to other parts of mathematics. One way to understand Rubik's cube is through group theory; however, this is not so much a connection as an application. Rubik's cube studies have not led to advances in group theory.
  3. Does the problem have its origins in some real world phenomena? (This is not necessary but helps.)
  4. The problem should be hard but not so hard that you cannot make any progress on it.
  5. The proof must be interesting (hmmm- then we need to define interesting).
These criteria may be too strict. It would be interesting to discuss other branches of math that are considered important that do not quite fit these criteria and see what else makes them important. Or branches that do fit these criteria but are not considered important.

Tuesday, September 07, 2010

How to Write Up Major Results

First some conference news: FOCS conference and hotel registration is available online; the early deadline is September 30. The STOC 2011 website is up with the CFP; the deadline is November 4. For more news follow the SIGACT Twitter or Facebook.

Deolalikar's paper caused a stir in the community partly because it was written better than the usual P versus NP proofs we see. But the right comparison is to papers that have settled major questions, such as Agrawal, Kayal and Saxena's Primes in P and Omer Reingold's Undirected s-t Connectivity in Log Space.

Both those papers needed to be convincing right off the bat, and they both were. First notice the algorithms (page 3 in AKS and pages 9-12 in Reingold). No vague outline, no "should be able to", just very specific algorithms that one can analyze. Reingold only uses one "clearly", for the simple initial transformation from an adjacency matrix to a rotation map.
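To give a flavor of just how specific "specific" is here: the AKS algorithm fits comfortably on a page. Below is a rough Python sketch of its five steps (my own reconstruction from the paper, not the authors' code; the helper names are mine, and the polynomial arithmetic is completely naive, so this says nothing about the running-time analysis).

    # A sketch of the AKS primality test: concrete, checkable, and short.
    from math import gcd, log2, floor, sqrt

    def is_perfect_power(n):
        # Step 1: is n = a^b for some a, b > 1?
        for b in range(2, n.bit_length() + 1):
            lo, hi = 2, 1 << (n.bit_length() // b + 1)
            while lo <= hi:
                mid = (lo + hi) // 2
                p = mid ** b
                if p == n:
                    return True
                lo, hi = (mid + 1, hi) if p < n else (lo, mid - 1)
        return False

    def mult_order(n, r):
        # multiplicative order of n mod r (0 if gcd(n, r) > 1)
        if gcd(n, r) != 1:
            return 0
        k, x = 1, n % r
        while x != 1:
            x, k = (x * n) % r, k + 1
        return k

    def poly_mul(p, q, r, n):
        # multiply two polynomials mod (X^r - 1, n), as coefficient lists of length r
        out = [0] * r
        for i, pi in enumerate(p):
            if pi:
                for j, qj in enumerate(q):
                    out[(i + j) % r] = (out[(i + j) % r] + pi * qj) % n
        return out

    def poly_pow(p, e, r, n):
        # p^e mod (X^r - 1, n) by repeated squaring
        result = [0] * r
        result[0] = 1
        while e:
            if e & 1:
                result = poly_mul(result, p, r, n)
            p, e = poly_mul(p, p, r, n), e >> 1
        return result

    def aks(n):
        if n < 2:
            return False
        if is_perfect_power(n):                        # Step 1
            return False
        r = 2                                          # Step 2: smallest r with ord_r(n) > (log2 n)^2
        while mult_order(n, r) <= log2(n) ** 2:
            r += 1
        for a in range(2, min(r, n - 1) + 1):          # Step 3: gcd check
            if 1 < gcd(a, n) < n:
                return False
        if n <= r:                                     # Step 4
            return True
        phi_r = sum(1 for k in range(1, r) if gcd(k, r) == 1)
        for a in range(1, floor(sqrt(phi_r) * log2(n)) + 1):
            # Step 5: is (X + a)^n = X^n + a  mod (X^r - 1, n)?
            lhs = [0] * r
            lhs[0], lhs[1] = a % n, 1
            rhs = [0] * r
            rhs[n % r] = 1
            rhs[0] = (rhs[0] + a) % n
            if poly_pow(lhs, n, r, n) != rhs:
                return False
        return True                                    # Step 6: n is prime

As a quick sanity check, aks(97) returns True and aks(91) returns False; for large n the naive polynomial arithmetic above is painfully slow, but that is a complaint about my sketch, not about AKS.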

Both papers have to make two arguments: that the algorithms run in the claimed time/space and that they work correctly. In AKS, Section 4 shows the algorithm works correctly and Section 5 (using Lemma 4.3) shows the algorithm runs in polynomial time. Section 4 is broken into a series of lemmas, each easily checkable with a limited knowledge of algebra. In Reingold the tricky part is getting the algorithm exactly right so that it uses logarithmic space, which is carefully analyzed in his Sections 3 and 4. The correctness proof (based on earlier work) is a rather short Lemma 3.2, but Reingold does go over much of that background in Section 2.

In both these results it took an amazing ingenuity to discover these algorithms, but it wouldn't take more than an undergraduate math major to check the proofs.

Showing that P ≠ NP will be a much more difficult task: instead of coming up with a single algorithm, one has to show that no possible algorithm can solve some specific problem in NP. But even though the proof will be harder, the write-up needs to be just as understandable and easy to follow as AKS and Reingold. The community needs a proof we can verify step by step, so that we can either be convinced of the proof or find the problem in it. If there is a jump in logic that one cannot verify, the author has failed in his or her write-up.

Proving P ≠ NP will require pure genius, but you shouldn't need a Fields medalist to verify the proof.

Friday, September 03, 2010

CS Happenings

Yesterday I was at a meeting of the CCC Council; add in a couple of emails and I have lots of short notes on CS stuff.

In these meetings I get to hear about CS research beyond theory: high-performance (i.e. very fast) computers, computing at the margins (guess what it is before you click), CS roles in energy and health, and the debate on the Venn diagram of robotics and cyber-enabled physical systems. Question of the day: If you walk into a room and a sensor turns on a light, does that count as a robot?

Did you know that you can plug devices into a single spigot and a single electrical outlet and, via machine learning techniques, determine water and electrical activity in your home? Sure, there are real applications for energy conservation, but how about the automated Facebook posts:
Lance has just flushed the toilet but failed to turn off the bathroom lights.
Ed Lazowska sent around a 1998 piece by Bob Lucky on electrical engineers, with the worry that CS might not be far behind.
Projecting the current trends, future computers will consist of a single chip.  No one will have the foggiest idea what is on that chip.  Somewhere in the basement of Intel or its successor will be a huge computer file with the listing of that chip.  The last electrical engineer will sit beside the file, handcuffed to the disk drive like a scene out of "Ben Hur."  That engineer will be extremely well paid, and his or her every demand will be immediately satisfied.  That engineer will be the last keeper of the secret of the universe: E = IR.
The NRC Graduate Assessment Report will be released on September 28. What will be the top ten CS departments? The report won't tell you, instead giving multiple ranking ranges based on five-year-old data. Prepare to be disappointed.

Slides from the Snowbird meeting I posted about in July are now available, including a vidcast of Sally Fincher's wonderful talk on CS education. On the topic of talks, you can now watch the video of the STOC tutorial talks and the ICM talks (see also Gowers).

Richard Beigel asked me to pass the following note:
PIs who receive awards or funding increments after January 3, 2010 will be required to upload a Project Outcomes Report within 90 days after their award expires.  These reports are for the general public and will not be edited by NSF.  PIs will be able to follow a link from fastlane in order to upload their reports.  Because public support for science is very important, I would ask PIs to polish these reports and include images (with permission from the owner). To the best of my understanding, NSF PIs can continue to upload all required reports via fastlane and can ignore the login instructions in the link for grants.gov. 

Wednesday, September 01, 2010

Intelligent questions about the alleged P NE NP proof

People have asked me how much the alleged proof of P NE NP was discussed at Barriers II. Actually, nobody discussed the proof; however, we did discuss the aftermath. Most of the discussion was about the same questions I had posted a few days ago. In fact, here is a direct quote: Your post mixed interesting questions, which were ignored, and stupid thoughts, which is all anyone commented on. Why don't you repost just the interesting questions?

SO, here are that post's questions, minus the stupid things I said, plus things I heard at the conference.
  1. Why did this proof get so much attention before it was verified? Because Lipton and Cook took it seriously.
  2. Was this all good for the community? YES- more people know what P vs NP is. YES- if Gowers and Tao now begin working on it more. YES- stone soup.
  3. Do mathematicians care about complexity theory? The answer is surely yes, but how strongly? Is the answer yes or Yes or YES or HELL YES!? If you count the number of mathematicians then it's hard to say. If you take a weighted sum based on the quality of the mathematicians, then Gowers and Tao alone make the weighted sum enormous. In any case, when did the change happen? Was it drastic or slow?
  4. If someone claims to prove a big result and it's public and problems are found with the proof, should they retract? If so, then by when? One thing problematic with the question is the word should. Should for the good of the field? Should for the good of one's own career? Should if you want to be taken seriously? If someone has minor mistakes and just needs a little more time to fix them then a retraction is not needed. But the terms minor mistakes and a little more time are not well defined.
  5. Has there been progress on P vs NP? Scott thinks so. (See the talk Has there been progress on P vs NP, which is on his homepage.)
  6. Have there been hard math problems that were solved all of a sudden, without any real prior results or plan? Finding an intermediate c.e. degree was such a problem, but it wasn't anywhere near as hard or as important as P vs NP.
  7. What criteria SHOULD we use to determine if a proof of a major result is worth looking at?
  8. What criteria DO we use to determine if a proof of a major result is worth looking at?
  9. Has Descriptive Complexity Theory been ruled out as a way to resolve P vs NP (the way oracles or natural proofs rule out other approaches)?
  10. The story about the story has reached the New York Times: here.