## Tuesday, January 31, 2006

### STOC Papers

The accepted papers of STOC have been posted. A few on the list I have mentioned before including Daskalakis, Goldberg and Papadimitriou on Computing the Nash Equilibrium (see also this extension that appeared after the STOC deadline), the Kelner-Spielman and Guruswami-Rudra papers and of course Irit Dinur's new proof of the PCP theorem, surely a lock for best paper.

There are several other interesting looking papers, including Zuckerman's Linear Degree Extractors and the Inapproximability of Max Clique and Chromatic Number, Ambainis, Spalek and de Wolf on Quantum Direct Product Theorems and Charikar, Makarychev and Makarychev with Near-Optimal Algorithms for Unique Games. I can't find the latter online but here is the result from a talk abstract.

We present new approximation algorithms for unique games that satisfy roughly k-ε/2 and 1 - O((ε log k)1/2) fraction of all constraints if 1 - ε fraction of all constraints is satisfiable. These results show limitations on the hardness bounds achievable using UGC. In particular, they disprove a stronger version of UGC that was conjectured in a recent paper. Somewhat surprisingly, even a slight improvement of our results (beyond low order terms) will disprove the unique games conjecture.
Many more interesting papers, be sure to look the list over yourself.

More from Suresh and PC member Scott.

## Monday, January 30, 2006

### Quality versus Quantity

Is it better to have a large number of good papers or a small number of great papers?
The answer is both, great papers to show you have depth and many good papers to show that the great papers were not flukes.

But suppose Fate gives you two roads and you had to choose. History will only remember your best work, so you'll want great papers or the world will eventually forget you. But a CV with many solid good papers will sell better and should help you land better jobs and grants. You'll be perceived as an expert in the field based more on breadth more than depth.

Underlying the question is whether a graduate student should take aim at very hard questions hoping for an award-winning paper. Unless you have some specific new approach that might work or you can still get reasonable results even if the big problem does not fall, you should try to focus more on tractable problems that will build up your research reputation.

## Sunday, January 29, 2006

### Too Many Conferences, Too Little Time

First thanks to Bill Gasarch for his interesting takes on academic life. Another person who should have his own blog.

I missed SODA again this year, but Suresh and Jeff had it covered. Actually I've never been to SODA and have never missed Complexity, which says much more about me than about the conferences.

I also missed the QIP (Quantum CS) conference a week before (I have been to a couple of QIPs in years gone by). Would have been worth it just to see Scott Aaronson's dinner speech but I had to settle for hearing it instead.

And a shout out to everyone at the Kolmogorov Dagstuhl just underway. Let's hope the roof stays on this time.

Speaking of conferences, the STOC accepted papers list should come out this week. Keep tuned.

## Friday, January 27, 2006

### LUDDITES Revisited



GUEST BLOGGER: Bill Gasarch

This is my last day guest blogging, so I'll end where I began,
THREE points on LUDDITES

I) Janos Simon corrected my history of Luddites, for which I thank him.
If you are interested, go to HIS comment on MY post from Monday Jan 23
for a link to a very nice article.

II) My father and father-in-law offer an interesting contrast:

FATHER-IN-LAW (Engineering Major, career mostly in Business, now retired):
LUDDITE: Does not program his VCR. Not sure if he doesn't know how to or just
doesn't want to.  So he HAS to be home on Sunday to watch Desperate Housewives
(a show I found distasteful- My father in law is hipper than I am).

NON-LUDDITE: Took a course on C at a local community college when
he was 70.  Pays all his bills on line.

FATHER (English Major, High School English Teacher and Vice Principal, now retired)
LUDDITE: Got a computer recently and still can't get email or pay his bills on line.

NON-LUDDITE: Uses his VCR to tape ALOT of shows.  He needs it since he watches ALOT:
West Wing, My Name is Earl, The Sopranos, Sex in the City when it was on (a show I find
distasteful- My dad is hipper than I am), 6 feet under, Deadwood, all four Law and Orders,
and all three CSI's, Without a trace, other stuff I can't recall. This from the man
who restricted me, wisely, to no more than an hour of TV a night when I was a kid.)

III) Stuart Kurtz emailed me some more questions for my Luddite quiz.  I asked him if I
could post them and he suggested asking for other inputs.  No one replied, so here are his:

STUART BEGIN:
9) Do you write emails (or blog posts) in
a) variable width fonts with formatting,
b) variable width fonts without formatting,
c) fixed width fonts,
d) What's a blog?,
e) What's email?, or
f) What's writing?

10) Do you indicate emphasis by
a) using italic or slanted font,
b) using a bold faced font,
c) metadiscourse, i.e., "I want to emphasize that... ",
d) ALL CAPS, or
e) Shouting and waving your arms.

a) four buttons,
b) three buttons,
c) two buttons,
d) one button,
e) control characters are good enough for RMS, and they're good  enough for me, or
f) four feet and a tail.

12) What's your favorite programming language?
a) Ruby or Python,
b) Java
c) Lisp,
d) C++,
e) Awk,
f) IBM-360 assembly language,
g) C,
h) Lisp, or

[I know Lisp occurs twice, but c and h are still different answers. Note that there's no
point asking for Perl -- as Perl programmers can only write, not read.]
STUART END.

bill g.

P.S. I am supposed to say Now that I've guest blogged for a week I'm even more
impressed with Lance getting a topic out every day'' But this is NOT TRUE.
I was SO IMPRESSED with Lance in the first place that I can't be more impressed''



## Thursday, January 26, 2006

### How Much are we effected by non-scientific criteria?



GUEST BLOGGER BILL GASARCH

TOPIC: How much is what we do influenced by non-scientific criteria?

(BEFORE I START TODAYS BLOG- A REQUEST.  EMAIL ME OTHER
LUDDITE QUESTIONS- I WILL POST THE BEST ONES ON FRIDAY)

I) AN INCOMPLETE SUMMARY OF
Thomas Kuhn's book The Structure of Scientific Revolution:

For long periods of time a field of science will agree on the basic terms
and problems of the field and will all work with that worldview (also called a paradigm).
This is called Normal Science. This is GOOD since if people were working with different
paradigms progress would be hard.  BUT there comes a time when some problems just cannot
be solved using the usual techniques. There will be an effort to jam this problem and
some approaches to it into the current paradigm, but eventually, the old paradigm will
fall and a new one will take its place. The new one will help to answer some old questions,
and pose new ones that could not have even been asked in the old one.

Newtonian Phy vs Einstein is the usual example, though there are others
on a much less cosmic scale.

II) People after him have misconstrued his work to saying that science has NO
objective truth, that it ALL depends on the Paradigm.  This is, of course, hogwash.
More so when they claim that its a tool by the elite to dominate the masses, or some
such (look up SOKAL HOAX on google for one view of this view).

III) But a fair question CAN be raised along these lines:

How MUCH of what scientists do depends on political or personality or
other factors VERSUS how much is driven by objective scientific principles?

A few examples

a) What if in response to Russell's paradox the math world essentially
axiomized what set theorist now call V=L (every object is constructable).
Then we would know LOTs more about L, we would KNOW that the Axiom of Choice
is true, and we would know that Cont Hyp is true.  We might know that there
were these weird other models that are unnatural where CH is false, but we
wouldn't care.  (Some Set Theorists tell me this could never happen- that
people would be interested in other models. They are wrong.)

b) What if in response to the Banach Tarski paradox mathematicians rejected
some version of the axiom of choice? This would have
been quite possible before AC began being used in so many places.

c) The people who believe in constructive methods only (e.g, Brouwer) are
portrayed as cranky old men holding onto an old paradigm that no longer worked.
But if they had won then people like Hilbert would be viewed as crazy rebels who
fortunately were never taken seriously. (This one I am less sure of- nonconstructive
techniques are SO powerful that I think they may be inevitable.)
d) If Computing Devices were invented either earlier or later then they were
would have a drastic effect on Theory.  While we think that P vs NP
is a natural problem, it only came out once the technology was in place.
Was it inevitable that it arise? Probably
Was it inevitable that it be considered important? Hard to say.

e) There is ALOT of work in Quantum Computing because
(i) Peter Shor proved FACTORING in Quantum P hence giving the problem new interest, or
(ii) There is (or actually was) lots of Grant money in it.
(of course these two are linked)

f) Do schools like MIT have too big an influence on what gets studied?
(They have less influence now than the used to.)

MORE GENERALLY, if I had the time and the energy I would do
research on history/phil of math asking the question

HOW MUCH DO EXTERNAL FORCES EFFECT WHAT IS STUDIED ?

and I would do it WITHOUT an ax to grind.



## Wednesday, January 25, 2006

### A Theorem that should be better known



GUEST BLOGGER: Bill Gasarch

(BEFORE I START TODAYS BLOG- A REQUEST.  EMAIL ME OTHER
LUDDITE QUESTIONS- I WILL POST THE BEST ONES ON FRIDAY)

If u,v \in \Sigma^* then u is a SUBSEQUENCE OF v if you
can obtain u by taking v and removing any letters you like.

EXAMPLE: if v= 10010  then
e,0,1,00,01,10,11,000,001,110,0010,1000,1001,1010,10010
are all of its subsequences

Let L be any language-- a subset of \Sigma^* SUBSEQ(L)
is the set of subsequences of all of the strings in L.

The following three could be easy problems in a
course in automata theory:

a) Show that if L is regular then SUBSEQ(L) is regular

b) Show that if L is context free then SUBSEQ(L) is context free

c) Show that if L is c.e. then SUBSEQ(L) is c.e.
(NOTE- c.e. is computably enumerable- what used to be called
r.e.- recursively enumerable)

Note that the following is not on the list:

Show that if L is DECIDABLE then SUBSEQ(L) is Decidable.

Is this even true?  Its certainly not obvious.

There is a theorem due to Higman (1952), (actually a corollary of
what he did) which we will call SUBSEQ THEOREM:

If L is ANY LANGUAGE WHATSOEVER over ANY FINITE ALPHABET
then SUBSEQ(L) is regular.

This is a wonderful theorem that seems to NOT be that well known.
It's in very few Automata theory texts.  It is not heard much.
It falls out of well quasi order theory, but papers in that
area (is that even an area?) don't seem to mention it much.

This SEEMS to be an INTERESTING theorem that should get more
attention, which is why I wrote this blog.  Also, I should point
out that I am working on a paper (with Steve Fenner and Brian
Why do some theorems get attention and some do not?

1) If a theorem lets you really DO something, it gets attention.
There has never been a case of OH, how do I prove L is regular?
WOW- its the subseq language of L' !!'
By contrast, the Graph Minor Theorem, also part of well quasi
order theory, lets you PROVE things you could not prove before.

2) If a theorem's proof is easy to explain, it gets attention.
The SUBSEQ theorem needs well quasi order theory to explain.
(needs' is too strong- Steve Fenner has a prove of the |\Sigma|=2
case that does not need wqo theory, but is LOOOOOOOOOOOOOONG.
He things he can do a proof for the |\Sigma|=3 case, but that will be
LOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOONG.
Can be explained to an ugrad but you are better off going through
wqo theory.)

3) If a theorem CONNECTS to other concepts, its gets attention.
There are no real consequences of the SUBSEQ theorem.
Nor did it inspire new math to prove it.

4) If a theorem has a CHAMPION it may get attention.  For example
the SUBSEQ Theorem is not in Hopcroft-Ullman's book on automata
theory- one of the earliest books (chicken and egg problem- its
not well known because its not in Hopcroft-Ulman, its not in HU
because its not well known). The SUBSEQ theorem had no CHAMPION.

5) Timing.  Higman did not state his theorem in terms of regular
languages, so the CS community (such as it was in 1952) could not
really appreciate it anyway.

Yet, it still seems like the statement of it should be in automata
theory texts NOW.  And people should just know that it is true.

Are there other theorems that you think are interesting and not
as well known as they should be? If so I INVITE you to post them
SHOULD BE BETTER KNOWN will then become better known and hence
NOT be the winner, or the loser, or whatever.

NOTE: The |\Sigma|=1 case of Higman's theorem CAN be asked in
an automata theory course and answered by a good student.



## Tuesday, January 24, 2006

### Making up Problems

Guest Blogger: Bill Gasarch
HW AND EXAMS:
1. While making up the lecture notes problems may arise naturally. Such as OH, I don't want to bother proving this, but it would make a good HW'' or OH, I can do example 1 in class and leave example 2 for the HW, and 3 for exam''
2. OR the reverse- I want to ask THIS HW/exam problem, so I'll cover THAT in class.
3. If a paper you are reading for your research says by a trivial induction XXX' then XXX might make a fine HW or exam problem. Same for other easy' things that papers skip.
4. Do you allow your students open books? A sheet of notes? Calculators? Whichever it is, tune your questions to it, and view it as an OPPORTUNITY, not a restriction.
5. Just changing the numbers around might not be sufficient. Better to change the concepts around. EXAMPLE In class and HW I do Given Random Var X, and Dist D, find Expected value. On exam do Given Random Var X and you want Exp Value E, what should be the distribution.
6. Exams: Make it clear that you take off for clarity. This way you don't have to try to figure out if a complete mess has some idea in it.
7. Scoring: On a 20 point problem I tend to give either a 0 or a 10 or a 20. Getting the base case of an induction is NOT worth anything. Making an obvious typo is not worth taking ANYTHING off. This makes grading easier, but also you are spared having to make arbitrary distinctions that don't mean much. Do you really want to say On the way to a proof that wouldn't work'' vs On the way to a proof that might work'' vs On the way to a proof that would work'' are worth diff values? And then try to discern which it was? ALSO, If one student really didn't know much, and another one knew alot, I would rather the point DIFF by 20 points, rather than give 5 sympathy points, and take off 5 points for minor things, and end up with a 10 point difference.
8. There are two kinds of questions (actually I'm sure there are more) Those that test MASTERY of the material Those that test CREATIVITY- going beyond the material. I tend to ask more MASTERY questions, especially on exams.
9. What to do about students getting help from the web?
1. Ask old questions in new ways to avoid the usual search terms.
2. Tell students they must TELL you the sources they used. The problem here is if they DO tell you, then what do you do? You wanted them to do it on their own.
3. HW not worth much, but ask similar questions on exams. For lower level courses you can also have short quizes.
4. For a graduate course you can even say If you can understand the paper this came from and write it in your own words, AND UNDERSTAND IT then you will get full credit. But be forewarned, it may be easier to just DO it on your own.'' This WORKS for MASTERY type questions, not for problems that just need one insight, and you want the students to get that insight on their own.
RESEARCH PROJECTS:
1. Code it up and see what happens'' could be a basis for a research project.
2. As a warmup and confidence builder I would have a student read and UNDERSTAND two (maybe more) papers and COMBINE them. This could lead to very good research, but might not. But in any case the student KNOWS something and has DONE something.
3. If a problem dawns on you that you think someone else MUST have worked on then LOOK INTO IT. You may well find that nobody has worked on it--- what dawns on you as an obvious' problem to work on might not dawn on others as such. Peoples motivations differ.
4. If there is some paper you've always wanted to get around to reading but never did, have a graduate student read it and explain it to you. This will be good for him and for you. Can also work with very good ugrads. Might result in survey papers if they read a series of papers. WARNING: You may end up doing more work in correcting the student, and seeing what he really meant, etc. But in the end you'll both learn the paper.

## Monday, January 23, 2006

### Are you a Luddite?


GUEST BLOGGER: Bill Gasarch
(I will be guest blogging this week while Lance is on Vacation.)

Are you a Luddite?
The original Luddites were workers who, scared of lower wages
via technology, destroyed factory machines. This was around 1811.
Their leader was General Ned Ludd. (Not sure if General was an honorary title)
TODAY the term has come to mean someone who does not adapt to technology
or does not like technology.
If you are NOT one, you can use Google to find out more about them.

Are you a Luddite?
I offer the following questions and let you score yourselves.

1) At a conference do you use
a) Powerpoint with fancy animation and pictures off the web.
b) Powerpoint with nice backgrounds, but nothing much else
c) pdf files
d) physical slides made using latex
e) physical slides made using magic markers and overlays
f) physical slides without overlays
g) chalk
h) draw diagrams in the sand with a twig

2) Same as question 1 but for  large classroom lecture (over 50),
small classroom lectures (under 10), seminars (8 people who actually
know something).

3) For writing papers do you use
a) LaTeX (or some other package)
b) Typewriter
(YOU HAVE A TYPEWRITER? MIGHT BE WORTH SOMETHING ON EBAY!
c) Handwritten and give to your secretary to type
(YOU HAVE A SECRETARY? MIGHT BE WORTH SOMETHING ON EBAY!)
d) Quill pen and inkwell on parchment.

4) When listening to talks do you
a) Take notes with an e-pen that automatically puts it online
b) Take notes in an e-notebook
c) Take notes in a p-notebook (thats paper)
c) Not take notes at all
d) Fall asleep

5) When you applied to grad school did you
a) Check out the website of the school
d) Apply to schools you heard were good
e) Apply to schools randomly (time bounded Kolmogorov Random)

6) If you need a result that is already known do you
b) Goto the library
c) Goto your own file cabinet
d) Rederive the result by yourself

7) Which of these might you most likely say?
a) When is the next version coming out so I can update?
b) I'll update in 2 years (and you do)
c) I'll update in 2 years (but you don't)
d) You can have my chalk when you pry it from my cold dead hands.

8) Do you play music on
a) MP3's
b) CD's
c) LP's
d) 78's
e) Wax Cylinders
(WAX CYLINDERS! MIGHT BE WORTH SOMETHING ON EBAY!)

bill g.

Postscript: Thanks to my collegue Jack Lutz for catching that I spelled Luddite wrong
originally. I used him instead of a spell checker, and note that the error he found
would not have been discovered with a spell checker.

`

## Friday, January 20, 2006

### Free Electronic Editions of New Collaborative Books

I am on vacation next week and I've lined up Bill Gasarch as a guest Blogger in my absence. But today we have a guest post from Kamal Jain. This is a long post but well worth reading through.

This post is prompted by recent development and discussions on electronic publishing, which themselves are prompted by book scanning initiative of Google and Open Content Alliance. Although, I am not talking about paper books being converted into electronic format, I like the idea of having the books available in a searchable electronic format. And certainly this is a must have feature for any newly written book.

Recently, I got two invitations to write for books. The first was to write a book on Network Coding. I felt that I was not the best person so I did not accept. If I had, then I would have insisted on a free electronic copy. Second, I got an invitation to co-write a chapter on Cost Sharing with Mohammad Mahdian for a book, Algorithmic Game Theory, edited by, Noam Nisan, Tim Roughgarden, Eva Tardos and Vijay Vazirani. I agreed to this because I felt that such a book is a great idea and I could make a positive contribution. My selfish motive was to spread knowledge of the subject to which I have contributed. And, I guess that was also the expected motive of the other contributors. This I could say because the explicit incentive offered in the invitation to the contributors was that the editors (originally Eva and Vijay only) have made an excellent deal with a publisher, Springer Verlag. The deal they have is $40 for up to six hundred pages. I am not sure whether it is a paper back or hard-cover. But that was not my focus anyway. My focus is the absence of any electronic publishing component in the deal. Because of that, I felt this is not such a good deal in today's electronic age. On one side we are talking about scanning paper books, starting electronic journals, writing wikis, blogs and on the other we do not even make a deal on electronic publishing of newly written books. I wrote an email back to the editors that I do not think Springer deal is a good one. I was hoping to get back a response and start a discussion with them on this, which IMO, was obligatory for them because I point blank disagreed with the incentive they explicitly offered. At this point I am assuming that there is no electronic publishing agreement with the publisher. This was the background. Now, I realize that this is not something to discuss with the editors in private. This is an important issue which is likely to reoccur in other situations. So I requested this space from Lance so that I could discuss with the whole community. Following are some of my random thoughts and I like to hear everybody's thoughts too, random or not :-) Please press the comment button and put your thoughts in writing so that Springer and other publishers would know what we want from them. There are at least two kinds of books. First kind, written by individual authors. Second kind, written collaboratively by the community like the above proposed Algorithmic Game Theory. Individual authors write books for various reasons and it is up to them what kind of deal they lock with the publishers. The books written by a community has a predetermined goal and that is to spread the knowledge of the subject. It is not up to one or two persons to lock whatever deal they think is great. So the community must form unspoken guidelines to facilitate the negotiation between editors and publishers. These unspoken guidelines must include minimum desires of the community. Such a set of guidelines would have resolved the prisoner's dilemma for me. I did not like the absence of electronic publishing agreement. If I decline the invitation then the book still has gone ahead without my contribution and if I accept the invitation, which I did, then I know that my efforts are not optimally used. But in case it were a common expectation from the editors to negotiate an electronic publishing agreement, then I know that I could reject the invitation because others invitee would also do the same, thereby insisting that the editors go back to the publisher and make an electronic publishing agreement. One would ask why publishers have any electronic publishing agreement. For information, Reinhard Diestel's book, Graph Theory, has a free searchable and hyperlinked electronic edition and further this book is published by Springer Verlag. Let us first discuss what Springer provides to us and what we provide to Springer. Then we should discuss whether we are getting the optimal deal. 1. Springer does the marketing which sells the book. 2. Springer provides the brand name which sells the book. 3. Springer provides the brand name which makes the line in our resume about the book a bit bolder. 4. Springer prints and binds the book, for which the buyer pays. 5. Springer gave peanut financial support ($2000) to pay to students to draw pictures. This fund is for those contributors who do not have their own funds.
We give to Springer
1. Free content and transfer copyright so that they can legally publish the content. I am assuming there is no royalties involved in a community written book.
2. Word of mouth marketing.
3. Use our own funds for other expenses.
4. Our university or companies resources.
What are the possible deals we could have:
1. Status Quo. Springer publishes the book and sells them. Takes the copyright and does not provide free electronic copy. In future, if Springer wants, makes more money from electronic copy too.
2. Reinhard Diestel model. Provides free searchable and hyperlinked electronic edition. A user can't conveniently print the pages.
3. Springer publishes the book and sells them. Takes an exclusive time bound license, say one year. After one year, Springer still keeps the exclusive license on the paper publishing, but we could put the free electronic copies on our webpages.
4. Springer publishes the book and sells them. Takes the exclusive right to publish the book in paper format — that's all it needs to legally publish the book. We keep all other rights. We put the book in electronic format on our webpages or at some cheap servers.

Because, as mentioned above, Springer provides some value. We could still avoid Springer and create these values ourselves. We anyway will be spending couple of thousand hours on this book (my experience on working with Vijay is that it takes at least few hours per page). There are at least two ways to avoid Springer.

1. We go to a small publisher and get the book published. Transfer the exclusive right to publish the book in paper format. We keep all other rights.
2. We publish only the electronic version.
What role would Springer play?
1. Springer does the marketing. We will discuss this later to see how we could do the marketing ourselves.
2. Springer provides the brand name to sell the book. I think the brand name of the editors and the authors is much more in this case. This is also the case with any good book written by a community.
3. Springer provides the brand name to make the line related to this book in our resume a bit bolder. First, most authors contributing in the book already have enough lines in their resume that they can do with one fewer line. Second, this line is minor for a community written book. Each person contributes a chapter, may be equivalent to writing one or two journal papers.
4. Springer prints and binds the book. I do not know how much it costs to print and bind the book. "The Search" by John Battelle is a three hundred page hard-bound book and available at 16 bucks at Amazon. Well The Search probably will sell more than this technical book. But it shows that $40 for Algorithmic Game Theory could very well be an optimum profit making point for Springer rather than a favor as they want to portray to us. A small publisher would be able to beat that even in the presence of competing free electronic version. 5. The last is the peanut financial support. I am sure we could arrange$2000 bucks without Springer. Even if we fail, grad student would be happy to contribute this for a credit. If I do not personally have time to draw pictures, then I do not mind having a co-author who does that for me. A picture is worth thousand words. If I am claiming authorship for writing thousand words then anybody who draws pictures deserves the equal credit.
So the only value Springer provides is marketing. There are various ways we could do that too.
1. We create a pamphlet and a poster which we distribute to the program chair of various conferences.
2. Put the electronic version at one place. Let each of the contributor links to it. If there are fifty links from places like, Cornell, Georgia Tech, Stanford then on searches related to the keyword in the book, the book should show up at the top.
3. Let Citeseer crawl the book, let Google crawl the book, let us upload it on Wikipedia.
One question which one could raise is that many people in the world still live on the other side of the digital divide. But such people do not have \$40 bucks either. The solution for them is to have a publisher in India or China to publish this book and sells to these people.

Pre-bottom line is we give more to Springer than it is giving back in return. Game theoretically it is not a fair solution and we could do better. I am not sure whether there is any electronic publishing deal which the editors of this book have with the publisher, if they had then they probably would have told me. In any case this posting is about many others future books which will be written co-operatively. Bottom line is, any book which is not written for money must be available free of charge in an electronic format.

## Thursday, January 19, 2006

### Theory at NSF

From Sanjeev Arora
Bill Steiger of Rutgers is the new NSF program officer in charge of Theoretical Computer Science (TCS) and he has assumed this position now. There appear to be other ongoing changes within the Theoretical Foundations cluster. As CCF director Foster explained at STOC, up to 30% of the funds in the cluster will be placed in a fund which will give out grants via a centralized mechanism. (It is still unclear what the final effect will be on TCS funding.)

NSF program directors would also like to make members of the theoretical computer science research community aware of the following upcoming proposal deadlines:

Proposals that explore fundamentally new (emphasis mine) ideas about network design and information security are sought, and participation by the TCS community is welcome.

Realistically, these will probably involve TCS researchers teaming up with experimentalists to develop proposals that focus on rigorous approaches to well motivated problems in networking and security and that have a significant theoretical component as well as a significant experimental component.

What Arora leaves unsaid is that there are no NSF general programs in core theoretical computer science accepting new solicitations this year.

## Wednesday, January 18, 2006

### Favorite Theorems Preview

I have written up my ten favorite theorems for the decades 1985-1994, 1995-2004 and 1965-1974. This year we tackle the remaining decade 1975-1984, the second major decade in complexity and the decade leading up to when I started graduate school in 1985.

The first decade set the groundwork for computational complexity and the P versus NP problem. In the second decade attempts to understand and solve the P versus NP problem led to new and interesting questions that still challenge us today. But we most remember the second decade for analyzing different models of computation such as alternation, parallel, probabilistic, circuits, interactive proofs and the first hints of quantum computers.

Starting next month I will run down my favorite theorems of the decade that showed that the tools of computational complexity can help us understand efficient computation in whatever form it comes in.

## Monday, January 16, 2006

### Sports Droughts

We watched the Chicago Bears Football team lose to Carolina with some friends who were extremely pessimistic the entire game, even though the game remained close throughout. "We haven't seen the Bears win the Super Bowl in twenty years, they will continue to disappoint us."

Let's consider the twenty year statement. Let's assume that each year every team has an equal probability of winning and each year is independent of each other. Then the expected number of years between championships is equal to the number of teams, 32 in the National Football League. So the Bears are still ahead of the curve, not disappointing at all.

So how about the 86 years between the Boston Red Sox World Series championships in baseball, the 88 years between Chicago White Sox championships, and the 98 years since the Chicago Cubs last won? This is just the coupon collector problem where if one draws numbers 1 to n with replacement independently and uniformly, it will take an expected n ln n draws to see every number. For the thirty baseball teams, that makes 102 years. There was no curse for the Red Sox, White Sox and Cubs, just probability working as expected.

I'm cheating on many fronts. The number of teams in both football and baseball have grown dramatically over the past few decades. Each year is not independent; a good team one year will likely be good the following year. Teams do not have an equal probability; especially in baseball the richer teams have a higher chance of winning.

Nevertheless you have no one to blame for long losing streaks other than those evil gods of probability.

## Saturday, January 14, 2006

### The Defense

In the third Complexitycast we talk about the Dutch Defense with new Ph.D. Troy Lee and several of the participants in his ceremony. MP3 (18 minutes, 3 MB)

## Thursday, January 12, 2006

### Time to Write the Letters

Every year I seem to write more letters, letters for those applying to graduate schools, letters for those looking for their first postdoc or assistant professor jobs, letters for tenure cases. Why the continual increase? More and more we see young researchers taking multiple postdocs, where each round of searching requires a set of letters. Letters from senior people carry more weight, and each year I get another year senior.

But the Internet has led to an increase in requests because one can now have automated processes that send requests for letters. Harry Buhrman has two complaints about this model.

1. Often these automated requests seem cold. Even automated they could ask in a nice way, be grammatically correct and address the person directly.
2. Universities should pay at least some small amount of money to letter writers or their institutes. This will keep down the number of requests and reimburse the letter writers for some of their time.
I second Harry's first point, though for me I never even look at words in a request, I just want to know where to send the letter.

I don't agree with Harry's second issue. We have a responsibility to write letters for our students and colleagues. The marginal cost of a sending an additional letter for someone is rather small, though the universities should make the process as painless as possible. A URL I can click and then upload is best. Having to cut and paste a username and password sent in an email is already adding effort for me with no increased security. For one graduate program I had to go through ten web pages of forms to fill and verify; there is no excuse for that. Ideally I would like some place I can just deposit the letter which legitimate universities could just download as needed.

For tenure letters perhaps a payment scheme would make sense. These letters require much more effort and we only write one of them for each candidate.

## Wednesday, January 11, 2006

### My Second Home

My first trip to Amsterdam came during a whirlwind European tour in the summer of 1984 during my undergrad years. My second trip was for the 1994 Complexity Conference (then called Structures). I spent my sabbatical year in Amsterdam 1996-97 and have come back about once a year since.

For the past several years I have stayed at the same hotel (NH Tropen), rented a bike from the same shop with the same sarcastic woman behind the counter who now recognizes me when I arrive. I know where to jog, where to get groceries and gifts for the kids. I don't need instructions to get to the hotel from the airport or a map to get from the hotel to CWI or the CS department at the University of Amsterdam. Maybe there are better hotels or bike shops but familiarity makes life easier.

Some of the faces have been around since my sabbatical and well before: Paul Vitanyi, Leen Torenvliet, Peter van Emde Boas and Harry Buhrman, the last of which I have written more papers with than any other co-author. Some faces I see for the first time. Today I am going to my fourth Ph.D. defense in Amsterdam as a member of the opposition.

I don't visit the tourist sites, the museums, the "coffeehouses", the red light district. I come to see old colleagues, make new ones, hopefully prove some theorems and enjoy my second academic home.

## Tuesday, January 10, 2006

### Counting Go

This week I am making my nearly yearly visit to CWI in Amsterdam, where I spent a sabbatical year in the 90's. I will sit on the opposition on Troy Lee's Ph.D. defense on Wednesday. Troy was a paranimf at Hein Röhrig's defense where I also sat in the opposition two years ago.

At CWI we also find John Tromp who works on various puzzles. Now he is counting Go positions. Tromp and his co-author Gunnar Farnebäck show that legal positions are colorings of the grid to {white, black, empty} such that every white or black connected component borders an empty node. They have exactly counted the number of positions on boards up to 16x16 and have an asymptotic bound on larger boards.

If you really want to know the exact number of positions of the standard 19x19 board and have a server with ten terabytes of disk space to spare, John would love to hear from you.

## Saturday, January 07, 2006

### A Search Without End

Finding the 5th moon of Jupiter was a big deal. Finding the 14th moon was a big deal. But it's hard to get excited about the 63rd moon. Now imagine if Jupiter had an infinite number of moons.

New York Times, August 29, 1989

After more than a year of computation, six researchers at a California computer company have found the largest known prime number, which is 65,087 digits long and would fill more than 32 double-spaced typed pages.

While the search for the largest prime may seem like an esoteric pursuit, one of the researchers, Landon Curt Noll, said the work that permitted them to discover the number has a wide variety of commercial and scientific applications.

Associated Press, March 31, 1992
Mathematicians using a supercomputer have advanced the quest of a 17th-century French monk (Mersenne) by discovering the largest known prime number. The number begins 174 135 906 820 087 097 325 and goes on and on and on for 227,832 digits, filling 32 pages of computer paper.

Jeffrey Lagarias, a mathematician at A.T.& T. Bell Laboratories in Murray Hill, N.J., said that the discovery might have some significance in pure number theory but that "it's not going to revolutionize anything."

New York Times, March 29, 2005
An eye surgeon in Germany has discovered the world's largest known prime number—or at least his computer did. The surgeon, Dr. Martin Nowak of Michelfeld, is among thousands of participants in the Great Internet Mersenne Prime Search, one of several big projects that tap idle computers worldwide. The number, rendered in exponential shorthand, is 225,964,951-1. It has 7,816,230 digits, and if printed in its entirety, would fill 235 pages of this newspaper.

"Finding an additional prime doesn't enlighten us very much," said Dr. Andrew M. Odlyzko, a mathematician at the University of Minnesota.

Associated Press, January 3, 2006
Researchers at a Missouri university have identified the largest known prime number, officials said Tuesday. The number that the team found is 9.1 million digits long. It is a Mersenne prime known as M30402457—that's 2 to the 30,402,457th power minus 1.

The team at Central Missouri State University, led by associate dean Steven Boone and mathematics professor Curtis Cooper, found it in mid-December after programming 700 computers years ago. "We're super excited," said Boone, a chemistry professor. "We've been looking for such a number for a long time."

Why?

## Thursday, January 05, 2006

### A Classical World

We need to rewrite the traffic laws in this country because they don't handle flying cars. We don't have flying cars, you say. We might have flying cars in the next couple of decades so the current traffic laws no longer apply.

Keep that argument in mind as you read the following paragraph from David Bacon.

If today someone was to prove that P does not equal NP for a classical computer, would we be satisfied? Well certainly we would be very excited and this would be the breakthrough of the century in computation, but because the universe is fundamentally quantum mechanical, would we really be satisfied that the intractability of NP-complete problems had been shown? Quantum computers open up an entire bag of worrying about the foundations of computational complexity. It is dangerous to say this, of course: if this view is correct, then the hard work of classical computational theory might have been in vain. But if this is the correct view, then we need to begin weaning ourselves off of the classical model of computation.
Dangerous indeed. Bacon is not the first one to make such statements, Gilles Brassard made much stronger pronouncements as far back as 1990.

Did the theory of relativity mean the hard work of classical mechanics was in vain? Of course not. When we drive a car we don't need to worry about relativistic effects, they simply don't amount to much at that level.

We don't know how quantum mechanics will actually play out in a computer that would require the entanglement and manipulation of tens of thousands of quantum bits. Maybe we can harness the full power of quantum computation, maybe we can't. At this point we simply don't know.

I support research in quantum complexity, as long as quantum computing remains a possibility we should try and understand its computational power. But not until we all have fast quantum computers on our desks should we even think of abandoning classical complexity. And in our current state of affairs, where creating a quantum computer that factors faster than my daughter is a pipe dream, classical computational complexity serves us very well indeed.

## Wednesday, January 04, 2006

### The Technology of Academic Papers

The Internet has led to a complete shifts in how we deal with storing and sharing information, but when it comes to academic papers the changes we see are ad hoc and added in a piecemeal basis.

Suppose we could start from scratch and create a proper system for research papers. Here is how I would envision such a system.

XML has become the standard for storing information on the internet; it gives a simple machine-readable method for creating tree structures. Academic papers have such a tree structure (Sections, subsections, theorems, proofs, etc.) that would lend it itself well to XML. Mathematical equations should also be written using XML, we already have a MathML specification for doing this.

A academic paper XML file would only have content information, not any formatting information. For this we would use XSL files, themselves XML files that describe how to format the document. You would use different XSL files depending on whether the paper is viewed on the screen or printed, and different publishers can develop their own XSL files to have consistent looking papers. LaTeX, the system used by most theoretical computer scientists, has similar capabilities but because LaTeX does not enforce any standards, changing style files often requires considerable editing.

Researchers will not have to create these XML files directly (unless they want to) but can use word processors that will save the documents according to those standards.

For citations we should just point to a unique identifier for a paper, no longer should we need to cut and paste bibliographic information. The formatting program can go online based on the identifier to get the information to create a human readable bibliography with web links if appropriate. Most publishers already use Digital Object Identifiers (DOI), we just need DOIs to point to an XML file giving bibliographic information, have DOIs for unpublished papers and have a method for DOIs to point to a later version of a paper.

The author information on academic papers are often useless (like my postal address) or out of date as academics change locations. Each academic research should get their own DOI-like number that points to an XML file giving personal and contact information and then we only need add these DOIs to the academic papers.

Most importantly we need to have enforced standards for each of these XML documents (via XML schemas). If we can truly separate the content from the formatting of documents, and make that content available in an easy machine-readable forms, not only can researchers focus more on the writing and less on the style but will also open the door to applications that we cannot even imagine today.

## Tuesday, January 03, 2006

### A Year of Incompleteness

Edited from an email by Jan van Leeuwen and Jiri Wiedermann

The year 2006 will be a special year for the foundations of logic and theoretical computer science because on April 28, 2006 it will be 100 years ago that Kurt Gödel was born.

In 2006 it will also be exactly 75 years ago that Gödel's incompleteness theorem was published and 50 years ago that he wrote his famous letter to von Neumann which is now recognized as one of earliest recognitions of what we now know as the P-versus-NP problem.

Gödel's 100th birthday is beginning to receive some attention among logicians. There will be an International Symposium commemorating "the 100th jubilee of the birth of Kurt Gödel" in Brno (Czech Republic), the city where he was born, a Gödel Centenary 2006 Symposium in Vienna with a special lecture by Roger Penrose, and also at CiE 2006 there will be special attention for Gödel's "legacy for computability".

We note that 2006 also marks the 100th birthday of Richard Rado (well known as a great combinatorial mathematician and born on the same day as Gödel) and also of the well-known statistician William Feller.