Monday, October 23, 2006

FOCS Day 1 and Business Meeting

Janos Simon continues his reports from Berkeley.

Dick Karp gave a 1-hour invited talk in two parts. First he gave a short version of "Why TCS is important to other sciences." This is a sermon that ought to be delivered to deans, and to non-theory folks in CS.

Highlights:

• TCS points out the computational nature of natural and social processes, and provides a language for their descriptions analogous to Math providing equations that describe phenomena and methods to manipulate equations.
• TCS alters world views in other areas, providing paradigms to understand and to model.
Examples:
• Biology: cell process regulation.
• Learning: (in humans, animals) paradigms, models and algorithms from Computational Learning Theory
• Molecular self-assembly
• Strategic Behavior of companies
• Physics: Quantum Mechanics and quantum computing.
• Math: too numerous to mention, algebra, Fourier techniques, Social sciences: web, data for social interactions and behavior, etc.
Karp also gave examples of where convergence between CS and other areas benefited both disciplines including Belief propagation, error correcting codes and constraint satisfaction.

Janos' Comment: I think it is important to emphasize that CS contributes more than a passive lens: we are not a tool, but a partner.

The second part of Karp's talk was a set of examples from Computational Molecular Biology, illustrating some of the contributions/research accomplishments. He gave 5 examples of "Algorithmic Challenges in Computational Molecular Biology." [Many are associated with Berkeley because Karp is quite familiar with these]

1. Sequencing Genomes. He talked mostly about Myers' contribution to shotgun sequencing. His algorithmic ideas were crucial to the success of the technique, which, against biologists' expectations, became the standard. The technique: extract many random sequences of the genome, of known fixed length, 2, 10, 50, 150 kb pairs. read 500-1000 nucleotides from opposite ends computationally assemble the genome. Myers came up with several tricks the do NOT work for arbitrary sequences, but take into account the domain. Had to use biological knowledge.
2. Parametric Sequence Alignment. The Needleman-Wunsch algorithm (aka dynamic programming) gives best alignment of two sequences (DNA fragments) given scores for matching letters, mismatches, or matching a letter with a space (modeling insertions/deletions). What if these costs are not known? Clever algorithm by Pachter and Strumfels, using polytopes for this parametric problem
3. Inferring Evolutionary History (Daskalakis, Mossel and Roch, STOC 06)
4. Genetic Variation Zhihong Ding, Vladimir Filkov, Dan Gusfield: A Linear-Time Algorithm for the Perfect Phylogeny Haplotyping (PPH) Problem
5. Regulation of Gene Expression (ran out of time)
Some other interesting talks I heard:

Nguyen, Ong, Vadhan: Statistical zero-knowledge from any 1-way function. Omer Reingold liked this. I did too. They prove that 1-way functions (in some sense the weakest crypto assumption) is sufficient for giving a zero-knowledge proof—under new definitions. The usual one is that soundness is statistical (absolute) and zero-knowledge is computational (guaranteed only about polynomially bounded adversaries). This paper inverts the roles, answering a question proposed in 1992. The proof technique is new.

Assaf Naor presented two difficult and impressive results, showing that problems of embedding one metric space in another have a deep mathematical background, and interesting CS applications. The papers have difficult proofs that may well be worth studying.

Santosh Vempala presented a great result (with Laci Lovasz) showing how sampling log-concave functions is related to optimization problems.

I found the results on derandomization, and hardness amplification quite interesting (section 4B) and was convinced of the virtues of smoothed analysis (concurrent section 4A). Ryan O'Donnell gave a beautiful presentation of new nonapproxability results in session 5A, and I learned that honest majority is easier in a quantum setting in session 5B.

Machtey Award for Best Student Paper: Nicholas J. A. Harvey for Algebraic Structures and Algorithms for Matching and Matroid Problems.

Stats: 243 submissions, 71 accepted (~30%), ~220 attendees. By comparison
SODA 387/139 36%
ESA 215/55 26%
STOC 280/78 27%
CCC 85/28 32%

Deadlines: CCC Dec 3, program Chair Peter Bro Miltersen, STOC Nov 20 (no joint submissions to CCC)

STOC 07 will be at FCRC in San Diego. Hotel will cost 129/139 single/double. Dates are June 11-13 for STOC, June 8-16 for FCRC.

FOCS 07 will be in Providence, RI. Program chair is Alistair Sinclair. Location is the new Renaissance Hotel, currently under construction.

FOCS 08 will be in Philadelphia.

Our community received a number of prizes, previously reported here: Jon Kleinberg, Nevanlinna Prize, Danzig Prize to Eva Tardos, and co-winners of the Fulkerson Prize: Manindra Agrawal, Neeraj Kayal and Nitin Saxena, "PRIMES is in P", Annals of Mathematics, Volume 160, issue 2, 2004, Pages 781--793. and Mark Jerrum, Alistair Sinclair and Eric Vigoda, "A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries", J. ACM, Volume 51, Issue 4, 2004, Pages 671--697. Neil Robertson and Paul D. Seymour, "Graph Minors. XX. Wagner's conjecture", Journal of Combinatorial Theory, Series B, Volume 92, Issue 2 , 2004, Pages 325--357. (not really TCS, but we'll claim it!)

There was a new proposal by Umesh Vazirani: let us have no paper proceedings. Instead, let the papers be submitted (perhaps as little as one week before the conference) to a website. This would allow postponing submissions by perhaps 2 months, and would get extra benefits (cheaper registration, text searchable, available everywhere.

A lively discussion ensued. The economics of the proposal are unclear: would it decrease attendance? Should we worry about the economics? Some people like having the paper version at the conference. Making the "electronic proceedings" have page numbers, formatting, etc. would still imply text processing and editing. On the other hand, an electronic version would have substantial advantages. Would IEEE let us do it? What about libraries who paid for the proceedings?

It was pointed out that the IEEE electronic library is unreasonably expensive, and many institutions do not have it. A quick vote was taken, and results were inconclusive. Many wanted a CD with the proceedings. About an equal number of people wanted proceedings with CD as no proceedings and electronic versions only.

Our leaders promised to study the matter.

Finally the NSF report. Bill Steiger is going back to academia, in August. He was warmly thanked for his efforts for the community. I will give a detailed report tomorrow.

1. This is a sermon that ought to be delivered to deans, and to non-theory folks in CS.

Are the slides of Karp's talk available on line anywhere? I'd show them to my dean immediately if they were :-)

2. Seems that one consequence of submitting final versions of papers "perhaps as little as one week before the conference" is that if the paper contains an error, it will only be found by the authors at the last minute. It does happen that after papers get accepted to the conference and they are later found to be incorrect, they are withdrawn in time and not published (e.g. STOC 2006). But it would seem less likely with this new system that they would be found sufficiently in advance. But then again, what's a few more buggy papers?

3. Simply polling those who made it to FOCS clearly introduces a bias--of course they would like to receive a nice big physical conference proceedings.

Meanwhile, those of us who couldn't make it are sitting out here in the cold, waiting to read the papers.

Physical copy or no, please make everything available online as soon as possible!

4. Totally agree with the anonymous at 3.

It is a biased sample. There may be external issues with electronic proceeding (like IEEE approval) but soliciting opinion electronically must be an easier task. Why move to a more complicated problem and not solve an easier self-contained (within the community) problem first.

5. The business meeting vote on prcoeedings compared a number of options of which the most relevant were:

* ststus quo (printed proceedings onsite, IEEE website - currently text searchable - several weeks later)
* printed proceedings + CD onsite, IEEE website later
* online posting + CD only (no printed proceedings, agnostic about IEEE website - not needed?)

There was a strong preference for one of the latter two (about equally divided) over the status quo. This was enough support that the TC will explore both the CD and online posting. (There is enough uncertainty about what we are able to do in this regard that it made no sense to make a binding decision at the meeting.)

Personally, I would love to add earlier online availability (a CD would be good, too) to the existing system rather than replace the printed proceedings.

As Janos said, the main reasons for the preference for elimination of printed proceedings included
* eliminate a lot of the delay between submission and the conference
* not having to carry a printed copy around the conference
* cost

From my budget experience the cost of printed proceedings is dwarfed by the cost of many other items. For STOC it was about $24/copy (which included all of the editorial costs, many of which we still might need to cover). The potential savings from aggressive budgeting on other items such as food, dwarfs this. The reasons for having some early online posting include: * broader availability and lower expense than the expensive IEEE website. * not having to wait the weeks before papers become available online. Some issues: * Indexing, table of contents, front matter, page numbers, etc probably will need to be done anyway so the time between when final version is submitted and when the fully citable version is available may not be completely eliminated. (There are about 3 weeks that the pubisher allocates to these items in the usual schedule.) * IEEE might object to online posting (copyright etc.) * Printed proceedings may be required by some libraries. * Some people prefer leafing through printed proceedings to reading papers online. * Early online availability might reduce conference attendance. * IEEE might not be as cheap as ACM in producing proceedings CDs. (ACM charges about$4/CD after one has paid for the printed run.)
* Who would be responsible for maintaining the server where the online versions would be posted? (The TC does not have such a server, nor does it have a budget to support one.)

It does seem worth exploring these options with IEEE. I think that the sentiment is very strong for having broader online access and a CD format, whatever we choose to do about printed proceedings.