Friday, May 30, 2008

R-O-S-E

Rose Sloan, daughter of UIC theorist Robert Sloan, is one of a dozen finalists in the Scripps National Spelling Bee. The finals will be broadcast tonight on the ABC network at 8 PM Eastern.

Good luck Rose!

Update: Rose lasted until round 11 and ended up tied for fourth. Congratulations!

Final standings here.

Would you buy a math book for $160.00?

(REMINDER: The Complexity 2008 early-registration deadline is June 1.)

I recently needed a copy of Mathematical Gems III by Ross Honsberger since I found out that a problem I was working on has a variant that is in that book. I didn't mind getting a copy since I have some of his other books and they have lots of nice math problems and concepts. When I went to Amazon I found the following website, with one difference: there was also a copy available for $10.00, which I bought. Now there are only two left: one for $164.59 and one for $168.98. At that price I would not have bought it (it's in the library of a nearby school). (NOTE: by the time you read this the price may have been reduced.)

Unless the book contains actual gems, I cannot imagine that it's worth $160.00. On the one hand, it is in hardcover and one reviewer gave it 5 stars. On the other hand, $160.00??? I cannot imagine anyone paying that price for it. There are many good books of this type (e.g., Math Gems I is around $10.00, Math Gems II is around $27.00, and there are plenty of websites of nice math stuff for free), so it is unlikely that someone really needs this particular book that badly. I may be the one who comes closest, since I need the reference, but I wouldn't pay that kind of money. This is in contrast to high-level monographs, which may be the only source on their material. But even that may fade as more and more becomes freely available on the web.

Thursday, May 29, 2008

Completeness Does Not Imply Complexity

I've heard it in discussions at PC meetings, in job interviews, or from someone trying to talk me into attending a seminar: Jane has a complexity result: she shows left-handed 6-SAT is NP-complete.

Sorry, completeness results are algorithms. They are proved by algorithms people using algorithmic techniques. There is a simple-minded view that upper bounds are algorithms and lower bounds are complexity. But that doesn't reflect reality: upper bound results like the PCP theorem or SL=L are complexity. It's not even clear whether a result like "circuit lower bounds imply derandomization" is an upper or a lower bound.

Since algorithmicists, like complexity theorists, lack techniques to prove general lower bounds, they instead use reductions to show that problems are hard. This gives them a two-pronged attack on a problem, where the failure to find a reduction might lead to an algorithm or vice versa. In structural complexity, we see a similar dichotomy between inclusions and relativized separations.

There is no absolute dividing line between algorithms and complexity, but loosely, algorithms deals with specific problems while complexity studies classes of problems based on some computation model with certain resource bounds. So the definition of PPAD and its relationship to other classes is complexity, but the reduction from PPAD to Nash Equilibrium is algorithmic.

Wednesday, May 28, 2008

Gödel Prize

The ACM has just announced that Dan Spielman and Shang-Hua Teng will receive the Gödel prize at ICALP for their paper Smoothed Analysis of Algorithms: Why the Simplex Algorithm Usually Takes Polynomial Time.

Elsevier Happenings

Why do I remain on the editorial board of the Elsevier journal Information and Computation? Partly as loyalty to Albert Meyer, the long-time editor-in-chief, who gave me my first major editorial position. But also because I believe that one can change some of the policies at Elsevier by talking to Elsevier instead of just boycotting them. And we've made some small progress. Elsevier papers are being (slowly) added to search sites like Google Scholar. And Elsevier recently announced a theoretical computer science student package, electronic access to a dozen theory-related journals for $50/year. Likely too little too late in reducing the bad will Elsevier has developed in recent years.

Among the dozen is the oddly-named Journal of Algorithms in Cognition, Informatics and Logic, a sort-of resurrection of the Journal of Algorithms, whose editorial board resigned at the end of 2003. With its new title, manifesto and aims, the journal has moved mostly away from traditional TCS algorithms toward a logic and AI focus. Hal Gabow tells more, including how, without their knowledge, many people from our community, including some previous Journal of Algorithms editors, were mentioned as supposedly connected to this new incarnation.

A similar story happened with the Journal of Logic Programming whose editorial board had resigned in 1999 and whose journal was remade as the Journal of Logic and Algebraic Programming.

The last issue of J. Alg was volume 62 number 2. The first issue of JACIL is volume 62 number 3, so JACIL is officially just a continuation of the Journal of Algorithms. Given the vastly different editorial focus, why not just start it as a new journal? Partly to take advantage of the reputation of the former journal, but also to protect the back catalog, the valuable assets Elsevier holds in the many important papers that appeared years ago in J. Alg and the other Elsevier theory journals.

But even for the theory journals that remain at Elsevier, like TCS, JCSS and I&C, one cannot help but notice an overall decline in the quality and quantity of the articles appearing over the last couple of years. One would hope that those missing strong papers are being sent to journals like Theory of Computing and the ACM Transactions on Algorithms and Computation Theory, and a few have been. But the controversies over journals are causing even greater numbers of authors in theory and throughout computer science not to bother writing journal versions of their conference papers. The main complaints about Elsevier relate to access, but no paper is less accessible than the paper not written.

Update 3/16/09 from the editors of JACIL

The Journal of Algorithms in Cognition, Informatics and Logic is severing its ties to Elsevier and is moving to a new publisher.

This is correct, but the preceding text in your blog gives the wrong impression of the correct standard procedures of setting up a journal that we have followed here. Our starting point was a substantial list of editors (about 50) from the logic and cognition communities who had agreed to be on the board of the new journal. To this we added additional names from the algorithms community and then sent a formal letter of invitation to all names on the list. The letter and attached list were private and we made it clear that these were invited editors, but included those names who had already accepted. This is common practice, and such lists always remain private until the process is complete. Unfortunately someone (most likely from the algorithms community) made it public and it ended up on your web page.

The misunderstandings and grievances generated by these actions need to be corrected, now that our community has withdrawn the journal from Elsevier, as a consequence of a report by one of our members John Lloyd.

We would therefore be grateful, if you append this entire letter to your blog

best
Dov Gabbay and Jörg Siekmann

Tuesday, May 27, 2008

CCC2008/ ACM dissertation awards 2007

Two links of interest:
  1. Complexity 2008 deadline for early registration is JUNE 1- so sign up NOW!
  2. ACM Dissertation Awards. What is of interest is that the winner and two of the three runners-up are in THEORY- the winner and one of the runners-up are in Cryptography
Actually the winners are often theorists, though this year it is more striking. Are theorists producing the best PhDs in computer science? I would not say that; however, for a theory PhD (and perhaps for theory in general) it's easier to tell that something is good work, since we have well-defined problems that can have well-defined solutions. Other areas do indeed produce good work, but it may take time to recognize that.

Friday, May 23, 2008

Tough Math

Back in 1992, Mattel had a controversy on their hands when Teen Talk Barbie said "Math Class is Tough." Today we hear about the difficulty of mathematics in connection with another woman, this one running for president.
"She has fought a very energetic race, but the math just isn't there." (Tim Russert on MSNBC)

"She's mounted an extraordinarily impressive and tough campaign," said Steve Grossman, a Massachusetts superdelegate and pledged Clinton supporter. "The math is tough. Most people think the math is virtually impossible." (Boston Herald)

Obama chief strategist David Axelrod said whichever way the Clinton camp spins it, "the math is the math." (AFP)

The Clintons' War Against the Math (ABC News)

and many more.

What is our beloved field of mathematics doing to poor Hillary? Of course "math" here does not describe the technical delicacies of the field, but rather refers to the fact that in the end the nomination goes to the candidate with a majority of delegates, and given the current delegate count the probability that Obama will not achieve that majority is quite low. The term "math" is also being used as a logical game-stopper—no one can make 1+1=3 no matter how hard they try.

"Math" gets played by the media as a cruel and heartless monster that many believe Hillary Clinton cannot defeat, rather than the the beautiful and ever growing field of knowledge that we love and respect.

Or maybe, as John Dickerson suggests, another scientific field now applies.

The race for the Democratic nomination…now feels like a quantum physics problem: How long can a body exist in a state approximating motionlessness without actually stopping?

Thursday, May 22, 2008

Final STOC Post

James Lee finishes up from Victoria.

Still at the business meeting (with a 15-item agenda), Cynthia explains that another rule of thumb was in place for STOC’08: If a program committee member said "this is my favorite submission," then the paper was marked for acceptance, barring a severe negative reaction from the other committee members.

Next, Bobby Kleinberg tells us about the "TheoryWiki" project, whose goal is to organize a community-wide effort to improve the presence and quality of TCS on Wikipedia. This seems like a fantastic goal. To rant tangentially while I have the chance: Unfortunately, a large contingent of our community recently contributed to the Springer Encyclopedia of Algorithms. There were various area editors who put together a list of possible articles, and then solicited authors to write them. The area editors were not paid. The authors were not paid. On the other hand, the default was for the copyright on all materials to be handed over to Springer, who will create a huge and potentially useful volume, and then sell it at very expensive prices. I agreed to write my article, but I complained quite a bit first. I was told that the process was too far along to change anything. I asked Springer if I could put it on Wikipedia. They said no. Finally, I said: I am writing an article. I am putting it in the public domain. I will give you permission to distribute it however you like; take it or leave it. They took it. Unfortunately, most authors did not make similar deals, and a tremendous amount of time and effort has been wasted (or, in the least, vastly underutilized).

Adam Kalai announces that STOC 2010 will be in Boston (where he and Yael will be joining the newly formed MSR New England). Allan Borodin reads the conceptual manifesto. Despite a lot of dissent expressed in private conversations, Mikkel Thorup is the only one to speak up. He argues that, while new models should be valued, we might also want to appreciate the kind of algorithms that are running on millions of computers around the world right now.


I’ll end with some talks I enjoyed from the rest of the conference:

  • Chris Umans gives a near-linear time algorithm to compute the composition of two multivariate polynomials over finite fields of small characteristic. This leads to asymptotically faster algorithms for factoring univariate polynomials over finite fields. The composition algorithm is inspired by the Parvaresh-Vardy and Guruswami-Rudra codes. (According to Chris, the “small characteristic” assumption has recently been lifted.)
  • Ishai, Kushilevitz, Ostrovsky, and Sahai disprove a well-known conjecture of Mansour, Nisan, and Tiwari by showing that 2-universal hash functions (from n bits to n bits) can be computed by circuits of linear size. Expander codes are the primary tool. (Their ultimate goal is more efficient crypto primitives.)
  • Adam Kalai gave an entertaining talk on "The myth of the folk theorem," joint work with Borgs, Chayes, Immorlica, Mirrokni, and Papadimitriou. The "Folk Theorem" is a collection of results from game theory which describe the Nash equilibria in repeated games, i.e. where the same one-shot game is played over and over. The "myth" of the folk theorem is that finding Nash equilibria in repeated games is easy, and it's true for two players: There is a poly-time algorithm. The authors show that once the repeated game has three players, though, finding a Nash equilibrium becomes PPAD-complete, just as in the one-shot case. The reduction from 2-NASH is pretty simple, and comes with the fantastically apt name of the "Actor-Critic game" (which, in the talk, was played by actors Jason and Nicole, and critic Lance).
  • Shachar Lovett gives an explicit construction of pseudorandom generators against low-degree polynomials over finite fields, improving over the work of Bogdanov and Viola who did this for d=2 and d=3. Lovett uses the sum of 2^d eps-biased generators (these are pseudorandom against linear functions). His analysis involves the Gowers norms, which measure the bias of random “derivatives” of a function. In very recent work, Viola has shown that one need only sum d eps-biased generators. Viola’s work does not use the Gowers norms, and is simply based on the bias of the polynomial to be fooled.
  • Spielman and Srivastava show that every n-vertex graph can be sparsified to a (weighted) subgraph containing only O(n log n) edges, where the sparse version preserves all Rayleigh quotients of the Laplacian up to a (1+eps) multiplicative error. In particular, the weights of all cuts in the sparse version are the same up to 1+eps. They also give a near-linear time randomized algorithm to sample the sparse subgraph. The sampling probability of an edge is proportional to its effective resistance in the electrical network defined by the graph. They analyze the sampling procedure using work of Rudelson on central limit theorems for sums of rank-one matrices. (A toy sketch of the sampling step appears just after this list.)
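For intuition, here is a minimal numpy sketch of the effective-resistance sampling in the Spielman-Srivastava result (my illustration, not their code): it computes resistances via the Laplacian pseudoinverse, which is fine for a toy but is not the near-linear-time method of the paper, and the example graph and sample count are made up.

import numpy as np

def sparsify_by_effective_resistance(n, edges, num_samples, seed=None):
    """Toy Spielman-Srivastava-style sampling: pick edges with probability
    proportional to effective resistance, reweight each pick by
    1/(num_samples * p), and add up repeated picks."""
    rng = np.random.default_rng(seed)
    L = np.zeros((n, n))                      # unweighted graph Laplacian
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    Lpinv = np.linalg.pinv(L)                 # toy only; the paper uses fast solvers
    # Effective resistance of (u,v) is (e_u - e_v)^T L^+ (e_u - e_v).
    R = np.array([Lpinv[u, u] + Lpinv[v, v] - 2 * Lpinv[u, v] for u, v in edges])
    p = R / R.sum()
    weights = {}
    for i in rng.choice(len(edges), size=num_samples, p=p):
        e = edges[i]
        weights[e] = weights.get(e, 0.0) + 1.0 / (num_samples * p[i])
    return weights                            # edge -> new weight in the sparsifier

# Hypothetical example: sparsify a dense random graph down to ~8n sampled edges.
rng = np.random.default_rng(0)
n = 60
edges = [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < 0.4]
H = sparsify_by_effective_resistance(n, edges, num_samples=8 * n, seed=1)
print(len(edges), "edges before,", len(H), "distinct edges after")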
There were many other beautiful results presented. I suggest using the comments section to highlight your favorites.

Tuesday, May 20, 2008

STOC Business Meeting, Part I

More from Victoria by James Lee.

Jeanette Wing, the new director of CISE, kicks off the business meeting with an overview of the funding situation for theory at NSF. I think I discern two clear messages: First, NSF is part of the executive branch, so there is one clear way we can affect the budget this November. CISE has requested a 19.5% funding increase for 2009, with a 25.5% increase requested for CCF. Secondly, the best way to expand the amount of funding for theory is for algorithms people to look for money outside of pure theory venues. The opportunity to do this will hopefully be improved by having Sampath and Jeanette on our side at NSF.

Dick Karp wins the SIGACT Distinguished Service Award, for his tireless dedication to promoting and expanding TCS. Prasad and Harald are given their best paper awards. Then Cynthia gives her report on the program committee, and its decision process.

80 papers accepted out of 325 submitted (that's about 24.6%). Some notable results: Congestion and game theory goes 5/13 (38.5%), and metric embeddings goes 0/13 (0.0%). Before the committee met, they agreed on having a more open mind toward conceptual papers which might be otherwise overlooked because they lack technical depth. The following paragraph was added to the call:

Papers that broaden the reach of theory, or raise important problems that can benefit from theoretical investigation and analysis, are encouraged.
This paragraph has been kept for FOCS'08.

The committee sought to appreciate simplicity as a virtue; no longer "I like the ideas, but the proofs are simple"; instead, "I like the ideas, and the proofs are simple!" I don't know if "They changed the model so as to trivialize the problem" is also replaced by "They changed the model, and now the problem is trivial!" I think responsible analysis of a paper is probably a bit more nuanced.

Later, Madhu Sudan spoke of instances where a well-known problem had an easy solution, and this prevented a journal or conference from publishing it. This is certainly ridiculous, and I have a hard time believing that it's a frequent occurrence (of course, I have about 1% of Madhu's experience). I've seen examples where the community considered it "embarrassing" that the solution was so simple, but not where the paper itself was derided.

Personally, I love the beautiful intricacies of hard, technical proofs. It's like a little universe sprung out of the human effort poured into developing a deep understanding of some problem. There are often recurring characters, a unique language, a sense of history, twists and turns, all mounting towards a resounding conclusion that one only fully comprehends after multiple readings and intense effort. But in our field, the beauty of complexity only makes sense in contrast to our search for simplicity. Simplicity is certainly a virtue.

When I have criticized a paper based on "technical simplicity," it's not because I wish the authors had purposely obfuscated their arguments. Rather, one has to understand the primary goals of a theoretical field: To approach understanding through rigor. What we are trying to understand is computation in all its forms. Toward this end, we often consider idealized versions of problems, and in this respect modeling becomes incredibly important. It comes up in algorithms: What happens if the traveling salesman wants to minimize the average latency, and not the total travel time? And it happens in complexity: What if we allow our constant-depth circuits to have mod gates with composite moduli?

In both cases, we are not confronting the actual problem we want to solve; real-life instances of salesman problems (e.g. satellite movement) probably involve other practical constraints, and (uniform) poly-size circuits can probably do a lot more than AC_0[m]. So often I have to measure the importance of a new model by how it differs technically from the old one. If simple modifications of the old TSP algorithms suffice for the minimum-latency version, it's not clear that we have learned something new (even though one could argue independently that the min-latency version is practically important!). And if AC_0[m] circuits could be simulated in a simple way by AC_0[p] circuits, then I wouldn't think as highly of a paper proving lower bounds against AC_0[m].

Maybe we can be a little more appreciative of the subtlety involved in the reviewing process, and agree that "simplicity is a virtue" is a bit too simplistic to be the motto for a program committee.

Monday, May 19, 2008

STOC Day 1

James Lee reports from Victoria.

STOC 2008 begins. Victoria is a gorgeous city, if a bit sterile. The population feels mostly transient. The people are very friendly, and the streets are very clean. Most academic conversation turns eventually to one of two topics: Outcomes of the hiring season, and opinions on the "conceptual manifesto" (my naming); more on the latter topic in the business meeting post next.

The conference starts off strong, with Ran Raz presenting Anup Rao's optimal parallel repetition theorem for projection games (Anup had visa issues). Anup gives optimal bounds on the rate of decay of the value of a 2-player game repeated in parallel, in the case where the answers of one player determine the unique answer of the other player that causes the verifier to accept (this is the projection property). The decay rate was recently proved to be optimal by Raz, thereby disproving a strong parallel repetition theorem.

A special case of a projection game is a unique game, the topic of Prasad Raghavendra's paper Algorithms and inapproximability results for every CSP?. Prasad is one of our own, a theory student at UW; his paper was co-winner of the best paper award and sole winner of the best student paper award. For a few years now, since the KKMO max-cut paper, it has been suspected that there is an intimate connection between the unique games conjecture and the power of semi-definite programming in approximation algorithms, although it is only recently--in the work of Austrin--that this connection has begun to materialize explicitly. Prasad sets the connection in stone: He gives a general class of SDPs and a generic poly-time rounding algorithm for all of them, such that for any MAX k-CSP problem, the approximation bound achieved by his algorithm is best possible assuming the unique games conjecture. A key technical step involves converting any integrality gap for his SDP to a unique games hardness result. The talk is remarkably lucid and well-paced.
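For readers who want a concrete feel for "solve an SDP, then round," here is a rough sketch of the classic Goemans-Williamson MAX-CUT special case (not Prasad's general algorithm); it assumes the cvxpy package is available, and the 5-cycle instance is just a made-up example.

import numpy as np
import cvxpy as cp

def gw_max_cut(n, edges, seed=None):
    """SDP relaxation of MAX-CUT plus random-hyperplane rounding, the
    simplest instance of the solve-an-SDP-then-round paradigm."""
    rng = np.random.default_rng(seed)
    X = cp.Variable((n, n), symmetric=True)
    constraints = [X >> 0] + [X[i, i] == 1 for i in range(n)]
    objective = cp.Maximize(sum((1 - X[u, v]) / 2 for u, v in edges))
    prob = cp.Problem(objective, constraints)
    prob.solve()
    # Recover unit vectors v_1..v_n with <v_i, v_j> = X_ij, then round.
    w, Q = np.linalg.eigh(X.value)
    V = Q @ np.diag(np.sqrt(np.clip(w, 0, None)))   # rows are the vectors
    r = rng.standard_normal(n)                       # random hyperplane normal
    side = np.sign(V @ r)
    cut = sum(1 for u, v in edges if side[u] != side[v])
    return cut, prob.value

# Made-up example: a 5-cycle, where the SDP value (~4.52) exceeds the best cut (4).
print(gw_max_cut(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)], seed=0))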

The other best paper winner is Harald Raecke, for his work "Optimal hierarchical decompositions for congestion minimization in networks." Raecke shows roughly that, given a graph G, there exists a family of trees such that any multi-commodity flow problem in G can be solved by first routing it in each of the trees (trivial), and then mapping a convex combination of those routings into G. The resulting routing in G has congestion within O(log n) of optimal. The mapping from the tree routings to routings in G is fixed, and in particular independent of the flow instance. This gives an O(log n)-approximate oblivious routing protocol, which is best-possible. His proof is a beautiful and unexpected reduction to the FRT tree embedding theorem. In another quite unexpected move, Raecke shows that his tree decomposition theorem can be used to obtain an O(log n)-approximation for the minimum bisection problem.

I expect that the next post, concerning the business meeting, will be a bit controversial. In other news, Adam Klivans loses $20 for betting that the dessert contains papaya. It was mango.

Sunday, May 18, 2008

Visioning Workshop

James Lee starts his guest posts from Seattle and Victoria.

On Saturday, SIGACT, in conjunction with the Computing Community Consortium, held a workshop on Visions for Theoretical Computer Science. The goal of the workshop was to produce "vision nuggets" about exciting research themes in TCS that could have a large impact in the future. In other words, to craft PR materials that advertise TCS outside the community (most importantly, to funding agencies). Some pre-workshop socializing started off a bit dangerously, with Anna Karlin explaining that Avi Wigderson should saber the champagne since last time she ended up in the emergency room…

     

The visioning began excruciatingly early (certainly before I could see clearly), but it started off with some good news from Sampath Kannan, the new director of the Computing and Communications Foundations (CCF) division at NSF:  We're moving up in the world (or at least in the new NSF bureaucracy tree).  CCF will be restructured into three top-level clusters:
  • Algorithmic Foundations
  • Communication and Information Foundations
  • Hardware and Software Foundations
STOC/FOCS/SODA/CCC-esque theory will fall into the first cluster.  Besides the hopefully inevitable consequences of getting us closer to the root, there were some more subtle ones, e.g. computational geometry and quantum computation will no longer be funded separately from the rest of theory (Sampath was careful to distinguish quantum computation from e.g. quantum information and quantum engineering which don't fall into this cluster).

Then we broke into groups to "brainstorm" the nuggets; the groups were arranged into categories based on nugget sketches submitted ahead of time:  computational complexity, data-centric computing, economics and game theory, natural science, parallel computing/networks/architecture, and security/privacy/reliability.  By lunch time, various nuggets emerged, with potential titles like "Debunking the privacy vs. utility myth"  (followed by an argument about whether this constitutes a double negative and should be replaced by "Bunking the privacy vs. utility reality"?).  Watch the wiki for polished nuggets appearing in the near (hopefully) future.

The workshop was not without controversy, with Leonid Levin and Avi diametrically opposed on the number of nuggets we should be creating.  Leo thought we should have 0 nuggets, since the future of science cannot be mandated by committee.  Avi, on the other hand, treated the nuggets much like crack (the more the better).  At one point, a group wondered "Should we merge these two nuggets into one?" with Avi replying (paraphrased) "But they're so fundamentally important, why not split them into three?"  In the end, we seemed to find a happy medium (especially once Levin realized that our goals were less as "Gestapo" and more as "PR firm").  In the meantime, the view out the window of the UW CSE department provided a calming distraction.



After the workshop, a large contingent of the participants boarded a seaplane for the trip to STOC 2008.  First Rocco Servedio loaded Karp's luggage.  Then he flew us to Victoria.  See you in Canada.

         

Thanks to the organizers:  Bernard Chazelle, Anna Karlin, Richard Ladner, Dick Lipton, and Salil Vadhan for all their hard work in designing a productive and none-too-painful day of workshopping.  Credits to Claire Mathieu for some of the pictures.

Friday, May 16, 2008

Reminder: Register for Complexity 2008

REMINDER: Register for COMPLEXITY 2008. Deadline is June 1 for early registration; however, the earlier the better. Register here.

Why should you go?
  1. If you are a beginning student in theory you should go to see what research is happening. Something you see in a talk may inspire a PhD topic.
  2. If you are a student in theory who already has a topic in Complexity then you should go to see how your topic connects to other branches in Complexity. And to talk to other people who may know stuff about your topic.
  3. If you are a student in theory who is working in algorithms then you should go to broaden your horizons.
  4. If you are NOT a theorist then should you go? Depends- if you want to get into theory or if you have a passing interest then certainly. If NOT then... well, there may be some other reason to go.
  5. If you are an adjunct, postdoc, lecturer, professor, research scientist, or some other category that I can't recall, some of the above reasons still apply to you.
It is likely that you won't follow some of the talks. Realize that just knowing that some area of research is out there is good. A talk can be a good place to find this out and get references. Also, some talks you can skip, hang out in the halls, and meet your fellow theorists.

Thursday, May 15, 2008

The Week Ahead

Neither Bill nor I will be attending the upcoming STOC in Victoria. Have no fear, we have once again enlisted an excellent guest poster to keep you all abreast of the latest happenings.

On Saturday in Seattle, there will be a Visioning Workshop with two goals.

  1. Identify broad research themes within theoretical computer science that have potential for a major impact in the future, and
  2. Distill these research directions into compelling "nuggets" that can quickly convey their importance to a layperson.
We have workshops like this every now and then (remember Portland?), and it is good for our community to occasionally step back and make the case for theory, both to attract good researchers and funding, but also to ourselves so we don't lose sight of the basic reasons why we do what we do.

At the STOC business meeting, Borodin will discuss his co-authored letter about conceptual contributions that has already appeared on Scott's blog. Nobody seriously argues against papers with important conceptual points; rather, the problem is that STOC and FOCS have gotten to the point where they accept only a fraction of the strong papers in a given year, difficult decisions have to be made, and it is much easier to recognize a strong technical paper than a strong conceptual one. Still, both the STOC and FOCS 2008 committees are fighting back with the new line added to the call.

Papers that broaden the reach of theory, or raise important problems that can benefit from theoretical investigation and analysis, are encouraged.
We can only recognize true conceptual greatness when it stands the test of time, which conflicts with computer science's deadline-driven conference system. Something has to give.

Wednesday, May 14, 2008

Surveyed to Death

So far in May I have gotten requests to fill in surveys for Consumer Reports, industry research in IT management, a hotel I recently stayed in, new Lyric Opera dining options and, at Northwestern: Internal Communications, the International Office, the course management system, library space planning, research computing needs and dealing with prospective grad students. The Internet, particularly sites like SurveyMonkey, makes surveys very simple to create and distribute. Each survey promises to take only a small amount of my time and in some cases (like Lyric Opera Dining) I actually care about the outcome. But since I get constant requests, I tend to skip nearly all such surveys.

If you ask the average person on the street which is more accurate, a random sampling of 1200 people or an online survey open to all, most will (incorrectly) say the latter. On-line surveys suffer from statistical skewing—they only measure people who take the time to fill out surveys. And as people like me get inundated with requests, the only ones who fill out surveys are people with strong opinions about the topic or those with too much time on their hands, and the results of these surveys will be a quite poor reflection of reality.
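To see how badly self-selection can skew things, here is a quick simulation under the made-up assumption that supporters of some proposal are five times more likely to answer an open survey than opponents are:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 30% favor some proposal, but supporters are
# assumed 5x more likely to bother responding to an open online survey.
N = 1_000_000
favors = rng.random(N) < 0.30
respond_prob = np.where(favors, 0.10, 0.02)

# A random sample of 1200 (assume everyone sampled actually answers).
sample = rng.choice(N, size=1200, replace=False)
print("random sample of 1200:", favors[sample].mean())          # close to 0.30

# The open survey: only the self-selected respond, and there are far more of them.
responders = rng.random(N) < respond_prob
print("open survey,", responders.sum(), "responses:", favors[responders].mean())  # roughly 0.68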

If every survey writer only sent their surveys to a small randomly selected group of people, then each of us would have very few surveys to fill out and could take the time to do so. But since we can't expect surveyors to act so responsibly, nearly all surveys will suffer. So don't bother with the surveys. Open up an on-line suggestion box, a message board or a blog and get the discussion going. Use words instead of meaningless statistics to guide your decisions.

Tuesday, May 13, 2008

The problem with making websites

In my last post I had as a side comment that I maintain a website of applications of Ramsey Theory to Computer Science. One of the comments pointed out two papers that are not on it (but will be soon) and someone else said he had used it. GREAT on both counts!

When making a website of applications of Ramsey Theory to Computer Science, or a website of satires of Bob Dylan, or a website of Funny Math Songs (coming soon), or a website of famous people known only by one name (hoping someone else does this, but I have a pretty good list), or any list at all, one encounters various problems:
  1. What is an Application? What is Ramsey Theory? What is Computer Science? These are not important questions, but when making a list they need to be answered. For example, does using Gowers's techniques, which have been used in Ramsey Theory, count? Probably yes, since most people looking at my website on applications of Ramsey Theory will care about that. Do I count papers that use computer programs to find Ramsey Numbers (or VDW numbers or...)? I have not; that's not really an application. What about computer science papers (hmmm- how do you define that?) that give constructive lower bounds on Ramsey Numbers? (I have begun a website on Constructive Ramsey Numbers that makes no pretense of being close to complete.) If you include too much you lose coherency. Better indexing might help, but I don't have that much time to spend on this. (I'd have more if I didn't do this blog :-).)
  2. What is a Bob Dylan Satire? If Bob Dylan sings it, then can it be a Bob Dylan satire? (Yes). If William Shatner sings Mr. Tambourine Man very badly, and it's funny, but he did not intend it as satire, is it a satire? (Yes). If someone just sings incoherently but it's not funny, is it a Dylan satire? (No). Here my criterion is mostly Do I find it funny?, or Is there some other reason to include it? But in the end it's my call and might be arbitrary at times. Fortunately, in this one area, I may be the world's leading authority, so the answer might be If Bill Gasarch says it's a Dylan Satire, then it is.
  3. Funny Math Songs- When I get around to this one I will use the Is it funny? criterion. Otherwise you are stuck with lots of stuff that uses math very tangentially. For example, in Bob Dylan's song Tangled Up in Blue he has the one line Some are mathematicians, some are carpenters' wives. One line does not a math song make. Also, there are some songs about computers being hard to use (the best one: Where's the Service by The Pheremones). I would not include those. Should I include them in funny songs about computer science? No- it's not science. But it is funny. Alas- so many websites to make, so little time.
  4. More generally, when making a list you need to balance the need to be complete with the need to be coherent. And many unimportant questions need to be answered, such as What is an application? This may help sharpen your mind and teach you things, but it can also drive you into pointless arguments with Dylan fans.

Monday, May 12, 2008

What is an application?

What is an application?
  1. When I took Algebraic Topology the professor said at one point I will now show you an application of homotopy theory, at which point the one physics major taking the class woke up and said An application! Finally! Is it an application to quantum field theory? The professor said No, we will use homotopy theory to show that every polynomial with complex coefficients has a complex root. The physics student went back to sleep. (Short sketch of proof: Using homotopy theory you can show that the complex plane and the punctured complex plane (remove the origin) are topologically different- the former has trivial homotopy group, while the latter has homotopy group Z. Therefore there is no `nice' map between them. If there were a poly p(z) with no roots then you could use it to get a nice map between the two.)
  2. When I took Ramsey Theory the professor said at one point I will now show you an application of Ramsey theory, at which point the one physics major taking the class woke up and said An application! Finally! Is it an application to quantum field theory? The professor said No, we will use Ramsey's Theorem to show that, for all m, there exists an n so that, for all sets of n points in the plane, no three collinear, there exist m that form a convex m-gon. The physics student went back to sleep. (Short sketch of proof: Let n be the 3-hypergraph Ramsey number such that for any 2-coloring of the 3-sets of [n] there is a homogeneous set of size m. Given the n points in the plane, color sets-of-three as follows: if the number of points in the triangle formed by the 3 points is ODD then color it RED, otherwise BLUE. There will be m points such that every set of 3 has the same parity inside it. One can show that these m points form a convex m-gon. First step of this proof: if one of the points is inside the convex hull then it's inside a triangle formed by three of the other points. NOTE1: Much better bounds are known. NOTE2: Finding the smallest n is called the Erdos-Szekeres problem or the happy ending problem. See this paper for a survey.) (A toy illustration of the coloring step appears after this list.)
  3. I have a website of applications of Ramsey Theory to Computer Science. One of the first entries was Yao's paper Should tables be sorted?. This paper shows that in the Cell Probe Model, if the universe is big enough then yes indeed, tables should be sorted. (Short sketch: Assume there is a scheme that, given n elements of the ordered universe U, stores them in an array of n cells. Let the universe U be of size the n-hypergraph Ramsey number such that for any n!-coloring of the n-subsets of U there is a homogeneous set of size 2n-1. Color an n-subset of U by the permutation it is stored in. There will be 2n-1 elements such that any subset of n of them is stored in the same permutation. Assume that it is SORTED (if not, apply a fixed permutation to make it SORTED). One can show that if the list is sorted then binary search is the best way to find an element. See this paper for a survey.)
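Here is a toy Python illustration of the coloring step in item 2 (the example points are my own and are not part of the proof): each triple gets RED if an odd number of the remaining points lie strictly inside its triangle, BLUE otherwise.

from itertools import combinations

def strictly_inside(p, a, b, c):
    """Is p strictly inside triangle abc? Assumes no three points are collinear."""
    def cross(o, x, y):
        return (x[0] - o[0]) * (y[1] - o[1]) - (x[1] - o[1]) * (y[0] - o[0])
    s1, s2, s3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    return (s1 > 0 and s2 > 0 and s3 > 0) or (s1 < 0 and s2 < 0 and s3 < 0)

def color_triples(points):
    """RED if an odd number of the other points lie inside the triangle, else BLUE."""
    colors = {}
    for a, b, c in combinations(points, 3):
        inside = sum(1 for p in points
                     if p not in (a, b, c) and strictly_inside(p, a, b, c))
        colors[(a, b, c)] = "RED" if inside % 2 == 1 else "BLUE"
    return colors

# Hypothetical example: four points roughly in convex position plus one near the middle.
pts = [(0, 0), (10, 1), (9, 11), (1, 10), (5, 5)]
for triple, color in color_triples(pts).items():
    print(triple, color)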


So, are these applications or not? The first one applies topology to algebra. The second one applies Ramsey Theory to the Erdos-Szekeres problem. The third applies Ramsey Theory to Data Structures.

The first and third seem like legit applications. The second one is suspect- applying one Erdos-style branch of combinatorics to another. But they are different branches. One metric of how legit an application is might be how far apart the fields are.

Friday, May 09, 2008

Teaching Parallelism

Uzi Vishkin wrote these ideas on how to teach a parallel computing course as a comment on my earlier parallelism post.

The basic claim is that:

  • It does not make sense to have a new platform of general-purpose parallel computing succeed the established serial platform without having a one-to-one match of EVERYTHING, including algorithms and data structures.
  • In particular, it does not make sense to teach parallel programming without teaching parallel algorithms and data structures. The gap between programming and algorithms must be bridged, so that the continuum from algorithms and data-structures to programming will resemble as much as possible the continuum in serial computing.
  • Since the PRAM theory is the only serious candidate developed in nearly 3 decades of research, PRAM algorithms have got to be taught.
I expect theorists to endorse this argument and use it to convince their colleagues that PRAM algorithms need to be taught. But, I have to be frank. I am concerned that some of us will do the following: teach a course on parallel algorithms as a purely theory course WITHOUT any connection to programming. This will miss the point as it ignores the need to relate algorithms to programming. The Q&A at the end of this text elaborate further on the programming issue.

As others have implied, you can find several fine sources for PRAM algorithms. For this reason, my comments below mostly focus on a way to address the parallel programming issue:

  1. In class presentation.
    1. Read Section 2.1 entitled XMTC in FPGA-Based Prototype of a PRAM-On-Chip Processor. It reviews a modest extension to the C programming language called XMTC that allows PRAM-like programming. XMTC essentially adds only 2 basic commands to C: Spawn and PS (for prefix-sum). (A toy illustration of the spawn/prefix-sum pattern appears after this list.)
    2. Devote a total of around 15-20 minutes similar to slides 37-39 in these slides to present XMTC. Slide 40 can guide a discussion.
  2. Supporting documentation. The students should then be referred to: the XMTC Manual and the XMTC tutorial.
  3. Programming assignments. Please look up under assignments on this course page.
  4. Running programming assignments. The UMD PRAM-On-Chip project is on track for public release by the end of June 2008 of:
    1. a cycle accurate simulator of the PRAM-On-Chip machine, and
    2. a compiler from XMTC to that machine.
    This will allow your students to run XMTC code on an emulated 64-processor PRAM-On-Chip machine. To remind you, a hardware prototype of such a machine (using FPGA technology) has been in use at UMD since January 2007. A compiler that translates XMTC to OpenMP will also be released, giving your students an alternative way to run their assignments.
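As a rough illustration of the spawn/prefix-sum pattern mentioned in item 1, here is a serial Python sketch of the canonical array-compaction example; real XMTC syntax differs (see the manual above), and spawn and ps are only simulated here.

# Serial mimic of the XMT-style spawn + prefix-sum pattern for array
# compaction: copy the nonzero entries of A to the front of B. In XMTC each
# loop iteration below would be an independent spawned thread and ps() would
# be the hardware-supported atomic prefix-sum; here both are simulated, so
# this only illustrates the logic, not the parallelism.

def ps(increment, base):
    """Stand-in for the atomic prefix-sum: return the old base value, then add."""
    old = base[0]
    base[0] += increment
    return old

def compact(A):
    B = [None] * len(A)
    size = [0]                       # shared prefix-sum base
    for i in range(len(A)):          # 'spawn(0, n-1)' in XMTC; iteration order must not matter
        if A[i] != 0:
            j = ps(1, size)          # each "thread" claims a distinct slot of B
            B[j] = A[i]
    return B[:size[0]]

print(compact([0, 7, 0, 3, 5, 0, 2]))   # [7, 3, 5, 2] here; order may vary in parallel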
Finally, please note that this type of programming cannot be too difficult. I gave a 1-day parallel algorithms tutorial to a dozen high school students in Fall 2007 and subsequently some of them managed to do 8 programming assignments on their own. In fact, the above link to programming assignments gives these 8 programming assignments. The only help the high school students got was one office hour per week with an undergraduate teaching assistant. They did not get any school credit for their work. Their participation was in the context of a computer club, after completing their regular school work (8 periods per day).

If you are looking for code examples, you are welcome to write to me.

Here are some Q&A:

Q: I never learned parallel programming formally, but I picked up some ideas in my free time from Java/MPI/OpenMP/etc. How do any of these relate to XMTC parallel programming?

A: XMTC parallel programming is simpler and different.

Q: The problem of algorithms being taught independently of programming is present within the exclusively serial world. What would you say to the many theorists who are resistant to the idea of having a heavy programming component in their courses?

A: IMHO the serial case is completely different. Most students have experienced/learned serial programming BEFORE taking the serial algorithms course. This is NOT the case for parallel programming. My experience is that students learn best if parallel programming is coupled with parallel algorithms. The main difference is that the parallel algorithms course is where parallel programming should be FIRST taught. The reason is that parallelism requires introduction of some first principles representing an "alien culture" to students. In contrast, serial computing is related to: (i) mathematical induction, (ii) the way our brain instructs our body (as a single processor), etc. There is nothing out there that prepares us for parallel computing.

Q: What text do you use for teaching parallel algorithms?

A: I have been using my class notes.

Warm thanks to Adam Smith and Aravind Srinivasan for their helpful comments on an earlier draft of this text.

Thursday, May 08, 2008

Electronic Commerce and Prediction Markets

Registration has opened for the upcoming ACM Electronic Commerce Conference (of which I am general chair) in Chicago July 10-12. The conference is immediately followed by AAAI and The Third World Congress of the Game Theory Society, both also in the Chicago area. Before the EC conference is a series of workshops and tutorials covering topics from on-line advertising to social networks.

One of those workshops covers an area that has excited me for several years now, The Third Workshop on Prediction Markets. Prediction markets aggregate information quite efficiently in ways we don't yet fully understand, and they remain a fertile area of study. Legal limitations on betting have restricted the applications of prediction markets, particularly in the US, but that might change soon. The Commodity Futures Trading Commission (CFTC) is asking for public comment on regulations of prediction markets. Their concept release gives a nice discussion of the legal issues. Will this lead to more legitimate real-money markets in the US? Time will tell.

More from Pennock and Masse.

Wednesday, May 07, 2008

Any Questions?

A speaker in a seminar loves to get questions during the talk, for this means that at least one person is trying to follow the talk. A talk with no questions means everyone is either completely following the talk or is completely lost, most likely the latter.

Each question though involves three parties: the questioner, the speaker and the rest of the audience. A good talk has a certain rhythm and questions can disturb that rhythm. So how does the audience feel about the questions? Depends on the question.

  1. Questions that clarify the model or some aspect of the proof. We need these questions to properly follow the talk. When others ask these questions, I learn that I really hadn't understood the model when I had thought I had.
  2. Questions that argue against the model or results. Usually entertaining but can often degenerate into a long argument. The host needs to become a moderator and has to give one of those one-time nerd jokes that have become standard lexicon: "Take this discussion off-line."
  3. Questions that point out mistakes. Usually annoying and serving no purpose unless, of course, they take down the whole proof.
  4. Questions that prove how smart the questioner is. The most annoying. I cringe whenever I hear a question starting with the word "So".
At the end of the talk the questions usually suggest various extensions to the work that can often go on forever. Most of the audience just wants to escape but is too polite to leave. The host again needs to end the discussion. Having food in another room to continue the discussions in can help immensely.

Tuesday, May 06, 2008

Vanished from the web- of more interest...

In my last post I told of a Master's Thesis that vanished from the web, into the night. I suspect I am one of the few people who wants a copy, hence this incident will not attract any wider attention. This is in contrast to today's tale of vanishing.

A while back a talented fellow named Kevin Ryan recorded and put on the web Dylan Hears a Who, which was 7 Dr. Seuss stories sung in Dylan style. (Since I own what is probably the largest collection of Bob Dylan satires in the world-- 127 satires and an additional 14 songs that I don't count as satires but others do--- this was a must have.)

Kevin Ryan got a cease-and-desist order from the Dr. Seuss people to remove it, and he did, as you can see here. One version of the story, which seems correct, is in this article.

One moral of the story: If you find SOMETHING online that you may want to keep, DOWNLOAD IT. Do NOT depend on it still being there later. But there is a different issue here:

Is what the Dr. Seuss people did legal? I do not know. Is what the Dr. Seuss people did moral? I do not know. Is what the Dr. Seuss people did stupid and against their own interests? Yes. I can picture someone hearing Dylan Hears a Who and going out and getting some Dr. Seuss books. I cannot picture someone hearing it and therefore not getting some books. Businesses need to develop different business models for the e-world in which we live. For example, they might have worked out a deal where a link to purchase Dr. Seuss books appears on that same website, and/or an advertisement. There are likely other possibilities. For more on this, read the book Wikinomics, which I might blog about at some later date.

Monday, May 05, 2008

If you find something online download it NOW

A few months ago I was looking into some of the origins of Ramsey Theory (in particular I was looking at what Hilbert needed Hilbert's Cube Lemma for) and I came across the following online
Combinatorial Number Theory: Results of Hilbert, Schur, Folkman, and Hindman by Yudi Setyaan. A Thesis submitted in partial fulfillment of the requirements of the defense of Master of Science in the Department of Mathematics and Statistics. Simon Fraser University, July 1998
I printed it out and still have it. It was not helpful for what I wanted, but it was interesting and I'm glad to have it.

Recently I wanted to email it to someone else so I searched for it again. It's gone! Now you have to pay for it at Amazon. I also looked for the author online to see if he might email me a copy (I doubt he gets any money from it and I suspect he would be delighted to find out that someone actually read it). I couldn't find the author's email address, though I am hopeful that I will.

Moral of the story: If you find a document on line that you may want to keep, DOWNLOAD IT. Do NOT depend on it still being there later.

Friday, May 02, 2008

Report on Sym for Lipton's 60th bday (guest post Ken Regan)

(Guest Post by Ken Regan)

A two-day symposium in honor of Richard J. Lipton's 60th birthday was held April 27--28 in the brilliant new Klaus Advanced Computing Center at Georgia Tech. The wide variety of talks influenced by Dick Lipton's ideas attested to his ability to say something deep about many subjects. Deep and still keeping to a dictum of Hilbert featured on one speaker's slides: the simple attracts. An example referenced in many talks was his ("one-paragraph") proof of the self-reducibility of the permanent (Wikipedia entry). Another referenced his co-authorship of a March 2008 report to the Georgia Secretary of State with recommendations and advisories on electronic voting.

The first talk, by Richard Karp, carried the message that problems of the Hitting-Set kind are most often easier in practice than their worst-case NP-complete pedigree leads one to expect. This was supplemented by Neal Young's second-day talk involving Lipton's question, "Is it hard to generate random hard instances of NP-complete problems?" Ravi Kannan spoke on the practicality of finding good approximate equilibria for non-zero-sum games and market situations. Dan Boneh showed the extent to which even certified-sound cryptosystems become vulnerable when keys k are used to encrypt data that overtly contains k, or when there are closed cycles of encodings of multiple keys. He also explained how Lipton's kung-fu wielding of the Chinese Remainder Theorem in attacks on RSA and kin slowed web servers by 8%. Anita Jones surveyed the urban landscape of computer security, and propounded the sequence of system calls made by a program as a signature by which to identify malware.
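For readers who haven't seen those CRT attacks: the Boneh-DeMillo-Lipton observation is that a single computational fault in the mod-p half of an RSA-CRT signature lets anyone factor the modulus with one gcd. A toy sketch, with deliberately tiny, made-up parameters:

from math import gcd

# Toy RSA key (insecure sizes, illustration only; pow(e, -1, m) needs Python 3.8+).
p, q = 10007, 10009
N, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def sign_crt(m, fault=False):
    """RSA signing via the Chinese Remainder Theorem; optionally flip a bit
    in the mod-p half, modeling a hardware glitch."""
    sp = pow(m, d % (p - 1), p)
    sq = pow(m, d % (q - 1), q)
    if fault:
        sp ^= 1                              # the injected fault
    h = (pow(q, -1, p) * (sp - sq)) % p      # CRT recombination (Garner's formula)
    return (sq + q * h) % N

m = 31415926
s_bad = sign_crt(m, fault=True)
# The faulty signature is still correct mod q but wrong mod p, so the gcd
# below pulls out the factor q and the key is broken.
print(gcd(pow(s_bad, e, N) - m, N) == q)     # True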

Michael Rabin described a practical implementation of zero-knowledge protocols for high-stakes auctions, one point being efficiency gains from coding on the arithmetic rather than going all the way down to Yao's oblivious ZK circuit verification. Nisheeth Vishnoi presented a polynomial-time algorithm that distinguishes the "Yes" case of the "Unique Games Conjecture" from a tighter "No" case asserting also that the underlying graph is an expander (paper). This evinces a surprising difference between "Unique Games" and non-unique Constraint Satisfaction Problems, and suggests that if the UGC holds at all, any proof must differ widely from traditional proofs establishing hardness of CSPs.

Erik Winfree's multimedia talk "Is DNA Computing?" demonstrated that whatever one feels about the feasibility of large-scale DNA computers (on which Lipton followed Adleman's first paper with a neater formulation), DNA activity is undeniably computational.

Giovanni diCrescenzo talked on Lipton's idea of storing keys not as small files but parceled among huge hunks of stored data, so that intrusion attempts to recover them leave huge footprints, applying hashing and extractors to implement it. I surveyed the frontiers of super-linear lower bounds, including the Lipton-Tarjan separator theorem's relevance and Dick's part in time-space tradeoffs for SAT, and used Allender-Koucky's observation as a segue to super-polynomial lower bounds. I outlined my position that "Very High Degree" methods are capable of surmounting known barriers and may be necessary.

Dick's longtime friend and associate Richard DeMillo surveyed Lipton's contributions to fundamental problems of software correctness. Wenke Lee focused on how 'bots have opposite behavior and intent to viruses, and how the company Damballa founded by Merrick Furst, him, David Dagon, and Lipton combats botnets.

Parikshit Gopalan presented new ideas on the lower bound frontier of ACC[m] for composite m, and continued a running gag on the power of Chinese remaindering. But Avi Wigderson planted a Monty Python foot on further progress by demonstrating that almost all known complexity-class results preserve themselves under an arithmetical form of relativization, while P vs. NP and most other frontier relations do not. Memo to new graduate students honing their technical abilities by building oracles A separating classes C from D: the ante just got upped to building both A and a low-degree extension A' such that C^A is not contained in D^{A'}, so then proving C contained in D would require "non-algebrizing techniques". And various vice-versas...all of which tighten the rules of equation-solving needed to build A. One sentence of hope remained on Avi's slides, actually due to Scott Aaronson: methods that go beyond treating (feasibly-constructed) multilinear extensions as black boxes can possibly evade the new barrier. Jin-Yi Cai closed by presenting a "Holographic" algorithm toolkit of subversive power, whose steep learning curve is helped by its affinity with quantum computation.

On the learning-curve subject, I came away with the impression that although it is often higher for cryptographic protocols than for algorithms, more of the former actually get implemented. Of course, Karp has always stood for implementing algorithms, and Neal Young reported on how his algorithm beats simplex for its target domain, but I'm just reporting my positive impressions of security applications from the two days. In all it was a rich meeting, with "not too many embarrassing stories" for Dick and lots of energy.

Thursday, May 01, 2008

The ID Conundrum

I called human resources at Northwestern with a question about health insurance. After she had trouble tracking me down in the system, we discovered Northwestern had the wrong social security number for me. Northwestern takes great care to hide my SSN, using an employee number on my faculty ID card and allowing me electronic access to my paycheck with nary a social security number in sight. Northwestern would probably not use my SSN at all except that they need it to report taxes, which would have caused all sorts of havoc had I not, by pure luck, caught the mistake.

That's the problem with the social security number. We've become so scared of using the SSN that the number has become useless. What we need is a unique public ID that we are not afraid of using. In my ideal world, we would all have a public and a private ID. With someone's public ID I could call them, text them, IM them, email them, even send them postal mail by simply writing their ID number on the envelope, and the post office's computers would know how to route the mail. People would use their private ID to log onto some central server to set the places that the public ID points to, as well as to block certain users and deal with privacy restrictions.
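A toy sketch of the kind of directory service described above (the class, method names and example address are all hypothetical, just to make the public/private split concrete): the public ID resolves to whatever endpoints its owner registered, and only the holder of the matching private ID can change them.

import hashlib
import secrets

class IDDirectory:
    """Toy central directory: public IDs resolve to endpoints; updates require
    the private ID, which the server stores only as a hash. Illustration only."""

    def __init__(self):
        self._records = {}   # public_id -> {"auth": ..., "endpoints": ..., "blocked": ...}

    def register(self):
        public_id = secrets.token_hex(8)
        private_id = secrets.token_hex(16)
        self._records[public_id] = {
            "auth": hashlib.sha256(private_id.encode()).hexdigest(),
            "endpoints": {},
            "blocked": set(),
        }
        return public_id, private_id

    def set_endpoint(self, public_id, private_id, channel, address):
        record = self._records[public_id]
        if hashlib.sha256(private_id.encode()).hexdigest() != record["auth"]:
            raise PermissionError("private ID does not match")
        record["endpoints"][channel] = address

    def resolve(self, public_id, channel, requester=None):
        record = self._records[public_id]
        if requester in record["blocked"]:
            return None
        return record["endpoints"].get(channel)

directory = IDDirectory()
pub, priv = directory.register()
directory.set_endpoint(pub, priv, "email", "someone@example.org")
print(directory.resolve(pub, "email"))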

You could use your public and private ID to log onto all your services so you don't need to keep separate accounts on various webpages. Both Northwestern and the University of Chicago have single electronic IDs and passwords to access email, benefits and wage information, course information, get wifi access and much more.

Now that people avoid using the SSN as a public ID, cell phone numbers and email addresses are beginning to play that role. Privacy advocates have slowed down efforts to create a public ID for a variety of reasons. But the great need for an ID means the market will start using whatever it has available, and isn't it better to carefully design a proper public/private ID system than to have some ad hoc market-driven system instead?