Tuesday, April 27, 2010

Trading Money for Computation

Since the beginning of complexity theory we have talked about time complexity t(n) as a function of the input size. But it is the inverse of this function that we really care about: we have a roughly fixed number t of computation steps available on our computer, and we want to know the largest problem we can solve within that running time. The value t has increased dramatically under Moore's Law, and we care about how much larger a problem we can solve with our larger running times.

You could always get around the limited time with a large investment in additional computer hardware, but that rarely made sense just to run a single algorithm. In several (non-theory) talks I've seen recently, though, speakers describe solving their large problems by renting time on Amazon's servers. You can now cheaply buy computing time in the cloud from Amazon or Microsoft, and soon from Google.

Cloud computing doesn't make all computing cheap. One still needs good parallelization, since the problem gets run on many servers with limited communication between them. And all the servers in the world won't beat out exponential growth in running time; the P versus NP problem remains a problem. Nevertheless we need a good formal framework to study these problems.

Computational complexity has in the past adapted well to new computation models from the PRAM to biological and quantum computers. But we are seeing new computing paradigms in multicore and cloud computing and theory seems late to the party. There was a nice SODA paper on MapReduce, the basic cloud computing operation, but for the most part theorists haven't tackled the cloud computing model and only a few have looked at multicore. Theory can say much about new computational methods, both in how we can take advantage of them and what they can't do, but only if we make the effort to develop the proper models to capture these new approaches.
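The MapReduce pattern mentioned above can be sketched in a few lines. This is a hypothetical single-machine toy for illustration, not Hadoop or any real cloud API: a map phase emits key-value pairs, a shuffle groups them by key, and a reduce phase combines each group. (In a real deployment each phase runs across many servers, which is exactly where the communication costs the post worries about come in.)

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc):
    # Mapper: emit a (word, 1) pair for each word in one document.
    return [(word, 1) for word in doc.split()]

def reduce_phase(pairs):
    # Shuffle: group values by key, then reduce each group (here, by summing).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}

docs = ["the cloud", "the grid and the cloud"]
counts = reduce_phase(chain.from_iterable(map_phase(d) for d in docs))
# counts == {"the": 3, "cloud": 2, "grid": 1, "and": 1}
```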


  1. But we are seeing new computing paradigms in multicore and cloud computing and theory seems late to the party.

    I see and hear statements like this from time to time, and they are making me more and more confused. Do people really think PODC, DISC and SPAA are systems conferences? At this point, what I hear when someone says something like that, is, "I don't recognize 'The Art of Multiprocessor Programming' as theory, because it's not written using formalisms that I am familiar with." Mainstream complexity might be showing up late to the party, but theorists have been studying models -- including "cloudy" models (though under different names) -- since at least the mid-1980s.

    In addition to any hard science issues, such as difficulty understanding the language of a subfield that has evolved along a different literature track, I've noticed an "attitude barrier" that might be hindering complexity's entrance into the cloud. For example, I guest-blogged about PODC 2009 on Michael Mitzenmacher's blog, and noted that the Best Paper of the conference had been rejected from STOC. An anonymous commenter noted that this seemed standard -- a paper doesn't get into a top conference, so gets resubmitted to a second-tier conference, and gets in. That is, I suppose, one possible explanation, but there are so many others, including the fact that the paper had extremely involved proofs, and it might not have been a good fit for the likely attendees at STOC. To whatever extent "different formalisms" is kneejerk-translated into "less significant concepts and techniques," that will hold back both STOC and complexity theory.

    And all the servers in the world won't beat out exponential growth in running time. The P versus NP problem remains a problem.

    I guess my response to this is, "Yes and no." AI practitioners work on solving NP-complete (or worse) problems all the time, and the formal verification community solves PSPACE-complete (or harder) problems all the time. It's a big deal in some fields to have an algorithm that is "almost tractable." The ability to parallelize on a grander scale will make a lot more problem instances solvable, even though P/NP is still real. On top of that, I spend most of my research time thinking about biomolecular computation, where maybe each "core" operates far slower than a microprocessor, but there are 100 million "cores" available to process on. (Presenting input to the cores, and reading output, is another story altogether.) What becomes almost-tractable in a setting like that? I don't know, but I bet complexity theory already does -- or quickly would, if more mainstream theorists took distributed computing more seriously.

  2. Can someone explain the difference between "multicore" and "parallel"? i.e. can't we use parallel algorithms for multicore machines? Or is multicore something different?

  3. This is a great topic (IMHO) in part because it points to emerging links between math, science and engineering that are natural (to use Dick Lipton's word-of-the-week).

    Let's ask (for example): What dynamic/informatic systems can we pullback into the cloud naturally?

    Well, from the viewpoint of category theory, it turns out that mathematicians, scientists, and engineers all use pretty much the same approach.

    Step I: Shrink the dynamical space (using techniques variously called 'thermostats', 'Lindblad processes', 'sparsification', 'order reduction'); these techniques can be viewed as based on common grounds in compressive dynamics/sensing.

    Step II: Localize the dynamics (using techniques variously called 'preconditioning', 'effective Hamiltonians', 'molecular potentials'); these techniques can be viewed as based on common grounds in localization theory.

    Step III: Pullback the dynamics into the cloud and integrate trajectories there (using techniques variously called 'simulated annealing', 'trajectory-based simulation'); these techniques can be viewed as based on common grounds in the (emerging) theory of cloud computation.

    Step IV: Pushforward the integrated trajectories back to reality (this step is the all-important "user interface").

    This point-of-view, that cloud computing is all about the three-step sequence (1) pullback-to-the-cloud, (2) integrate-efficiently, (3) pushforward-to-the-user, is nice because it integrates naturally with category theory ... so that we can describe broad classes of cloud computation in terms of a shared mathematical foundation.

    Hey, that's what systems engineering is all about! :)

  4. Anon #1:

    The difference between the two is slight, but important. For one, parallel machines have a much higher communication overhead. But the really big difference is that the cores of a multicore machine have access to the exact same memory, and need to synchronize access to it. With a parallel machine it's very possible that machine A has data A but not B, and machine B has data B but not A.

    Anyone else want to expound?
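    One way to make the distinction above concrete is a toy Python sketch (the thread counts and workloads are illustrative, not tied to any real machine): in the shared-memory (multicore) style every worker touches one counter and must lock it, while in the message-passing (distributed) style each worker keeps private state and only communicates its result.

```python
import threading
import queue

# Shared-memory style (multicore): all workers see the same counter
# and must synchronize access with a lock.
counter = 0
lock = threading.Lock()

def shared_worker(n):
    global counter
    for _ in range(n):
        with lock:            # synchronization is the hard part here
            counter += 1

# Message-passing style (distributed): each worker owns its own data
# and communicates only by sending a message at the end.
def distributed_worker(n, outbox):
    local = 0                 # private state; no lock needed
    for _ in range(n):
        local += 1
    outbox.put(local)         # communication is the hard part here

threads = [threading.Thread(target=shared_worker, args=(1000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()

outbox = queue.Queue()
workers = [threading.Thread(target=distributed_worker, args=(1000, outbox))
           for _ in range(4)]
for t in workers: t.start()
for t in workers: t.join()
total = sum(outbox.get() for _ in range(4))
# counter == 4000 and total == 4000
```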

  5. Theorists have gone through a lot of trouble to make their models clean and general and to abstract away the details of the hardware. Do we really need new models for every flavor of computation? It's not clear to me that all of these technologies are that different from, say, PRAMs or streaming.

  6. Anonymous asks: Do we really need new models for every flavor of computation?

    A short answer is "Yah never know 'til yah try." A longer answer is "Never ask your codes to do anything unnatural."

    In the cloud environment, new methods of computing become technologically natural ... e.g., "one processor per basis vector" ... this suggests new abstractions and new complexity measures that (if we are lucky) will prove to be mathematically natural.

    To me, these new abstractions seem like good news ... because Nature herself is suggesting them to us.

  7. Theory is way too late to the party. Just look up how many different models of concurrency exist and how many are modeled accurately and completely by programming languages (none). Cloud computing is a misnomer; it should be called cloud i/o.

  8. @anon (response 2), this-statement-is-false:
    if only life were that simple... There are several questions involved:
    1) how strong are the cores involved
    2) how many cores you have
    3) how many levels of memory sharing do you have
    4) what are the locking mechanisms on these memory layers
    5) how fast is the access to each of these memories.

    Different answers to these question give you different things:
    - Fermi has several thousand very weak cores, with three different types of memory (aside from the main memory). Very good for data parallelism, but hard to program compared to traditional machines.
    - Larrabee was supposed to have 80 medium strength cores, with shared and non shared caches. It is considered a horrible failure (amazingly hard to make good use of the potential).
    - Nehalem gives you a few (currently up to six) strong cores, with no shared caches, but with very fast global memory access.
    - clouds: typically thousands of standard cheap cores, with relatively expensive communication. very good for task parallelism.

    IMHO, the problem is that while the TCS community was sleeping, the industry had to find solutions, so they did, but without the theory to support them. Sometimes it works very well, sometimes it's ok, and sometimes it doesn't.

  9. Perhaps I am naive, but is there a large difference between MapReduce and a BSR PRAM?

  10. Parallel computing, distributed computing, cloud computing, ...

    But at the end of the day, in theory at least, they are basically the same thing. I think Lance's post was aimed more at incorporating economics into the theory. But it is not just money, and it is not new. A good theory is one that can be adapted to particular models easily, not one where we invent new theories with every trend, using new names for old concepts (which is more about marketing and advertisement than computer science).

  11. 1) There's also Muthu's paper at SODA '09, drawing analogies between some streaming "complexities" (or similar notions) and the above. Similarly, some progress in the area can be found in a recent article by Mitzenmacher on the arXiv.

    2) It baffles me that such a request for study is moving through the TCS community. I'd find an exchange with the PODC/SPAA/etc. theorists more natural. Or am I missing something?

    3) I believe one possibility is to take existing cloud computing models and understand their power. Another, perhaps more typical of the TCS community, would be to introduce the world to novel uses of such means.
    I.e., shouldn't we start designing algorithms for the cloud?

  12. You should certainly be aware of Dan Bernstein's various writings about the often misunderstood nature of brute force search in cryptanalysis:


    His work on parallel NFS is also classic in discussing computation cost: