Wednesday, March 19, 2025

A Failure to Communicate

With care you can explain major ideas and results in computational complexity to the general public, like the P v NP problem, zero-knowledge proofs, the PCP theorem and Shor's factoring algorithm, in a way that a curious non-scientist can find interesting. Quanta Magazine keeps coming back to complexity because we have an inherently interesting field.

So why am I having such a difficult time with the new Ryan Williams result, that time can be simulated in nearly quadratically less memory, or more precisely DTIME(\(t(n)\)) \(\subseteq\) DSPACE(\(\sqrt{t(n)\log t(n)}\)), based on the Cook-Mertz space-efficient tree evaluation algorithm?

Many results in complexity are quite specialized and technical, but this shouldn't be one of them. Ryan's result gives a major new relationship between time and memory, the two most basic resource measures, whose study goes back to the 1960s. Everybody understands time--waiting while their computer or phone is spinning. They know about memory, at least that you have to pay more for an iPhone with more of it. Yet I still find myself challenged to explain this result.

Maybe it's because for historical reasons we use SPACE instead of MEMORY, or the pesky \(\log t(n)\) factor. Someone said they got lost in the logarithms. But we can talk about the gist of the result without mentioning space or logarithms.

It might be the \(t(n)\). We take for granted that we measure resources as a function of the problem size, but outsiders can find that a challenging concept. For P v NP I can just talk about efficient computation; here you can't avoid more specific running times.

Or maybe people just don't think about time and memory. Most operations on a computer happen nearly instantaneously. Nobody thinks about memory unless they run out of it.

Or because we don't have good real-world examples. When would you prefer a space \(\sqrt{t(n)}\), time \(2^{\sqrt{t(n)}}\) procedure to a space \(t(n)\), time \(t(n)\) algorithm?
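For a toy flavor of that trade-off (a classic textbook example, not Williams's construction), consider Savitch-style graph reachability: instead of storing a search frontier, it guesses a midpoint and recurses on both halves, recomputing everything and keeping only a logarithmic-depth stack.

```python
# Savitch-style reachability: a classic illustration (not from Williams's
# paper) of trading a huge slowdown for tiny memory. It never stores a
# visited set; it recomputes midpoints instead.

def reach(graph, u, v, k):
    """Is there a path from u to v using at most k edges?"""
    if k == 0:
        return u == v
    if k == 1:
        return u == v or v in graph.get(u, ())
    half = k // 2
    # Try every vertex as a midpoint. Recursion depth is only ~log k,
    # so memory stays small while running time blows up.
    return any(reach(graph, u, w, half) and reach(graph, w, v, k - half)
               for w in graph)

g = {0: [1], 1: [2], 2: [3], 3: []}
print(reach(g, 0, 3, 4))  # True
```

A breadth-first search answers the same question in linear time but must remember every vertex it has seen; the recursive version above uses memory proportional to the recursion depth, at the cost of quasi-polynomial time.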

AI is supposed to be good at this. OK, Claude, take it away.

Ryan Williams' breakthrough paper shows that computers can solve complex problems using far less memory than previously thought possible. For 50 years, computer scientists believed that problems requiring a certain amount of time (t) needed at least t divided by a small factor of memory. Williams proved that these same problems can actually be solved using only about the square root of t memory. He achieved this by connecting these problems to the Tree Evaluation problem and leveraging a recent algorithm by Cook and Mertz. This result makes significant progress toward resolving the longstanding P versus PSPACE problem, suggesting that many algorithms could potentially run efficiently on devices with much less memory than we currently use.

Imagine a navigation app trying to find the optimal route through a complex city network. Before this discovery, engineers believed that calculating detailed routes required either substantial memory or accepting slower performance on memory-limited devices. Williams' theorem suggests these calculations could run using dramatically less memory—potentially reducing requirements from 100 MB to just 10 KB (roughly the square root). This breakthrough could enable sophisticated navigation features on devices with severe memory constraints, such as smart watches, older phones, or embedded car systems, allowing them to handle complex routing problems with multiple stops or constraints without performance degradation.

20 comments:

  1. It's funny to see how the AI starts bullshitting in the second paragraph. And for those who don't understand the result, it might even be vaguely plausible.

    ReplyDelete
  2. I think Claude was trying to do what ordinary people want: give an example using specific numbers. It just failed miserably. Ordinary folks don't have a vivid intuitive sense of "memory that's the square root of time". They probably wonder: if you take the square root of ten seconds, how many megabytes is that? You can say that's not the right question, but then you have to tell them what is the right question.

    ReplyDelete
  3. TBH, that second paragraph reminds me a lot of the closing paragraphs in a lot of pop science journalism, which usually exaggerate the importance of whatever result the article is about.

    ReplyDelete
    Replies
    1. I strongly suspect this similarity is not coincidental. LLMs try to imitate language they saw in their training data. Claude saw a lot of pop sci explanations of computer science and math that sound like this so when you ask it for a pop sci description of a result in computer science, this is what it gives you.

      Delete
  4. The first paragraph is also bullshit! "computer scientists believed that problems requiring a certain amount of time (t) needed at least t divided by a small factor of memory": That is not true!

    For the difficulty of the result, I guess one point is that the standard TIME(t) ⊆ SPACE(t) is actually TIME(t) ⊆ TIME-SPACE(t), while the new result does not preserve time. But maybe an explanation can be tried this way: "An algorithm that uses time t uses a memory of size at most proportional to t. The question becomes: OK, but if we simply try to replace our algorithm by an equivalent one that uses the smallest possible amount of memory, whatever time it takes, how small can this memory be? As far as we know, we are not able to rule out that an exponentially smaller memory could be sufficient (that is, of size proportional to log t). But the only known result was much more modest: a memory of size t/log t is sufficient. Ryan's new result is that √(t log t) is sufficient: the previous result was almost t, the new one is almost √t."

    ReplyDelete
  5. Here's my attempt.

    The theorem says that if an algorithm runs for t steps, then it can be simulated by another algorithm that uses roughly sqrt(t) bits of memory. This is surprising, because the original algorithm might use up to t bits of memory. The "catch" is that the new algorithm is very slow, i.e., it takes way more than t steps. Consequently, the theorem is not necessarily very useful for practical computing. Nevertheless, the theorem has a ton of scientific value, because it tells us something new and surprising about how time and memory constraints affect computation.

    ReplyDelete
    Replies
    1. This strikes me as a pretty good lay explanation, though it's sometimes hard for me to predict what will or won't make sense to people outside of math and CS.

      Delete
  6. I had a similar problem trying to explain this to some family members. I started by saying that this result implies that all fast algorithms can be run using very little memory. But when I added the caveat that the low-memory algorithm is so slow that no one would want to use it, they got confused again. Then I said it makes a little progress on a very fundamental problem that would have serious implications, even if this isn't enough progress to realize most of those implications. That seemed to satisfy them, but a similar thing could be said about nearly every result in complexity theory.

    ReplyDelete
  7. Graphs are not usually as aesthetically pleasing as other graphics but in the form of a gif, they could be a nice addition to an article on complexity theory. Quanta Magazine has a YouTube video, 11 minutes and 12 seconds into which they zoom further and further out of a graph of f(n) = 2^n. https://youtu.be/pQsdygaYcE4?t=672

    ReplyDelete
  8. You are challenged because practically it has no significant consequences, and conceptually there is no reason why this might not be true.

    The importance of this result is just the fact that we have made some epsilon progress in an area in which it has been extremely hard to make any progress for decades.

    ReplyDelete
    Replies
    1. I agree that the result does not have any obvious practical use (though plenty of results in computer science that have no immediate practical use sometimes lead to practical applications later; think of the early work on interactive proofs and zero-knowledge proofs, for example). However, I think this result is much more significant than "making epsilon progress." This is not like a new algorithm for matrix multiplication which lowers the exponent by .00001 or something. This is really a new type of result that is not very similar to results that have been proved before and that most people probably wouldn't have predicted would even be true. It is really a conceptual advance in a way that "epsilon improvements" are usually not.

      Delete
    2. I don't think it is a conceptual advance.

      You see it as a conceptual advance because you know it is technically hard to make progress in this area.

      It is a major technical advance, but it is not really a conceptual advance. If you say it is a conceptual advance, tell me how it has altered our understanding of basic concepts regarding memory and time?

      Maybe you can say it introduced some new concepts or ideas, but are those concepts field-internal or things that you can explain to a lay person why they are important in under 5 min?

      For someone who doesn't know complexity theory, their reaction to this result would be so what? And they would not be wrong.

      Maybe this will eventually lead to more conceptually interesting things, but at this point the excitement is really about technical difficulty. There is no reason why one would be surprised to learn this result is true.



      Delete
    3. It is indeed a conceptual advance and not just technical progress in a hard area.

      The conceptual advance is the realization that if you can solve a problem in a certain amount of time then you can solve it (perhaps much more slowly) in an amount of space that is *much smaller* than the amount of time that the original solution used.

      This is actually fairly surprising when you stop to think about it. For many algorithms, a reasonable portion of the steps consist of saving some new information in memory, and this information usually can't be safely deleted until the algorithm is over. For example, when using BFS or DFS to search for a path in a graph, each time we encounter a new vertex we need to remember that we've encountered it so that we don't visit it again, and there's no obvious way to "forget" some vertices until we've explored the entire graph or found a path. If an algorithm like this runs in time t(n) then it will also use ~t(n) space. And intuitively it feels plausible that there are some algorithms like this where the space usage cannot be decreased even if you are willing to accept a much slower running time.
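      A minimal sketch of the BFS behavior described above (my illustration, assuming a graph given as an adjacency dict): every newly discovered vertex goes into a visited set that only grows, so a run taking ~t steps also ends up holding ~t vertices in memory.

```python
# BFS keeps a monotonically growing visited set: nothing encountered
# can safely be forgotten before the search ends.

from collections import deque

def bfs_path_exists(graph, start, goal):
    visited = {start}           # only ever grows
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            return True
        for v in graph.get(u, ()):
            if v not in visited:
                visited.add(v)  # must remember v for the rest of the run
                queue.append(v)
    return False

g = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(bfs_path_exists(g, 0, 3))  # True
```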

      Moreover, this seems to have not just been an intuitively plausible belief, but also a common one. I've never heard any computer scientist even speculate that Williams's result might be possible (and as Williams mentions in his paper, some papers have even proved results conditional on assumptions that are contradicted by Williams's new result).

      It seems to me that this realization has "altered our understanding of basic concepts regarding memory and time." It tells us that you can do much more with limited space than we had realized. And not just for some specific problem, but, in some sense, for *any* problem. As I said in a reply to someone else's comment, it seems to me that this is as big of a change in our understanding of space-bounded computation as Savitch's theorem or the Immerman-Szelepcsenyi theorem (and actually bigger than either imho). It is the kind of result whose statement you could explain to someone as soon as they knew the basic definitions of space- and time-bounded computation.

      One more point that I'd like to address. You say "Maybe you can say it introduced some new concepts or ideas, but are those concepts field-internal or things that you can explain to a lay person why they are important in under 5 min?" Depending on what you mean by "lay person," this seems like an impossibly high standard to me. Could you explain error-correcting codes or Shannon entropy to someone who is not in math/CS/Statistics/Physics in 5 minutes? Could you explain zero-knowledge proofs? I've tried both of these and it's not as easy as you might think. I do believe, however, that you could explain the statement of Williams's result and why it is interesting to a smart undergrad who has only seen the most basic definitions in complexity theory (e.g. DTIME and DSPACE). And that, to me, seems like a far better standard for "conceptual advance" than what you have in mind.

      Delete
  9. Put differently, if you didn't know this is progress in a very hard research area, would you be excited about this result?

    I for one can say I definitely would not be.

    ReplyDelete
    Replies
    1. I disagree with this. I think this result is almost as exciting as results like Savitch's theorem or the Immerman-Szelepcsenyi theorem: fundamental results about the abilities of space-bounded computation which are relatively easy to state but also surprising and whose proofs are clever but not hard to understand. I think that someone who understands all the definitions involved (e.g. of Time(t) and Space(t)) and has spent even a few hours thinking about them could easily understand Ryan Williams's result and see why it is somewhat surprising and interesting. Of course, you do need to know the basic definitions and think that they are interesting things to study, but the statement of this result is really not very technical at all.

      Delete
  10. It is blatantly obvious that you can reuse space; reusing time is much harder. Until last month I had only heard of an upper limit time = e^space or something. Now someone found the lower limit near sqrt; I would have guessed much higher.

    ReplyDelete
    Replies
    1. time(f) <= e^space(f), but also trivially space(f) <= time(f). It is the latter one that the new result is relevant to, not the former.

      Delete
    2. It is blatantly obvious that you can reuse space in the sense that you can delete what was saved in that space and do some new computation. But that's not what the proof of this result is about. Instead, it's about reusing space *without deleting what was saved there,* which is much more surprising. For one explanation of this concept that is independent of Williams's result, see the blog post by James Cook here: https://www.falsifian.org/blog/2021/06/04/catalytic/.

      Delete
  11. Is it surprising, intuitively, that we can trade off time for space?

    In programming, it is pretty common that you can save memory by not storing things and instead computing them when needed. Many things do the reverse, e.g. hashing, dynamic programming/memoization, caching, ...
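    As a minimal sketch of that everyday trade-off (my example, not from the post): naive Fibonacci uses almost no memory but exponential time, while the memoized version stores O(n) values to run in linear time.

```python
# Store-versus-recompute: the memoized version buys speed with memory.

from functools import lru_cache

def fib_naive(n):
    # Tiny memory (just the recursion stack), exponential time.
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # O(n) cached values, linear time.
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

print(fib_naive(20), fib_memo(20))  # 6765 6765
```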

    Practically, storage is extremely cheap compared to compute costs. Just check out the prices of storage and compute on your favorite cloud provider.

    So the more interesting result here would be one that shows you can systematically solve problems significantly faster if you use a bit more memory.

    Something like DTimeSpace(f,g) ⊆ DTimeSpace(f^0.5, g^2) or DTimeSpace(f/2, 2g) (fix the model so constant factors are meaningful). Now if someone proved such a result, even with some additional restrictions on the class of problems, that would be very interesting.

    ReplyDelete
    Replies
    1. Storage in general might be cheap, but fast, physically small, low-power storage is not.

      Just check the price difference between different RAM configurations on an iPhone.

      Delete