Computational Complexity: Defending Theory

Wednesday, June 11, 2025

Defending Theory

In the June CACM, Micah Beck writes an opinion piece, "Accept the Consequences", in which he is quite skeptical of the role of theory in real-world software development, concluding

It is important that we teach practical computer engineering as a field separate from formal computer science. The latter can help in the understanding and analysis of the former, but may never model it well enough to be predictive in the way the physical sciences are.

I certainly agree that theoretical results can't perfectly predict how algorithms work in practice, but neither can physics perfectly predict the physical world. The world is far too complex, both computationally and physically, to model perfectly. Physics gives us an approximation to reality that can help guide engineering decisions, and theory can do the same for computation.

You need a basis to reason about computation; otherwise you are just flying blind. Theory gives you that basis.

Let's consider sorting. Beck complains that Quicksort runs in \(O(n^2)\) time in the worst case even though it is commonly used in practice, while the little-used Insertion sort runs in worst-case \(O(n\log n)\). Insertion sort actually takes \(O(n^2)\) time in the worst case, so let's assume Beck meant an algorithm like heapsort that really does have \(O(n\log n)\) worst-case performance. But theorists do more than fixate on worst-case performance. Quicksort runs in \(O(n\log n)\) time on average, in expected \(O(n\log n)\) time on every input if you use a random pivot, and in worst-case \(O(n\log n)\) time if you use a more complex deterministic pivoting algorithm. Introsort combines Quicksort's efficiency with worst-case guarantees and is used in some standard libraries.
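To make the idea concrete, here is a minimal Python sketch of an introsort-style hybrid: Quicksort with a random pivot that switches to heapsort once the recursion gets suspiciously deep. The names `hybrid_sort` and `depth_limit` are made up for illustration, and this version copies lists rather than sorting in place, so it shows the control flow only and is not how any particular standard library implements it.

```python
import heapq
import random


def heapsort(items):
    """Sort via a binary heap: O(n log n) comparisons even in the worst case."""
    heap = list(items)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def hybrid_sort(items, depth_limit=None, rng=random):
    """Return a sorted copy of items (illustrative, not in place).

    Uses quicksort with a random pivot, and falls back to heapsort
    when the partitioning depth budget is exhausted.
    """
    items = list(items)
    if len(items) <= 1:
        return items
    if depth_limit is None:
        # Assumed budget: about 2 * log2(n) levels of partitioning.
        depth_limit = 2 * max(1, len(items).bit_length())
    if depth_limit == 0:
        # Too many unbalanced splits: switch to the worst-case-safe sort.
        return heapsort(items)
    pivot = rng.choice(items)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return (hybrid_sort(smaller, depth_limit - 1, rng)
            + equal
            + hybrid_sort(larger, depth_limit - 1, rng))
```

Calling `hybrid_sort(data)` takes the quicksort path on typical inputs, while `hybrid_sort(data, depth_limit=0)` forces the heapsort fallback. The depth budget is what turns an expected-time bound into a worst-case one.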

Beck worries about secondary storage and communication limitations, but theorists have studied sorting in those regimes as well.

His other example is a theoretical result that one cannot use an unreliable network to implement one that is completely reliable, even though textbooks describe TCP as reliable. But in fact TCP was designed to allow for failure because it took account of the theoretical result, not in spite of it.
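The general shape of "reliable, but allowed to fail" can be sketched in a few lines of Python. This is a deliberately simplified model, not TCP itself: `send_with_retries`, `channel_delivers`, and `lossy_channel` are hypothetical names, and a real protocol adds sequence numbers, adaptive timeouts, and much more. The point is only that the sender retries a bounded number of times and then reports failure instead of claiming guaranteed delivery.

```python
import random


def send_with_retries(payload, channel_delivers, max_attempts=5):
    """Try to deliver payload; return (delivered, attempts_used).

    channel_delivers is a hypothetical callable that returns True when
    a full round trip (data plus acknowledgment) gets through.
    """
    for attempt in range(1, max_attempts + 1):
        if channel_delivers(payload):
            return True, attempt
    # Every attempt was lost: surface the failure to the caller.
    return False, max_attempts


def lossy_channel(loss_probability, seed=0):
    """Build a simulated channel that drops each round trip independently."""
    rng = random.Random(seed)
    return lambda _payload: rng.random() >= loss_probability
```

With `lossy_channel(0.0)` the first attempt succeeds, and with `lossy_channel(1.0)` the sender gives up after `max_attempts` tries. For any loss probability \(p > 0\), all \(k\) attempts fail with probability \(p^k\), which is small but never zero.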

Beck ends the article talking about Generative AI where theory hasn't kept up with practice at all. Beck calls for using classical AI tools based on formal logic as guardrails for generative AI. However, the lack of theoretical understanding suggests that such guardrails may significantly weaken generative AI's expressive power. Without adequate theory, we must instead rely more heavily on extensive testing, particularly for critical systems.

There are stronger examples Beck could have used, such as algorithms that solve many NP-complete problems efficiently in practice despite their theoretical intractability. Even here, understanding the theoretical limitations helps us focus on developing better heuristics and recognizing when problems might be computationally difficult.

I agree with Beck that relying solely on theoretical models can cause problems, but rather than have students "unlearn" the theory, we should encourage them to use the theory appropriately to help guide the design of new systems.

Beck's call to separate software development from theory only underscores how important it is to keep them intertwined. Students should know the theoretical foundations, for they shape problem solving, but they should also understand the limitations of these models.

6 comments:

Anonymous 1:34 PM, June 11, 2025
Wow, what an embarrassing and poorly researched / fact-checked article. Mis-stating basic facts (e.g., running time of insertion sort and the sorting lower bound). Citations to geeksforgeeks.org. And, sorting is a problem where the theory does an amazing job explaining the practice, even to the level of why dual-pivot quicksort implementations are faster than classic quicksort implementations (despite performing more compares and exchanges). If the author had "learned" the theory, he might be a bit less quick to tell students to "unlearn" it.
Anonymous 4:06 PM, June 11, 2025
He who loves practice without theory is like the sailor who boards a ship without a rudder and compass and never knows where he may cast.

Not my opinion only, Leonardo da Vinci wrote it.
Anonymous 9:11 PM, June 11, 2025
While at it, let's also get rid of Math. It has no relation to the real world after all.

The real problem here is that many of the more applied computer scientists are closer to experimental physicists, but the systems they deal with are often much simpler than those in experimental physics. Experimental physics values theoretical physics, because without it you cannot really do modern physics.

One other aspect is that a lot of computer science consists of human constructs, so what Beck does is more engineering than science. E.g., a civil engineer who is designing a simple building doesn't need to know cutting-edge theoretical physics. Put a civil engineer next to theoretical physics, and you might hear the civil engineer say he cannot see its value for his own work.

On the other hand, I think we can point to a lot of practical things that came out of theory. E.g., a lot of cryptography especially relies on theoretical computer science.

On the other side, we need to accept that a lot of the people we train don't end up as scientists but as engineers, and maybe the curriculum gives practical topics less weight than it should and theory more than it should.
Josh Burdick 11:21 AM, June 14, 2025
I think that most of the algorithms classes I've taken have said something along the lines of: "Yes, big-O notation is hiding constants, and the constants matter. But the asymptotic difficulty can still be useful to know about."

And I agree with people who've said that, just because a theory is approximate, doesn't mean the theory isn't helpful. (I'd guess that people could design bicycle transmissions, knowing about gear ratios, without including relativistic effects.)

It's arguably nice to know an upper bound on an algorithm, even if that bound isn't exact. Beck has a good point, though, that reminders that theory is imperfect are important, partly because of constants and partly because of questions about exactly how hardware operates. (There are plenty of upper bounds that model things like disk and cache, though.)

Also, there are some methods, such as the simplex method, and SAT solvers, which have terrible upper bounds, but which are nonetheless useful in practice. (I think loopy belief propagation is another such method, in which convergence isn't guaranteed. I don't know how much people use it in practice, though.)
Anonymous 3:52 AM, June 15, 2025
I have been in the industry for over a decade and I have yet to see a place where I would use any of the things I learned in my algorithms course.

Everything I need is packaged in standard libraries. A small number of people need to implement these algorithms but once implemented, it is just a library.

Same for many other aspects of CS. E.g., I have never really needed to go into any OS code, though it is useful to have a high-level understanding of how an OS works, and the same goes for networking. I never implement TCP/IP; I just use the standard libraries for it.

What I need to know is how to troubleshoot a system and there were zero courses on that topic. What I need to know is how to work with other engineers in a shared repository and there were zero courses on that topic. What I need to know is how to write maintenance code and there were zero courses on that topic. What I need to know is how to write a design fix for a system with trade-offs and there were zero courses on that. What I need to know is how to refactor legacy code or get up to speed in a new team and there were zero courses on that. What I need to know is how to write tests for my code and system and there were zero courses on that.

It is not just theory. The courses in the CS curriculum do not cover what most CS graduates actually need and cover a lot that they won't need.

CS at many schools is designed more as a precursor to CS graduate school than as what most students actually need.
Anonymous 6:24 AM, June 18, 2025
I would argue that neural networks were first developed as a theory (in the physics sense) way back, and only much later computer scientists got them to work in practice.