Wednesday, June 11, 2025

Defending Theory

In the June CACM, Micah Beck writes an opinion piece Accept the Consequences where he is quite skeptical of the role of theory in real-world software development, concluding

It is important that we teach practical computer engineering as a field separate from formal computer science. The latter can help in the understanding and analysis of the former, but may never model it well enough to be predictive in the way the physical sciences are.

I certainly agree that theoretical results can't perfectly predict how algorithms work in practice, but neither can physics. The world is too complex, both computationally and physically, to model perfectly. Physics gives us an approximation to reality that can help guide engineering decisions and theory can do the same for computation.

You need a basis to reason about computation, lest you fly blind. Theory gives you that basis.

Let's consider sorting. Beck complains that Quicksort runs in \(O(n^2)\) time in the worst case even though it is used commonly in practice, while the little-used Insertion sort runs in worst-case \(O(n\log n)\). Let's assume Beck meant an algorithm like heapsort that actually has \(O(n\log n)\) worst-case performance. But theorists do more than fixate on worst-case performance. Quicksort runs in \(O(n\log n)\) time on average, in expected \(O(n\log n)\) time on every input if you use a random pivot, and in worst-case \(O(n\log n)\) time with a more complex deterministic pivoting algorithm. Introsort combines Quicksort's efficiency with worst-case guarantees and is used in some standard libraries.
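To make the Introsort idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular standard library does it: production versions sort in place and usually pick pivots differently, and the depth cutoff of roughly \(2\log_2 n\) is a conventional choice rather than something taken from Beck's article.

```python
import heapq
import random


def heapsort(items):
    """Worst-case O(n log n): build a heap, then pop the minimum n times."""
    heap = list(items)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def introsort(items, depth_limit=None):
    """Quicksort with a random pivot that falls back to heapsort
    once the recursion gets deeper than depth_limit."""
    items = list(items)
    if depth_limit is None:
        # Roughly 2*log2(n) levels; an assumed, conventional cutoff.
        depth_limit = 2 * max(1, len(items).bit_length())
    if len(items) <= 1:
        return items
    if depth_limit == 0:
        # Too many unlucky pivots: switch to the guaranteed algorithm.
        return heapsort(items)
    pivot = random.choice(items)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return (introsort(smaller, depth_limit - 1)
            + equal
            + introsort(larger, depth_limit - 1))
```

The theory shows up directly in the design: the random pivot gives the expected-time bound, and the depth limit is what turns an expected bound into a worst-case one.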

Beck worries about secondary storage and communication limitations but theorists have studied sorting in those regimes as well. 

His other example is a theoretical result that one cannot use an unreliable network to implement one that is completely reliable, even though textbooks consider TCP to be reliable. But in fact TCP was designed to allow failure because it took account of the theoretical result, not in spite of it.

Beck ends the article talking about Generative AI where theory hasn't kept up with practice at all. Beck calls for using classical AI tools based on formal logic as guardrails for generative AI. However, the lack of theoretical understanding suggests that such guardrails may significantly weaken generative AI's expressive power. Without adequate theory, we must instead rely more heavily on extensive testing, particularly for critical systems.

There are stronger examples Beck could have used, such as algorithms that solve many NP-complete problems efficiently in practice despite their theoretical intractability. Even here, understanding the theoretical limitations helps us focus on developing better heuristics and recognizing when problems might be computationally difficult.

I agree with Beck that relying solely on theoretical models can cause problems, but rather than have the students "unlearn" the theory, we should encourage them to use the theory appropriately to help guide the design of new systems.

Beck's call to separate software development from theory only underscores how important it is to keep them intertwined. Students should know the theoretical foundations, for they shape problem solving, but they should also understand the limitations of these models.

Monday, June 09, 2025

The New Godel Prize Winner Tastes Great and is Less Filling



The 2025 Gödel Prize has been awarded to Eshan Chattopadhyay and David Zuckerman for their paper

Explicit two-source extractors and resilient functions

which was in STOC 2016 and in the Annals of Math in 2019. 

We (Bill and Lance) care about this result for two different reasons.

BILL: The result has applications to constructive Ramsey---

LANCE: Ramsey Theory? Really? This is a great result about pseudorandomness! In fact the only interesting thing to come out of Ramsey Theory is the Probabilistic Method (see our discussion of this here).

BILL: Can't it be BOTH a great result in derandomization AND have an application to Ramsey Theory? Like Miller Lite: Less Filling AND Tastes Great (see here).

LANCE: But you don't drink!

BILL: Which means I can give a sober description of their application to Ramsey Theory.

All statements are asymptotic.

Let \(R(k)\) be the least \(n\) so that for all 2-colorings of the edges of \(K_n\) there is a homogeneous set of size \(k\).

Known and easy: \(R(k)\le 2^{2k}/\sqrt{k} \)

Known and hard: \(R(k) \le 3.993^k \). How do I know this is true? Either I believe the survey papers on these kinds of results (see here) or a former student of mine emailed me a picture of a T-shirt that has the result (see here) from (of course) Hungary.

Known and Easy and Non-Constructive: \(R(k)\ge k2^{k/2}\)

Can we get a constructive proof? There have been some over the years; however, the paper by Eshan Chattopadhyay and David Zuckerman improves the constructive lower bound to exponential in \(2^{(\log k)^\epsilon}\).
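For readers who want to see the definition of \(R(k)\) in action, here is a brute-force sketch in Python. It is only an illustration of the definition for tiny cases (it checks every 2-coloring, so it is hopeless beyond \(k=3\)); the function names are made up for this example and it assumes \(k \ge 2\).

```python
from itertools import combinations, product


def has_homogeneous_set(n, k, coloring):
    """coloring maps each edge (i, j), i < j, of K_n to 0 or 1.
    Return True if some k vertices have all their edges the same color."""
    for subset in combinations(range(n), k):
        colors = {coloring[edge] for edge in combinations(subset, 2)}
        if len(colors) == 1:
            return True
    return False


def every_coloring_has_homogeneous_set(n, k):
    """Check all 2^(n choose 2) colorings of the edges of K_n."""
    edges = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(edges)):
        coloring = dict(zip(edges, bits))
        if not has_homogeneous_set(n, k, coloring):
            return False
    return True


def ramsey(k, max_n=6):
    """Least n <= max_n that forces a homogeneous set of size k, or None."""
    for n in range(1, max_n + 1):
        if every_coloring_has_homogeneous_set(n, k):
            return n
    return None


print(ramsey(3))  # 6
```

Already \(K_6\) needs \(2^{15}\) colorings, which is one way to appreciate why both the upper and lower bounds above are proved rather than computed.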

SO Lance, why do you care?

LANCE: First of all, when I chose this paper as one of my favorite theorems (a far bigger honor than the so-called Gödel Prize) I gave the post the clever title Extracting Ramsey Graphs, which captures both the pseudorandomness and the Ramsey graphs. But of course the Ramsey result is just a minor corollary; the ability to get a near-perfect random bit out of two independent sources of low min-entropy is the true beauty of this paper.

BILL: You have no sense of good taste.

LANCE: Well at least I'm not less filling.

Wednesday, June 04, 2025

Rules vs Standards


You can write laws that are very specific, like the US tax code, or open to interpretation, like the First Amendment. In the literature these are known as rules and standards respectively.

In computational complexity, we generally think of complexity as bad. We want to solve problems quickly and simply. Sometimes complexity is good, if you want to hide information, generate randomness or need some friction. But mostly we want simplicity. How does simplicity guide us in setting guidance, either through rules or standards?
 
Rules are like a computer program. Feed in the input and get an output. Predictable and easy to compute. So why not always have tight rules?

Nobody ever gets a computer program right the first time, and the same goes for rules. Rules can be overly restrictive or have loopholes, leading to feelings of unfairness. Rules can require hoops to jump through to get things done. Rules don't engender trust in the people they apply to, as with very tight requirements on how grant funds can be spent. We know that in general we can't predict anything about how a computer program behaves, so why do we trust the rules?

A good example of a standard is that a PhD dissertation requires significant original research. Rules are things like the exact formatting requirements of a thesis, or statements like a CS thesis must contain three papers published in a specific set of conferences.

As an administrator I like to focus on making decisions based on what's best for my unit, as opposed to ensuring I followed every letter of every rule. Because if you live by the rules, you'll die by the rules.  People will try to use their interpretation of the rules to force your hand.

Sometimes we do need strict rules, like safety standards, especially for people unfamiliar with the equipment. Structured rules do give complete clarity about when an action is allowed. But they also give an excuse. Have you ever been satisfied by someone who did something you didn't like but said "I was just following the rules"?

Even strict rules tend to have an out, like a petition to take a set of courses that don't exactly match the requirements of a major. The petition is a standard, open to interpretation to capture what the rules don't. 

As a complexity theorist I know what programs can't achieve, and as an administrator I see the same with rules. I prefer standards, principles over policies. Set your expectations, live by example, and trust the people, faculty, staff and students, to do the right thing and push back when they don't. People don't want strict rules, but they mostly act properly when they believe they are trusted and have wide latitude in their work. 

Monday, June 02, 2025

Complexity theory of hand-calculations

(Thanks to David Marcus who sent me the video I point to in point 4 of this post. Tip for young bloggers (if there are any): you can have a half-baked idea for a post and then someone sends you something OR you later have an idea that makes it a full-baked idea for a post. That's what happened here. So keep track of your half-baked ideas.)

1) When I was 10 years old I wanted to find out how many seconds were in a century. I didn't have a calculator (I might not have known what they were). I spent a few hours doing it and I got AN ANSWER. Was it correct? I didn't account for leap years. Oh well.

(An astute reader pointed to a website that does the centuries-to-seconds conversion as well as many other conversions. It is here. If such a site had been around when I was a kid, what effect would it have had on my interest in math? Impossible to know.)
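For the record, the arithmetic is short once a machine does the multiplying. The sketch below is an illustration in Python, with the leap-day count left as a parameter: a Gregorian century contains 24 or 25 leap days depending on which hundred years you pick, and leap seconds are ignored.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def seconds_in_century(leap_days):
    """Seconds in 100 years of 365 days plus the given number of leap days."""
    days = 100 * 365 + leap_days
    return days * SECONDS_PER_DAY


print(seconds_in_century(0))   # 3153600000 (no leap years, as in the hand calculation)
print(seconds_in_century(24))  # 3155673600
print(seconds_in_century(25))  # 3155760000
```

So ignoring leap years costs about two million seconds, an error of less than a tenth of a percent.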

2) Fast forward to 2024: I wanted to find the longest sequence of composites all \( \le 1000\). One long sequence I found by hand is the following (I also include the least prime factor):

114-2, 115-5, 116-2, 117-3, 118-2, 119-7, 120-2, 121-11, 122-2, 123-3, 124-2 , 125-5, 126-2

length 13.

I wanted to find the answer WITHOUT using a computer program or looking at a list of primes online (though I do allow a calculator just for division and multiplication).

Of more interest mathematically is trying to prove that there is no sequence of length 14. (If there is, then perhaps the attempt will lead us to it.) 

Assume there was a sequence of consecutive composites \(\le 1000\) of length 14.

Map each one to the least prime that divides it. 

At most 7 of them map to 2

At most 3 of them map to 3

At most 2 of them map to 5

At most 1 of them maps to 7.

At most 1 maps to 11. (Can look at 11*p for all primes \(11\le p \le 89\) and see whether any of them are in a sequence of length 14.)

I'll stop here. This is getting tedious and might be wrong. So I asked Claude. It gave me a sequence of composites of length 19. Here it is (I include the least prime factor):

888-2, 889-7, 890-2, 891-3, 892-2, 893-19, 894-2, 895-5, 896-2, 897-3, 898-2, 899-29, 900-2, 901-17, 902-2, 903-3, 904-2, 905-5, 906-2.

Can one show by hand that there is no sequence of length 20? 
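For anyone willing to give up on the by-hand rule, both sequences above can be checked with a short program. The following Python sketch (the function names are made up for this illustration) sieves the least prime factor of every number up to \(n\) and then scans for the first longest run of composites.

```python
def least_prime_factors(n):
    """Return a list lpf where lpf[m] is the least prime dividing m, for 2 <= m <= n."""
    lpf = list(range(n + 1))
    p = 2
    while p * p <= n:
        if lpf[p] == p:  # p is prime
            for multiple in range(p * p, n + 1, p):
                if lpf[multiple] == multiple:
                    lpf[multiple] = p
        p += 1
    return lpf


def longest_composite_run(n):
    """Return (start, end, length) of the first longest run of consecutive composites <= n."""
    lpf = least_prime_factors(n)
    best = (0, 0, 0)
    run_start = None
    for m in range(2, n + 1):
        if lpf[m] != m:  # m is composite
            if run_start is None:
                run_start = m
            length = m - run_start + 1
            if length > best[2]:
                best = (run_start, m, length)
        else:
            run_start = None
    return best


print(longest_composite_run(200))   # (114, 126, 13)
print(longest_composite_run(1000))  # (888, 906, 19)
```

So the length-13 sequence found by hand is the longest below 200, and the length-19 sequence is the longest below 1000; the program of course says nothing about whether a by-hand proof exists.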

3) The more general question: what is the complexity of finding the longest string of composites all \( \le n\) . This is actually many questions:

a) By hand: by which I mean only allowing multiplication and division and only of numbers \(\le n.\)

b) Theoretically. Use whatever fancy algorithms you want.

c) Theoretically, but can assume some number theory conjectures that are widely believed. The Wikipedia page on prime gaps is here. (ADDED LATER: an astute commenter pointed out that we want LARGE gaps between primes, but the Wikipedia article is about SHORT gaps between primes.)

d) Do a,b,c for the set version, which is as follows: Given \(n\) and \(L\), determine if there is a sequence of consecutive composites of length \(L\) that are all \(\le n\).
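One crude way to get a feel for the by-hand version of the set question is to count divisions. The Python sketch below decides the set version using only trial division and reports how many divisions it performed. The division count as a stand-in for hand effort is an assumption made for this illustration, not a proposed complexity measure, and a careful human would surely do fewer.

```python
def is_composite(m, counter):
    """Trial division by 2, 3, 5, 7, ... up to sqrt(m).
    counter[0] is incremented once per division performed."""
    d = 2
    while d * d <= m:
        counter[0] += 1
        if m % d == 0:
            return True
        d += 1 if d == 2 else 2
    return False


def has_composite_run(n, L):
    """Decide whether some L consecutive composites are all <= n.
    Returns (answer, number_of_divisions_used)."""
    counter = [0]
    run = 0
    for m in range(2, n + 1):
        run = run + 1 if is_composite(m, counter) else 0
        if run >= L:
            return True, counter[0]
    return False, counter[0]


found, divisions = has_composite_run(1000, 19)
print(found, divisions)
```

A "yes" answer can stop early, while a "no" answer (for example \(n=1000\), \(L=20\)) has to test every number up to \(n\), which matches the feeling that ruling out a length-20 sequence by hand is the harder direction.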

4) Does anyone else care about calculation-by-hand? Yes! There are people who want to compute \(\pi\) to many places JUST BY HAND. A video about them is here. Spoiler alert: they did very well.


Wednesday, May 28, 2025

The Hilltop Story

 

On Route 1 in Saugus, Massachusetts, about a twenty minute drive from Cambridge, stood the Hilltop Steak House. When I went to graduate school in the late 80's, Hilltop led all restaurants in the United States by sales (about $30 million in annual revenue) and volume (about 2.5 million customers).

I went there several times during graduate school. Despite the size, about 1500 seats, there was always a long wait, but it was worth it to get a good steak at a price a grad student could afford.

But in 1994, Hilltop was bought by an investment company from the original Giuffrida family. The cost of labor went up and to keep costs reasonable, the new owners cut the quality of the meat. No longer a place to get good steak at a good price, and facing changing tastes, Hilltop lost customers and finally closed in 2013.

Computer Science as an academic discipline has had its Hilltop moment with tremendous growth pretty consistently from 2010 through about 2023. But with the growth of AI and an uncertain job market, if we don't maintain quality and adjust to the new needs and interests of our students, CS may become just a road sign on Route 1.

Sunday, May 25, 2025

Some are Mathematicians, some are Carpenters' Wives, Some are Popes.

(Trivia: What song has the lyric Some are Mathematicians, some are Carpenters' wives? It's not a parody song, though sometimes it's hard to tell a parody song from a so-called real song.)

In my post about Pope Leo XIV I made the following comments in different parts of the post:

Pope Leo XIV has a degree in Mathematics.

Prevost [his pre-Pope name] has a degree in mathematics from Villanova. 

He is not the first Pope to know some mathematics.

I also wrote:

Since Pope Leo XIV was a mathematician, as Pope he won't only know about sin but also about cos.

Someone emailed me about this line, not to say it was a bad joke or even a good joke, but to say 

Since Pope Leo XIV was a mathematician: What qualifies one to be considered a mathematician?

A few thoughts on this question.

1) I blogged about this topic here. Hence today I will discuss issues I did not discuss then. 

2) Robert W  Prevost wrote a book that (just from the title) seems to use some math:

Probability and Theistic Explanations, see here.

that was published in print in 1990 and online in 2023.  I wonder if it will sell more copies now.  I am tempted to ask for a free copy to read and do a review of, but I'm not sure I really want to read it. 

One would think that if someone named Robert Prevost wrote a book that seems to use math and theology then it would be the Robert Prevost who is now called Pope Leo XIV. I thought that. A commenter on my blog thought that. But Lance read an earlier version of this post and pointed out that 

Robert W Prevost NE Robert  F Prevost AND

Robert F  Prevost = Pope Leo XIV.

Hence, alas, the author of the book is NOT Pope Leo XIV. It's striking how plausible it would be that Pope Leo IS the author.  The book STILL might sell more copies since people may think it's by the Pope. 

Robert W Prevost's Wikipedia page is here.

Robert F Prevost's Wikipedia page is here.

3) If someone keeps LEARNING math but doesn't DO math I WOULD consider them a mathematician.

4) If someone is a math crank then the question of whether they are a mathematician will depend on how cranky they are.

5) If one KNOWS a lot of math but is neither learning nor doing any more (perhaps myself when I retire) can you consider them a mathematician?

6) If someone gets a PhD from MIT in Pure math but then goes to industry and programs would you consider them to be a mathematician?

7) If X is NOT a math crank and X considers themselves a mathematician, are they a mathematician?

8) I WOULD consider applied math to be math. This should not need to be said but there may be some pure-math-snobs reading this post. Computer scientists and statisticians are more of a borderline case that, without being a snob, I might not consider mathematicians.

9) Someone posted on the blog where this came up: Does Lance consider himself a mathematician? I asked Lance and he said:

For the next 37 days I consider myself a Dean. After that, who knows?





Wednesday, May 21, 2025

The Blog of Record


On Saturday, I had my last Illinois Tech graduation as dean before I step down at the end of June. The College of Computing had nearly 1600 graduates and I shook many, many hands that morning.

After graduation I caught a plane to Washington, DC to attend the wedding of my daughter's college friend. We were invited because we became good friends with the bride's parents when we lived in Atlanta. My last trip before Covid was to DC for an NSF panel and this was my first trip to Washington since. 

The out-of-town guests were housed at the Hyatt Regency Bethesda, coincidentally the same hotel that hosted STOC 2009. My favorite STOC talk of all time took place in that building. But I remembered nothing about the venue except for the jogging path near the hotel.

No trip to the DMV would be complete without seeing my co-blogger Bill Gasarch in person for the first time in years. We chatted about many things, most notably his last few posts where he joked about the new pope's undergraduate math thesis, taken seriously by both our readers and ChatGPT. I reminded Bill that we are the blog of record, often used by Wikipedia as a primary source for much in and out of complexity. Later Bill took out the fake thesis titles.

With graduation behind me, there is not much more for me to do as dean before my term ends. On the plane ride home, I thought about the question everyone asks me: What's next? I have no idea, but it's going to be fun.

Sunday, May 18, 2025

Is Satire Dangerous in the AI-Age?

There have been times when satire has been mistaken for reality. A list of Onion stories that were mistaken for reality (or was it a mistake?) is here. When I say mistaken for reality I mean that a large set of people were fooled.

My own Ramsey-History-Hoax (blog here, latest version of the paper here) has fooled some people; however the number of people is small since the number of people who know the underlying math is small. 

In my last blog (see here) I said that Pope Leo XIV majored in math (that is true) and then I gave a false title for his thesis (I HAVE SINCE REMOVED THE ENTIRE PASSAGE).

Later in the post I said that his ugrad thesis was not on that topic, but gave another false title (I HAVE REMOVED THAT AS WELL.)

I thought the reader would know that it was false, but one comment inquired about it so I left a comment admitting it was false.

This is all very minor: Not that many people read this blog and very few non-math people would care about the topic of the Pope's undergraduate thesis.

The last part of the last sentence is false. It's the POPE! People DO care about his background.

But surely my blog post isn't so well read as to make the fictional title of his thesis a hoax that fools a lot of people.

Even so, I left a comment wondering if LLMs might learn the incorrect title of the Pope's ugrad thesis.

A reader named E posted the following:

 It might be too late. I did this search this evening:

E: Did Pope Leo XIV study Ramsey Theory?

Gemini: Pope Leo XIV, whose given name is Robert Francis Prevost,
earned a Bachelor of Science degree in mathematics from Villanova
University in 1977. His undergraduate thesis focused on Rado's Theorem
for Nonlinear Equations.

0) This may not be too bad: one would have to ask about The Pope and Ramsey Theory to get that answer. But in the future this answer might pop up on the question "What did the Pope Study as an Undergraduate" or similar questions.

1) Might future satires or April Fool's Day jokes be mistaken for reality by AI and hence reach a much larger audience than this blog does?

2) If so, should we be careful with what we post (not sure how to do that)?

3) What about people who have a much larger following than complexityblog  (yes, there are such people)?

4) In the past one had to be a celebrity or similar to change people's perception of reality (see Stephen Colbert and Wikipedia here). Now a complexity blogger may be able to change people's perception of reality. Hence I ask

Is Satire Dangerous in the AI-Age?