Monday, May 11, 2026

Searches Are Weird! No they're not! Bad coding style?

In David Marcus's guest post on good coding style (see here)  he reviewed a book from 1986 called  "Professional Pascal."

I wondered if it was still in print and could be bought:

1) I went to Amazon and searched all products for Professional Pascal. I got this, which is not that book.

2) I then restricted to books, and I got the same, though later on the page I got a relevant book, here.

3) I then searched for Professional Pascal on Google, and got the Amazon site for the book here.

David thought this was weird. I did not. As I put it:

A computer does something which makes no sense. This is common, hence it's not weird.

Why did the search from outside of Amazon do better than the search inside of Amazon?

Speculation

1) Search is just a really hard problem.

2) The coders at Amazon did not use good coding style. They should read the book. If they can find it. 

Wednesday, May 06, 2026

When do we know someone has died

As the blog of record in computational complexity, we like to bring attention to those in the community who have left us. When we learn of someone in our field who has died, Bill and I will talk to each other and decide whether we should do a social media post or a full blog post, and who should write it, Bill, me, or someone else. In fact, if I call Bill, he'll often answer the phone with "who died?"

We also remember those who passed away during the year in our end-of-year post.

One challenge is how we actually know when somebody has died. Consider Michael Rabin. His death was announced on Wikipedia based on the following announcement in Haaretz, an Israeli newspaper.

Haaretz Obit (Translated by Google)

That's a pretty simple obituary for a very famous computer scientist. Rabin is a common name in Israel, and there easily could have been another professor named Michael Rabin somewhere in the country. Every mention of Michael Rabin's death that I saw was just citing the Wikipedia article, nothing from Tal Rabin or some other source that cited the family.

By the time Bill put up the first Rabin post on April 22, we figured that had our Michael Rabin not died, someone would have come forward about it. 

Tony Hoare is a different story, where our blog was one of the first to break the news. I heard from two separate people that they had heard from the family that he had passed away. It helped that I was in Oxford at the time, where Hoare spent much of his career.

And too often a theoretical computer scientist passes away but the news never reaches us and we don't remember them. It's always sad when someone passes, but it is a good opportunity to remember how they helped shape our field. But we need your help to know when someone has passed away. So if you know someone in our community has passed away, please let us know, and how you know, so that we can know we know.

Sunday, May 03, 2026

A few notes on Michael Rabin

Michael Rabin passed away on April 14, 2026. I blogged about him here.

My post listed results of his that proved upper and lower bounds on problems. My point was that he proved upper and lower bounds for MANY different levels- from decidable to regular. And I am sure I left out some of his results. 

Here are some things I did not mention.

1) Rabin and Scott shared the Turing Award in 1976.  My not mentioning it raises the following question:

If I want to say someone has an impressive set of results, which is the better way:

listing the awards they've won, or

listing  their results. 

I leave this to the reader. 

2) I had Rabin for two graduate courses at Harvard: Algorithms and Complexity Theory. He was a great teacher and gave insights into the results, some of which he had either proven or worked on.

3) I recalled thanking him in my PhD Thesis. So I dusted it off to see what I had said: 

The many courses I have taken at Harvard and MIT have helped me create this thesis. I am especially indebted to Michael Rabin, Mike Sipser, and Michael Stob for their excellent courses in algorithms, complexity theory, and recursion theory. Their pedagogy has been an inspiring example of what good teaching can and should be.

What is the probability that all three great teachers were named Michael? I do not know; however, I suspect Michael Rabin could have told me.




Wednesday, April 29, 2026

Because It Doesn't Have To

My favorite quote about networking came from Jim Kurose.

The Internet works so well because it doesn't have to.

The IP and lower layers of the internet stack make no promises of delivery. Complete failure still fulfills the protocol. This allows for simpler and more powerful protocols without the extra complexity needed to guarantee success. TCP achieves reliable delivery essentially by retransmitting when the IP layer drops packets, and even TCP can report failure to the layers above.

We can say the same about modern artificial intelligence.

Machine learning works so well because it doesn't have to.

With the softmax function that neural nets use to determine the probability of outputs, neural nets never completely rule out a possibility, always giving it at least some tiny probability. In cases where the complexity is just too difficult, neural nets give several possibilities with nontrivial probabilities, as I described in my recent post, where a machine learning model would generate a uniform distribution to capture the output of a pseudorandom generator. Instead of rigidly forcing the model to give us a specific answer, by looking at distributions we allow the models to make mistakes.
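For concreteness, here is a minimal sketch (my own illustration, with made-up logit values) of why a softmax layer never assigns probability exactly zero to any output:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability.  The exponential of every
    # (shifted) logit is strictly positive, so no output ever gets
    # probability exactly zero.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Even a heavily disfavored output keeps a tiny but nonzero probability.
probs = softmax([10.0, 2.0, -20.0])
print(probs)                       # roughly [0.99966, 0.00034, 9.4e-14]
print(all(p > 0 for p in probs))   # True
```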

Thus a machine learning model can be correct when it makes probabilistic guesses in situations too complicated to solve directly, which allows it to achieve its best possible performance. Because we allow the models to make mistakes, they have the flexibility to solve complex problems far more frequently.

Sunday, April 26, 2026

LEAPing into the Future of Coding

A few months ago in Oxford, Bernard Sufrin, an emeritus fellow, said he's looking to hire a student to implement LEAP (Logic Engine for Argument by Pointing), a way to teach logic by proving basic logic theorems via pointing and clicking. Rahul Santhanam said why not give it to AI. Bernard said AI can't handle this task. I thought why not give it a try.

I gave Claude a single prompt that Bernard helped me formulate:

Architecture of a system to support proof by pointing in first order logic using LEAP

About an hour later, without giving additional details, we had a working prototype. Give it a try. I put the code and some more details on GitHub.

Watching Claude work was amazing. Claude created an architecture based on the paper Proof by Pointing by Yves Bertot, Gilles Kahn and Laurent Théry.

Claude then asked me:

Would you like me to proceed with implementing any of these layers?

Why not? So I told Claude to go ahead.

It started coding and it created 78 test cases and kept debugging itself until all those test cases worked out. It took me longer to get the program working on the web than Claude took to program it. I sent the link to Bernard who responded "Wow! I take it back if the program was all synthesized from the prompt." 

When I asked Bernard a month later if I could post about this program, he agreed, though he had some additional comments.

The consolation for me is that the outcome, though surprising and I suppose delightful (though it didn't manage rules for quantifiers), didn't refute my conjecture that Claude et al cannot do "architecture". The architecture (MVC) is stock; the "proof by pointing" hint led it to a paper of the same name which gave details of how to derive terms/formulae/goals in the (unfinished) proof from screen coordinates. Maybe you could give a substantively different prompt.

It may be of interest to know that the Bornat/Sufrin JAPE program, still on GitHub, was eventually a lot more ambitious when it came to selecting things: terms, subterms, goals. But it took more than 2 hours to build!

Bernard has a point. It wasn't just the fifteen-word prompt alone. Claude leaned on the Bertot et al. paper to guide its design and implementation. Still, that Claude did architect, build and debug the system from the prompt and the paper is truly impressive. We've truly crossed a threshold for coding, far beyond what would have been possible just a few months earlier.

Wednesday, April 22, 2026

Michael Rabin Passed Away on April 14, 2026, at the age of 94

Michael Rabin passed away on April 14, 2026 at the age of 94.  (Scott Aaronson has also blogged about his passing, see  here.) 


I had many points to make about him; however, the first one got so long that I will just do that one for today's blog post.

Rabin is an extremely well-rounded \(\ldots\) computer scientist? Computer scientist seems too narrow, and the point of this point is that he was well rounded. So I will start this thought again.

The following is an extremely important question that permeates computer science, mathematics, and I am sure other fields:

Given a problem, how hard is it?

Note that this is a rather broad problem since the terms problem and hard are very broad.  And by hard I mean upper bounds and lower bounds.

Rabin had worked on this problem in many different domains. I list them in roughly the order of hardness, starting from the hardest.

1) Recursive Algebra: Mathematicians had proven that every field has an algebraic closure (an extension that is algebraically closed). In 1960 Rabin asked, and answered affirmatively: does every computable field (the elements of the field are computable, and \(\times\) and \(+\) are computable) have a computable algebraic closure? This was an early result in what would become Nerode's Recursive Math Program, which later became a subcase of the Friedman/Simpson Reverse Math Program.

2) The Decidability of S2S: In 1969 Rabin proved that the second order theory of 2 successors (S2S) is decidable. In a course I had with him he taught the proof that the weak second order theory of 1 successor (WS1S) is decidable. I teach that when I teach Automata Theory since it brings together Finite Automata and Decidability. Here are my slides: here

S2S is one of the only decidable theories in which one can actually state mathematical theorems of interest. (I may blog about that some other time.)

S2S is the decidable theory with the hardest proof of its decidability. Rabin's proof used transfinite induction, though later proofs did not.

Personal Note: I was the subreferee on a paper by Gurevich and Harrington that simplified the proof tremendously, back in the early 1980's. Their proof is the one to read now.  Rabin was happy that the proof was simplified. The proof is still hard, just not as hard. 


3) In 1974  Fisher & Rabin showed that any algorithm to decide Presburger arithmetic required time \(\ge 2^{2^{cn}}\) for some constant \(c\). I asked chatty if better is known and it gave me answers that didn't make sense. An earlier version of this paragraph said so and had some incorrect statements in it. One of the commenters told me what is TRUE which made it easier to look it up. Anyway, there is a triple-exp algorithm, and there is a complexity class that Presburger is complete for---see the comment. Spellcheck thinks Presburger is spelled incorrectly. I can understand why it thinks so, but its wrong. 

When Rabin taught this result he pointed out that Hilbert and others not only thought that math was decidable but also that perhaps that algorithm could be used to really do math. The complexity results show that even if a theory is decidable it may still be really hard. (With AI maybe we can use computers to do math, but that is way too big a topic, and too big a tangent, to get into within a blog-obit).

4) In 1972 Rabin proved the following. Given linear forms \(L_1(x),\ldots,L_m(x)\) over the reals (so the intent is \(x\in R^n \)) we want to know if there exists \(x\in R^n \) such that, for all \(1\le i\le m \), \(L_i(x)>0 \).  The model of computation is a decision tree where each internal node can ask a question of the form \(L(x) {\rm\  RELATION\  } 0 \) where RELATION can be any of \(<,=,>\). Each leaf is labelled YES or NO.  Then the depth of the tree is \(\ge 2^{\Omega(n)} \).  This is an early paper in decision tree complexity.

5) In 1977 the Handbook of Math Logic came out. Rabin did the article on Decidable theories. He mentioned P vs NP as being really important and being the next logical (no pun intended) direction for logic. He was the only author to mention P vs NP.

6) While Rabin did not invent randomized algorithms he worked on them a lot early on.  In 1977 Solovay and Strassen obtained a polytime randomized algorithm for primality.  In 1976 Rabin noticed that Miller's Primality algorithm (which showed that if the Extended Riemann Hypothesis is true then primality is in P) could be modified to be a randomized polynomial time algorithm for primality. While preparing this blog post I noticed that I often hear about the Miller-Rabin algorithm (many cryptography protocols need a primality algorithm) but I hardly ever hear about the Solovay-Strassen algorithm. I asked Google AI why this was. In brief: Miller-Rabin is faster, has a lower probability of error, and is simpler. Is Google AI correct? I think yes since Miller-Rabin is used and Solovay-Strassen is not. I might be employing circular reasoning here. 

The Miller-Rabin primality test might be his second best known work, the best known being the last item on this list, the result of Rabin and Scott that NFA's are equivalent to DFA's.  (It's last since it is the lowest complexity class he worked on.)
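Since the Miller-Rabin test comes up so often, here is a minimal sketch of the textbook randomized version (my own illustration, not a transcription of Miller's or Rabin's papers):

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin: False means n is definitely composite; True means n is
    prime with error probability at most 4**(-rounds)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = 2^r * d with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a is a witness that n is composite
    return True

print(is_probable_prime(2**61 - 1))   # True: a Mersenne prime
print(is_probable_prime(2**61 + 1))   # False: divisible by 3
```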

 
7) Rabin obtained other randomized algorithms. I mention two:

a) Rabin-Shallit: When teaching automata theory I often give my students the following sets in NP and ask them to determine which ones are known to be in P:

\( \{ x \in N \colon (\exists y)[x=y^2] \} \)

\( \{ x \in N \colon (\exists y_1,y_2)[x=y_1^2+ y_2^2] \} \)

\( \{ x \in N \colon (\exists y_1,y_2,y_3)[x=y_1^2+ y_2^2+ y_3^2] \} \)

\( \{ x \in N \colon (\exists y_1,y_2,y_3,y_4)[x=y_1^2+ y_2^2+ y_3^2+y_4^2] \} \)

\( \{ x \in N \colon (\exists y_1,y_2,y_3,y_4,y_5)[x=y_1^2+ y_2^2+ y_3^2+y_4^2+y_5^2] \} \)

etc.

The input is in binary so searching over all possible values of the \(y_i\) is exponential in the length of the input.

I leave the first three to the reader.

This one:

\( \{ x \in N \colon (\exists y_1,y_2,y_3,y_4)[x=y_1^2+ y_2^2+ y_3^2+y_4^2] \} \)

is sort-of a trick question. By Lagrange's four-square theorem, every natural number is the sum of 4 squares, so this set, and all later ones, are trivially in P. But that raises the question: how do you find \( y_1,y_2,y_3,y_4 \)? This is a curious case of a problem in NP where the decision part is in P (in fact, trivial), but finding the witness seems hard.

When I first assigned this problem I then looked up what was known about finding the \(y_1,y_2,y_3,y_4\). 

In 1986 Rabin and Shallit showed that, assuming ERH, there is a randomized \( O(\log^2 n) \) algorithm for finding such a representation. I am surprised that you need both ERH and randomness. This seems to be a less well known result, though I don't know why since it's a natural question.
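To see why the Rabin-Shallit result is interesting, here is the obvious brute-force witness search (my own illustration); it runs in time polynomial in \(x\) itself, which is exponential in the number of bits of \(x\):

```python
from math import isqrt

def four_squares_brute_force(x):
    # Find y1..y4 with x = y1^2 + y2^2 + y3^2 + y4^2 by exhaustive search.
    # Lagrange's theorem says a solution always exists, so deciding membership
    # is trivial; it is finding the witness that takes exponential time here.
    for y1 in range(isqrt(x) + 1):
        for y2 in range(isqrt(x - y1 * y1) + 1):
            for y3 in range(isqrt(x - y1 * y1 - y2 * y2) + 1):
                r = x - y1 * y1 - y2 * y2 - y3 * y3
                y4 = isqrt(r)
                if y4 * y4 == r:
                    return (y1, y2, y3, y4)

print(four_squares_brute_force(2026))   # (0, 0, 1, 45): 0 + 0 + 1 + 2025 = 2026
```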

b) Karp-Rabin: In 1987 Karp and Rabin obtained a really fast, really simple (that's a plus in the computer science world)  randomized pattern matching algorithm.  Since it is fast and simple I wondered if it is really used. It is! To quote Google AI:

Yes, the Karp-Rabin algorithm is used in the real world, particularly in scenarios requiring the detection of multiple patterns simultaneously, such as plagiarism detection, data deduplication, and bioinformatics.

Is Google AI correct? I leave that as an exercise for the reader. 
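Here is a minimal sketch of the rolling-hash idea behind Karp-Rabin (my own illustration; a fully randomized version would pick the modulus or base at random, which I skip to keep it short):

```python
def karp_rabin(pattern, text, base=256, mod=(1 << 61) - 1):
    """Return the starting indices of all occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)   # weight of the leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    matches = []
    for i in range(n - m + 1):
        # On a hash hit, verify the substring to rule out (unlikely) collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:   # slide the window: drop text[i], append text[i+m]
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return matches

print(karp_rabin("aba", "abacababa"))   # [0, 4, 6]
```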

8) In 1979 Rabin devised a cryptosystem whose security is equivalent to factoring. How come RSA became the standard and not Rabin's system? Broadly, there are two possibilities:

a) RSA was better.

b) RSA was faster to the marketplace and other random factors.

Which is it? I leave that to the reader.
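For the curious, here is a toy sketch of Rabin's system for primes \(p, q \equiv 3 \pmod 4\) (my own illustration; a real implementation adds redundancy to the message so the receiver can pick out the right square root, which I omit):

```python
def rabin_encrypt(m, n):
    # The public key is n = p*q; encryption is just squaring modulo n.
    return pow(m, 2, n)

def rabin_decrypt(c, p, q):
    """Return all four square roots of c modulo n = p*q, assuming p and q are
    distinct primes congruent to 3 mod 4 (so roots mod each prime are
    c^((p+1)/4)).  The plaintext is one of the four roots."""
    n = p * q
    r_p = pow(c, (p + 1) // 4, p)
    r_q = pow(c, (q + 1) // 4, q)
    inv_p_mod_q = pow(p, -1, q)   # modular inverses for the CRT (Python 3.8+)
    inv_q_mod_p = pow(q, -1, p)
    roots = set()
    for a in (r_p, p - r_p):          # the two roots mod p
        for b in (r_q, q - r_q):      # the two roots mod q
            roots.add((a * q * inv_q_mod_p + b * p * inv_p_mod_q) % n)
    return roots

p, q = 103, 107          # toy primes, both congruent to 3 mod 4
m = 4242                 # message, must be less than n = p*q
c = rabin_encrypt(m, p * q)
print(m in rabin_decrypt(c, p, q))   # True: the plaintext is among the roots
```

The equivalence to factoring comes from the fact that anyone who can extract arbitrary square roots modulo n can be used to split n into p and q.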

9) In 1958 Rabin and Scott showed that NFAs are equivalent to DFAs. This may be his best known work.
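A minimal sketch of the subset construction at the heart of that result (my own illustration, with a made-up toy NFA):

```python
def nfa_to_dfa(nfa_delta, start, accepting, alphabet):
    """Subset construction: nfa_delta maps (state, symbol) -> set of states.
    Each DFA state is the frozenset of NFA states reachable so far."""
    dfa_start = frozenset([start])
    dfa_delta, dfa_accept = {}, set()
    todo, seen = [dfa_start], {dfa_start}
    while todo:
        S = todo.pop()
        if S & accepting:
            dfa_accept.add(S)
        for a in alphabet:
            T = frozenset(t for s in S for t in nfa_delta.get((s, a), set()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return dfa_delta, dfa_start, dfa_accept

# Toy NFA over {0,1} for "the second-to-last symbol is 1".
nfa = {('q0', '0'): {'q0'}, ('q0', '1'): {'q0', 'q1'},
       ('q1', '0'): {'q2'}, ('q1', '1'): {'q2'}}
delta, start, accept = nfa_to_dfa(nfa, 'q0', {'q2'}, '01')
print(len({S for (S, _) in delta}))   # 4 reachable DFA states
```

The construction can blow up the number of states exponentially, and that blowup is unavoidable in the worst case, but the equivalence of the two models is exactly what Rabin and Scott proved.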



Thursday, April 16, 2026

Machine Learning and Complexity

At Oxford I focused my research and discussions on how we can use the tools of computational complexity to help us understand the power and limitations of machine learning. Last week I posted my paper How Does Machine Learning Manage Complexity?, a first step in this direction. Let me give a broad overview of the paper. Please refer to the paper for more technical details.

Instead of focusing on the machine learning concepts like neural nets, transformers, etc., I wanted to abstract out a model defined in terms of complexity, as time-efficient non-uniform computable distributions with minimum probabilities. Let's unpack this abstraction.

Neural nets as next token predictors don't always give the same next token; rather, they give a distribution over tokens, which leads to a distribution on text of length up to the size of the context window. The way probabilities are computed via a softmax function guarantees that every output can occur, with at least an exponentially small probability, the "minimum probability" in the abstraction.

In computational complexity, we study two main kinds of distributions. A distribution is sampleable if there is an algorithm that takes in uniformly random bits and outputs text according to that distribution. A distribution is computable if we can compute not only the probability that a piece of text occurs, but the cumulative distribution function, the sum of the probabilities of all outputs lexicographically less than that text. Every efficiently computable distribution is efficiently sampleable, but likely not the other way around.

Neural nets as next token predictors turn out to be equivalent to computable distributions. We need these kinds of distributions both for how neural nets are trained and for how we use them. Computable distributions allow for conditional sampling, which lets us use a large language model to answer questions or have a conversation. You can't get conditional sampling from a distribution that is merely sampleable.
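Here is a toy sketch of that point (my own illustration with a made-up two-token model): once we can compute the per-token conditional probabilities, we can both compute the probability of any full string and sample completions of any prefix.

```python
import random

TOKENS = "ab"

def next_token_probs(prefix):
    # Hypothetical next-token model: a slight preference for repeating 'a'.
    if prefix and prefix[-1] == 'a':
        return {'a': 0.7, 'b': 0.3}
    return {'a': 0.4, 'b': 0.6}

def prob(text):
    """Probability of a full string under the induced distribution,
    computed by multiplying out the conditional probabilities."""
    p = 1.0
    for i, t in enumerate(text):
        p *= next_token_probs(text[:i])[t]
    return p

def sample_conditioned(prefix, length):
    """Conditional sampling: extend the given prefix one token at a time."""
    out = prefix
    while len(out) < length:
        ps = next_token_probs(out)
        out += random.choices(TOKENS, weights=[ps[t] for t in TOKENS])[0]
    return out

print(prob("ab"))                   # 0.4 * 0.3 = 0.12
print(sample_conditioned("a", 5))   # a random length-5 completion of "a"
```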

Neural nets have a limited number of layers, typically about a hundred, which prevents them from handling full time-efficient (polynomial-time) computations. They can get around this restriction either with reasoning models that allow a model to talk to itself, or by directly writing and executing code, which can run efficient algorithms of any kind.

Most of the algorithms we use, think Dijkstra or matching, are uniform: the same algorithm runs on graphs of twenty nodes as on twenty million. But neural nets change their weights based on the distributions they train on. These weights encode a compressed view of that data and the extra computation needed to process that data. So we have different algorithms as our technology and data sources improve. I wrote more about this connection between AI and nonuniformity last fall.

What does this abstraction buy us?

Restricting to computable distributions forces machine learning algorithms to treat complex behavior as random behavior, much like we treat a coin flip as a random event because it is too complicated to work out all the physical interactions that would determine which side it lands on.

We illustrate this point by our main result. Let D be the distribution of outputs from a cryptographic pseudorandom generator and U be the uniform distribution on words of the same length. If \(\mu\) is a time-efficient non-uniform computable distribution with minimum probabilities and \(\mu\) does as good a job as, or a better job than, U of modeling D, then \(\mu\) and U are essentially the same distribution. Machine learning models the complex PRG as a uniform distribution, simplifying what we can't directly compute by using randomness.

More in the paper and future blog posts.

Tuesday, April 14, 2026

Guest Post from Peter Brass, Former NSF Theory Director, on the NSF budget.

 Guest post from Peter Brass, Former NSF Theory director (though not affiliated with the NSF now) on the White House NSF budget for FY 2027.

---------------------------------------------

Dear Colleagues

A week ago the White House released the NSF budget request for FY 2027 (here), so I want to provide you an update. As always, this is just my interpretation; I am not connected to the NSF any more.

The general news is bad: we get a replay of FY 2026 (FY 2026 is October 2025 to September 2026).

For context: the NSF enacted budget in FY 2023 was $9.5B. The CISE budget was $0.9B. Budgets start with a request by the White House, and then get negotiated with Congress.

A year ago, the White House made a FY 2026 budget request of $3.9B, which was a ~60% cut from the previous level. After negotiation with Congress, the enacted budget became $8.7B.

However, a long period of FY 2026 was funded by continuing resolutions, and continuing resolutions are based on the White House budget request, so for the first half of FY 2026 the 60% cut was in effect, and awards only now pick up. In addition, there was the government shutdown, and the removal of NSF from the Eisenhower Ave building in Alexandria, so processing will be delayed.

The cancellation of awards, although it caused a huge media echo, had a very small impact outside the EDU and SBE directorates, about 2% of awards were affected, and perhaps 1/3 of them were restored.

The FY 2027 budget request asks for $4.8B, but subtracting $0.9B reserved for a new Antarctic research vessel, the request is effectively the same $3.9B. We hope that Congress will again restore much, but clearly the White House remains unsupportive of basic research. The proposed cuts for NSF are a larger percentage than for NASA, NIH, NIST, or DoE research. The request reduces the CISE budget from $0.9B to $0.3B.

The NSF projects a slight decrease in the number of proposals, and a huge decrease in the funding rate: across all research proposals, a decline from 18% to 7%. The average award size and award duration are projected to increase slightly, giving far fewer people a bit more resources.

So far NSF has not done any hiring since the start of the current administration, and although Congress restored much of the research funding, it did not restore the cuts in the staffing level. Rotators continue to end their rotation and are not replaced; the current plan seems to be to discontinue the concept of a rotator entirely.

With the reduction in program staff, the remaining program directors are overworked, and are put in charge of programs where they have no previous connection to the research community, without time to establish the connection. The programs themselves remain unchanged, as they are governed by the solicitations, but the people managing the programs can give less time to the individual program. And the money available depends on the outcome of the budget process.

That is the current situation, or at least the plan of the White House; we will see what Congress makes of it.

TO DO: tell your member of Congress that you thank them for their previous support of basic research at the NSF, and hope they continue it.



Thursday, April 09, 2026

Afterthoughts on Banach-Tarski and the Miracle of Loaves and Fishes

I posted about using the Banach-Tarski Paradox (BT) to explain the miracle of Loaves and Fishes (LF) here.

Darling says that whenever I fool my readers or my students I have to tell them later, so I'll tell you now: the story about me meeting with the Pope and talking about the BT Paradox (that would be a good name for a rock band: B-T-Paradox) was not true. I think my readers know that.

 

1) I first learned the Banach-Tarski Paradox as a grad student in 1981 when I read Hilary Putnam's article Models and Reality, where he writes on Page 470:

One cannot simply sit down in one's study and ``decide'' that ``V=L'' is to be true, or that the axiom of choice is to be true. Nor would it be appropriate for the mathematical community to call an international convention and legislate these matters. Yet, it seems to me that if we encountered an extra-terrestrial species of intelligent beings who had developed a high level of mathematics, and it turned out they rejected the axiom of choice (perhaps because of the Tarski-Banach Theorem), it would be wrong to regard them as simply making a mistake. To do that would, on my view, amount to saying that acceptance of the axiom of choice is built into our notion of rationality itself; that does not seem to me to be the case.

I agree with him and I wonder if we accept AC too readily. See my blog post on the BT paradox and my wife's strong opinion (she's against it).    

2) Back in 1981 my first thought was I wonder if someone has thought to use the BT paradox to explain the LF? And if so, were they serious or was it some kind of  joke? And does the Pope really get a discount at Pope-Yes? 

I also had the meta-thought (which I could not have said as cleanly as I will now):

 I wonder how I could find out if anyone else has thought of the BT-LF connection? 

Recall that back in 1981 the Internet was but a glint in Al Gore's eyes. So back then I could not find out if anyone else had thought of BT-LF.

But now I can! And indeed, as I expected, some other people have made the connection of BT to LF: 

A tweet and a Reddit thread discussing the tweet: here. Not serious 

A serious article, I think, here   

A 24-page article about Holy Water and BT. I can't imagine an article that long being a parody so I think it's serious, see here. On the other hand, there is a 12-page article about Ramsey Theory and History that I think is supposed to be a parody, see here. (My proofreader points out that a different definition of parody is a feeble or ridiculous imitation, so the article may well be a parody in that sense.)

A parody article, I think, here

I am sure there are more. 

3) I had thought of doing a blog post about BT and LF  a long time ago,  but Pope Leo having a math degree was the final push I needed.

4) The word cardinal has three very different meanings: (a) a type of infinity, (b) a position in the Catholic church, (c) the bird.  Same for large cardinal.

5) One of my students who proofread the post thought that people will know it's a hoax since I am a vegetarian and hence would not eat at Pope-Yes, even if the Pope was paying. 

 

Sunday, April 05, 2026

Fun Little Solutions

Here are the solutions to the problems I posted last week.

Problem 1

A language \(L\) is commutative if for all \(u\), \(v\) in \(L\), \(uv = vu\). Show that \(L\) is commutative if and only if \(L\) is a subset of \(w^*\) for some string \(w\). The "only if" direction is surprisingly tricky.

Answer

For the "if" direction, suppose \(L \subseteq w^*\). Then every \(u, v \in L\) can be written as \(u = w^i\) and \(v = w^j\), so \(uv = w^{i+j} = vu\).

For the "only if" direction, assume \(L\) is commutative. We may assume \(L\) contains a non-empty string. Let \(u\) be the shortest such string, let \(m\) be the greatest common divisor of the lengths of all non-empty strings in \(L\), and let \(w\) be the prefix of \(u\) of length \(m\). We use the following lemma.

Lemma. If \(xy = yx\) with \(x, y\) non-empty, then both are powers of their common prefix of length \(\gcd(|x|, |y|)\).

Proof of Lemma. By strong induction on \(|x| + |y|\). If \(|x| = |y|\), comparing the first \(|x|\) characters gives \(x = y\), and both equal that string to the first power. If \(|x| < |y|\) (WLOG), comparing the first \(|x|\) characters of \(xy\) and \(yx\) shows \(x\) is a prefix of \(y\). Write \(y = xz\). Then \(x(xz) = (xz)x\) simplifies to \(xz = zx\). Since \(|x| + |z| < |x| + |y|\), the inductive hypothesis gives that \(x\) and \(z\) are powers of their common prefix of length \(\gcd(|x|, |z|) = \gcd(|x|, |y|)\). Since \(y = xz\), \(y\) is a power of this prefix too. \(\square\)

Since \(m\) divides \(|u|\) and, for each \(v \in L\), the lemma gives \(u = r_v^{|u|/\gcd(|u|,|v|)}\), combining these periodicities shows \(u = w^{|u|/m}\). 

Now for any non-empty \(v\neq u \in L\), commutativity gives \(uv = vu\), so by the lemma, \(u\) and \(v\) are both powers of the prefix \(r\) of \(u\) of length \(\gcd(|u|, |v|)\). Since \(m\) divides both \(|u|\) and \(|v|\), it divides \(|r|\), so \(r\) is itself just \(w\) repeated \(|r|/m\) times. Therefore \(v = w^{|v|/m}\). \(\square\)
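A quick sanity check of the lemma (my own illustration), for strings over a small alphabet:

```python
from math import gcd

def check_lemma(x, y):
    """If xy = yx then x and y should both be powers of their common
    prefix of length gcd(|x|, |y|)."""
    if x + y != y + x:
        return True   # the lemma says nothing in this case
    g = gcd(len(x), len(y))
    w = x[:g]
    return x == w * (len(x) // g) and y == w * (len(y) // g)

print(check_lemma("abab", "ababab"))   # True: both are powers of "ab"
print(check_lemma("ab", "ba"))         # True (vacuously: "abba" != "baab")
```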


Problem 2

Let an NP machine be sparse if for some \(k\), it has at most \(n^k\) accepting paths for every input of length \(n\). The class FewP is the set of languages accepted by sparse NP machines. Let \(\#N(x)\) be the number of accepting paths of \(N(x)\). Show that if P = FewP then \(\#N(x)\) is polynomial-time computable for any sparse NP machine \(N\).

Answer

The obvious approach is to create a machine \(N'(x, k)\) that accepts if there are at least \(k\) accepting paths of \(N(x)\). But this fails: if \(N(x)\) has \(2n\) accepting paths then \(N'(x, n)\) will have exponentially many accepting paths.

Instead, define \(N'(x, w)\) that accepts if \(N(x)\) has an accepting path starting with \(w\), and use tree search to find all the accepting paths. 
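Here is a sketch of that tree search (my own illustration). The hypothetical function has_accepting_path_with_prefix stands in for the polynomial-time decision procedure that the assumption P = FewP gives us for the language of \(N'\); sparseness keeps the number of live branches, and hence the number of calls, polynomial.

```python
def count_accepting_paths(x, path_length, has_accepting_path_with_prefix):
    """Count the accepting paths of a sparse NP machine on input x."""
    count = 0
    stack = [""]
    while stack:
        w = stack.pop()
        if not has_accepting_path_with_prefix(x, w):
            continue          # no accepting path extends w: prune this branch
        if len(w) == path_length:
            count += 1        # w itself is a full accepting path
        else:
            stack.extend([w + "0", w + "1"])
    return count

# Toy check: pretend the accepting paths on input x are {"010", "110", "111"}.
ACCEPTING = {"010", "110", "111"}
oracle = lambda x, w: any(p.startswith(w) for p in ACCEPTING)
print(count_accepting_paths("x", 3, oracle))   # 3
```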


Problem 3

Let PERM(\(L\)) be the set of all permutations of the strings in \(L\). For example, PERM(\(\{000, 010\}\)) = \(\{000, 001, 010, 100\}\). Are regular languages closed under PERM? How about context-free languages?

Answer

Regular languages are not closed under PERM. Let \(L = (01)^*\); then \(\mathrm{PERM}(L) \cap 0^*1^* = \{0^n 1^n\}\), which is not regular. Similarly, context-free languages are not closed under PERM in general: \(\mathrm{PERM}((012)^*)\) is not context-free.

However, over a binary alphabet \(\{a, b\}\), if \(L\) is context-free then \(\mathrm{PERM}(L)\) is context-free. Over a binary alphabet, two strings are permutations of each other if and only if they have the same number of each letter, so \(\mathrm{PERM}(L)\) depends only on the Parikh image \(\Pi(L) = \{(|u|_a, |u|_b) : u \in L\}\).

By Parikh's theorem, \(\Pi(L)\) is semilinear for any CFL \(L\), and every semilinear set is the Parikh image of some regular language \(R\). Thus \(\mathrm{PERM}(L) = \mathrm{PERM}(R)\), and it suffices to show \(\mathrm{PERM}(R)\) is context-free for regular \(R\).

Given a DFA \(A\) for \(R\), construct a PDA that, on input \(w\), nondeterministically selects a rearrangement \(u\) of \(w\) while simulating \(A\) on \(u\). The stack tracks the running imbalance \(\Delta_i = |\{a\text{'s in } u_1\cdots u_i\}| - |\{a\text{'s in } w_1 \cdots w_i\}|\) in unary, while the finite control tracks the DFA state and sign of \(\Delta_i\). The PDA accepts iff the DFA reaches a state in \(F\) and the stack is empty (i.e., \(|u|_a = |w|_a\)), establishing \(\mathrm{PERM}(R)\) is context-free.


Problem 4

Suppose you have a one-tape Turing machine where we allow the transition function to move the head left, right, or stay put. Show there is an equivalent one-tape Turing machine that only moves the head left or right — and do it without increasing the size of the state space or tape alphabet.

Answer

For each pair \((q, a)\) with \(\delta(q, a) = (p, b, S)\), precompute the stay-closure: keep applying \(\delta\) while the move is \(S\). Since only the scanned cell changes, the process evolves on the finite set \(Q \times \Gamma\). Exactly one of three outcomes occurs: you reach \((p', b', D)\) with \(D \in \{L, R\}\); you enter \(q_{\mathrm{acc}}\) or \(q_{\mathrm{rej}}\); or you fall into an \(S\)-only cycle. Define \(\delta'\) by:

  • if the first non-stay step is \((p', b', D)\), set \(\delta'(q, a) = (p', b', D)\);
  • if the closure halts, write the last \(b'\) and move (say \(R\)) into \(q_{\mathrm{acc}}\) or \(q_{\mathrm{rej}}\);
  • if it \(S\)-loops, write any fixed symbol and move (say \(R\)) into \(q_{\mathrm{rej}}\).

Leave all non-\(S\) transitions unchanged. Then \(Q\) and \(\Gamma\) are unchanged, no \(S\)-moves remain, and the accepted language is preserved.
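Here is a sketch of the stay-closure precomputation (my own illustration, with hypothetical names; it assumes delta is defined on every (state, symbol) pair for non-halting states):

```python
def remove_stay_moves(delta, q_acc, q_rej, default_symbol):
    """delta maps (state, symbol) -> (state, symbol, move), move in 'LRS'.
    Returns an equivalent table with no 'S' moves, over the same states
    and tape alphabet."""
    new_delta = {}
    for (q, a) in delta:
        p, b, move = delta[(q, a)]
        seen = set()
        # Follow S-moves: only the scanned cell changes, so the walk lives in
        # the finite set Q x Gamma and must leave, halt, or cycle.
        while move == 'S' and p not in (q_acc, q_rej) and (p, b) not in seen:
            seen.add((p, b))
            p, b, move = delta[(p, b)]
        if move != 'S':
            new_delta[(q, a)] = (p, b, move)                  # first real move
        elif p in (q_acc, q_rej):
            new_delta[(q, a)] = (p, b, 'R')                   # halting: any move will do
        else:
            new_delta[(q, a)] = (q_rej, default_symbol, 'R')  # S-only cycle: reject
    return new_delta
```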


Problem 5

Let E be the set of problems solvable in time \(2^{O(n)}\). Show unconditionally that E \(\neq\) NP.

Answer

EXP, the set of problems solvable in time \(2^{n^{O(1)}}\), has a complete problem (under polynomial-time reductions) that lies in E. So if E = NP then that problem is in NP, and since NP is closed under polynomial-time reductions, NP = EXP, which gives E = EXP, violating the time hierarchy theorem.

Note this proof does not say anything about whether or not E is contained in NP or vice versa.


Problem 6

Show there is a computable list of Turing machines \(M_1, M_2, \ldots\) such that \(\{L(M_1), L(M_2), \ldots\}\) is exactly the set of computable languages.

Answer

This is impossible if the \(M_i\) are all total Turing machines (halt on all inputs). But I never made that requirement.

Let \(N_1, N_2, \ldots\) be the standard enumeration of all Turing machines. Define \(M_i(x)\) to accept if \(N_i(y)\) halts for all \(y < x\) and \(N_i(x)\) accepts. If \(N_i\) is total then \(L(M_i) = L(N_i)\). If \(N_i\) is not total then \(L(M_i)\) is finite and hence computable. Thus \(\{L(M_1), L(M_2), \ldots\}\) contains all computable languages and no non-computable ones. 

Wednesday, April 01, 2026

I helped the Pope with his latest Encyclical (His Math Background Helped)

I blogged about Pope Leo XIV here. Pope Leo XIV has an undergraduate degree in mathematics. He saw my post and asked for my help with his latest encyclical. 

LEO: Let's have lunch together at Popeyes.

BILL: Why Popeyes?

LEO: The name is Pope-yes so I get a discount.

BILL: Your treat. [We met at Pope-yes and had the following discussion.]

LEO: I am working on an encyclical to resolve the tension between miracles in the Bible and modern science. 

BILL: What's the issue?

LEO: The Bible has miracles in it that seem to violate the laws of science. There are a few ways to resolve this cosmic conflict.

a) The miracles are allegorical. This is insulting to both God and Man. 

b) The miracles can be explained by natural phenomena. For example:

The Red Sea was split by  a big wind. This is acceptable. The timing of the big wind is the miracle.

BILL: Let me guess the problem: There are some miracles that cannot fit into modern science.

LEO: Exactly!  And I hope that Christians who are scientists (not to be confused with Christian Science, see here) will take up the study of miracles and see how they can fit into modern science.

BILL: Give me an example of a miracle that cannot be resolved with modern science and we'll see what we can do about that.

LEO:  Recall the miracle of loaves and fishes:

---------------------------------------------

A crowd of 4000 came to hear Jesus preach. When he was done they were hungry. 

Jesus told his disciples: 

I have compassion for these people; they have already been with me three days and have nothing to eat.  I do not want to send them away hungry, or they may collapse on the way. What food do we have?

The disciples responded:

Seven loaves and a few small fish.

Jesus told the crowd to sit down on the ground. Then he took the seven loaves and the fish and when he had given thanks, he broke them and gave them to the disciples, and they in turn gave to the people. They all ate and were satisfied. There were even leftovers. 

--------------------------------------------

So how could Jesus take seven loaves of bread and a few fish and feed thousands of people? How can this be explained with modern science?

BILL: I have a way to resolve it but you may not like it.

LEO: Let's hear it.

BILL: Jesus used the Banach-Tarski paradox (see here) --- when he broke the bread, he divided one loaf into 5 pieces, some of which were not measurable, and put them back together to get two loaves. Repeat until you can feed 4000 people. Same with the fishes.

LEO: Great! Why wouldn't I like that?

BILL: It only works if you're pro-(axiom of) choice. 

LEO: I'll have to run this by a subset of my advisors.

BILL: Which subset?

LEO: The Large Cardinals


Sunday, March 29, 2026

Fun Little Problems

Occasionally I run into what I consider fun problems in complexity, that require just a little bit of out of the box thinking. They require some background in theory, but nothing too deep. Some of these problems have been mentioned before on my blog or social media.
  1. A language \(L\) is commutative if for all \(u\),\(v\) in \(L\), \(uv=vu\). Show that \(L\) is commutative if and only if \(L\) is a subset of \(w^*\) for some string \(w\). The "only if" direction is surprisingly tricky.
  2. Let an NP machine be sparse if for some \(k\), it has at most \(n^k\) accepting paths for every input of length \(n\). The class FewP is the set of languages accepted by sparse NP machines. Let \(\#N(x)\) be the number of accepting paths of \(N(x)\). Show that if P=FewP then \(\#N(x)\) is polynomial-time computable for any sparse NP machine \(N\). 
    Richard Beigel gave me this problem and told me the second thing I tried would work. He was right.
  3. Let PERM(\(L\)) be the set of all permutations of the strings in \(L\). For example, PERM(\(\{000,010\}\)) is \(\{000,001,010,100\}\). Are regular languages closed under PERM? How about context-free languages?
  4. Suppose you have a one-tape Turing machine where we allow the transition function to move the head left, right or stay put. Show there is an equivalent one-tape Turing machine that only moves the head left or right. Not hard, but now do it without increasing the size of the state space or tape alphabet.
  5. Let E be the set of problems solvable in time \(2^{O(n)}\). Show unconditionally that E \(\ne\) NP.
  6. Show there is a computable list of Turing machines \(M_1,M_2,\ldots\) such that \(\{L(M_1),L(M_2),\ldots\}\) is exactly the set of computable languages. A computable list means there is a computable function \(f\) such that \(f(i)\) is a description of \(M_i\). 

Wednesday, March 25, 2026

My Oxford Term

High table dinner at Magdalen

My time in Oxford has come to an end and I head back to Chicago this week. I was a visiting Fellow at Magdalen (pronounced "maudlin") College for the Hilary Term.

There's a six week break between the eight-week Hilary and Trinity terms. They work the fellows hard during the terms with teaching, tutoring, admissions, hiring and various other administrative functions. All the events, seminars, workshops, high-table dinners are scheduled during the term. Pretty much nothing between terms, and many domestic students are forced out of their housing, and many of the fellows/professors leave town as well. An interesting strategy when we Americans get just a week for spring break.

I came here for research, working mostly with my former PhD student Rahul Santhanam, a tutorial fellow at Magdalen, and his students. More on the research in a future post.

I took full advantage of the Magdalen college life, working in the senior common room, having lunch in the winter common room, evensong in the chapel with an outstanding choir and organ, and high table dinner in the hall. I had the same experiences Magdalen fellows have had for centuries, including C.S. Lewis, Oscar Wilde and Erwin Schrödinger. There's also a summer common room with a secret door to the old library, and by old I mean it predates most American universities. Magdalen looks like such a traditional old college that some recent Oxford-set shows, including My Oxford Year and Young Sherlock, had extensive filming there.

As I mentioned earlier, community focuses on the college, not on the departments. I had an office in the CS building but didn't spend that much time there. Every day at Magdalen, particularly at lunch and dinner, I had great conversations with lawyers, biologists, historians, archivists, literature scholars, music historians, stained-glass restorers and the numismatist who manages the 300,000-coin collection of the Oxford Ashmolean museum.

One dinner I sat next to the COO of the new Ellison Institute of Technology, a ten billion dollar venture in Oxford but independent of the university, funded by the Oracle founder. She talked considerably about the famous pub, the Eagle and Child. The pub, nicknamed the Bird and the Baby, was famous as the meeting place of the Inklings, a group of writers including Lewis and Tolkien. It never reopened after Covid; it was purchased by Ellison and is currently being renovated.

Another visiting fellow, Elaine Treharne, was giving a talk on Medieval poetry the same week I talked about complexity and machine learning. We went to each other's talks. Hers was in the brand new Schwarzman Centre for the Humanities, the same Schwarzman from MIT's College of Computing, and mine was in a CS building that's a mishmash of other buildings. She outdrew me two to one.

Sunday, March 22, 2026

A $100 gift card could be legit. A $1000 one is obviously a scam. What should scammers do?

If I get an email offering me $1000 for something (I don't know what, since I ignore it), I don't even bother looking for other signs that it is a scam.

If I get an email offering me $100, I may look more carefully, and often they are legit (most common is to give a pre-publication review of a math book---sometimes just questions, but more often a written report).

Most offers I get are either $1000 or $100. Today I got one for $750 which inspired this post (I ignored the offer without checking). 

Which nets more people: $100 or $1000?

1) If people are like me then $100 fools more people. But people like me will still CHECK CAREFULLY. I sometimes feed the email into ChatGPT for an opinion to see if it's a scam. (Spellcheck still thinks ChatGPT should be spelled catgut.)

2) Are there people who would fill out the survey (or whatever) for $1000 but not for $100? I ask non rhetorically as always. Are such people more gullible?


Would scammers make more money if they offered $100 instead of $1000?

1) More people would fall for the $100 scam. Or maybe not---do some people not bother if it's only $100?

2) Depending on how they are scamming you, will they get less out of it if they only offer $100?

Here are types of scams:

1) They send you a check for $1000 + x and say WHOOPS, please email us a check for $x. I've heard of this in the Can you tutor our daughter in math? scam. For this one, offering $1000 nets the scammer more money, since the $x that comes with a $100 check will be smaller than the $x that comes with a $1000 check.

2) They want to harvest your personal information. For these I don't think they will gain more if they do 1000 vs 100. 

One more thought:

1) I said that for $100 I take it seriously but for $1000 I don't

2) I said that for $750 I do not take it seriously.

3) What's the cutoff?  Obviously $289.

Wednesday, March 18, 2026

Bennett and Brassard Win the Turing Award

Gilles Brassard and Charlie Bennett

Charlie Bennett and Gilles Brassard will receive the 2025 ACM Turing Award for their work on the foundations of quantum information science, the first Turing award for quantum. Read all about it in The New York Times, Science and Quanta.

Bennett and Brassard famously met in the water off a beach during the 1979 FOCS conference in Puerto Rico. That led to years of collaboration, most notably their quantum secure key distribution protocol. The basic idea is pretty simple and does not require quantum entanglement. Alice sends a series of random bits, each encoded either in the standard basis or rotated 45 degrees. Bob measures each of these bits in randomly chosen bases. They discard the bits where they used different bases and use some of the remaining bits to check for eavesdropping, which would collapse the state, and others to set the key.
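Here is a toy classical simulation of the sifting step (my own illustration; no eavesdropper and no actual physics, just the bookkeeping described above):

```python
import random

def bb84_sift(n):
    alice_bits  = [random.randint(0, 1) for _ in range(n)]
    alice_bases = [random.choice('+x') for _ in range(n)]   # '+' straight, 'x' rotated 45 degrees
    bob_bases   = [random.choice('+x') for _ in range(n)]
    # When the bases agree Bob recovers Alice's bit; otherwise his outcome is random.
    bob_bits = [a if ab == bb else random.randint(0, 1)
                for a, ab, bb in zip(alice_bits, alice_bases, bob_bases)]
    alice_key = [a for a, ab, bb in zip(alice_bits, alice_bases, bob_bases) if ab == bb]
    bob_key   = [b for b, ab, bb in zip(bob_bits, alice_bases, bob_bases) if ab == bb]
    return alice_key, bob_key

alice_key, bob_key = bb84_sift(1000)
print(len(alice_key), alice_key == bob_key)   # roughly 500, True
```

In the real protocol some of the surviving positions are compared publicly; an eavesdropper measuring in the wrong basis would disturb the states and show up as disagreements.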

Bennett and Brassard and four other authors showed how to teleport a quantum bit using entanglement and two classical bits. Bennett with Stephen Wiesner gave the dual superdense coding protocol of sending two classical bits using a single quantum bit. 

Bennett and Brassard, with Ethan Bernstein and Umesh Vazirani, showed that in the black-box setting, quantum computers would require \(\Omega(\sqrt{n})\) queries to search \(n\) entries, matching Grover's algorithm. For some reason, the popular press rarely covers these results that limit the power of quantum computing.

I've had the pleasure of knowing both Bennett and Brassard since the 1980s, both just full of enthusiasm and wonderful ideas. 

Let me end by saying I don't see this as a Turing award for quantum computing. Once (if?) we get large scale machines, we'll certainly see Turing awards, if not Nobel Prizes, for Peter Shor, Umesh Vazirani and others.

Sunday, March 15, 2026

For \(R^3\) the problem is open. That's too bad. We live in \(R^3\)

(If you live in Montgomery County Maryland OR if you care about Education, you MUST read this guest blog by Daniel Gottesman on Scott Aaronson's blog HERE.) 

(This post is a sequel to a prior post on this topic that was here. However, this post is self-contained---you don't need to have read the prior post.)  

(Later in the post I point to my open problems column that does what is in this post rigorously. However, that link might be hard to find, so here it is: HERE)



BILL: I have a nice problem to tell you about. First, the setup.

Say you have a finite coloring of \(R^n\).

A mono unit square is a set of four points that are

(a) all the same color, and

(b) form a square of side 1. The square does not need to be parallel to any of the axes.

DARLING: Okay. What is the problem?

BILL:  It is known that for all  2-colorings of \(R^6\) there is a mono unit square.

DARLING: \(R^6\)? Really! That's hilarious! Surely, better is known.

BILL: Yes better is known. And stop calling me Shirley.

DARLING: Okay, so what else is known?

BILL: An observation about the \(R^6\) result gives us the result for \(R^5\). (The \(R^5\) result also follows from a different technique.) Then a much harder proof gives us the result for \(R^4\). It is easy to  construct  a coloring of \(R^2\) without a mono unit square. The problem for \(R^3\) is open.

DARLING: That's too bad. We live in \(R^3\).

DARLING: Someone should write an article about all this including proofs of all the known results, open problems,  and maybe a few new things.

BILL: By someone you mean Auguste Gezalyan (Got his  PhD in CS, topic Comp Geom, at  UMCP), Ryan Parker (ugrad working on Comp Geom at UMCP), and Bill Gasarch (that's me!)  Good idea!

A FEW WEEKS LATER

BILL: Done! See here. And I call the problem about \(R^3\) The Darling Problem.

DARLING: Great! Now that you have an in-depth knowledge of the problem---

BILL: Auguste and Ryan have an in depth knowledge. Frankly I'm out of my depth.

DARLING: Okay, then I'll ask them: What do you think happens in \(R^3\), and when do you think it will be proved?

AUGUSTE: I think there is a 2-coloring of \(R^3\) with no mono unit square.

RYAN: I think that for every 2-coloring of \(R^3\) there is a mono unit square.

BILL: I have no conjecture; however, I think this is the kind of problem that really could be solved. It has not been worked on that much and it might just be one key idea from being solved. It is my hope that this article and blog post inspires someone to work on it and solve it. 

OBLIGATORY AI COMMENT

Auguste asked ChatGPT (or some AI) about the problem. It replied that the problem is open and is known as The Darling's Problem. This is rather surprising---Auguste asked the AI about this before I had submitted the article (it has since appeared) and before this blog post. So how did AI know about it? It was on my website.  I conjecture that Auguste used some of the same language we used in the paper so the AI found our paper. The oddest thing about this is that I don't find this odd anymore. 

 COLOR COMMENTARY  

The article appeared as a SIGACT News Open Problems Column. Are you better off reading it there or on my website, which is pointed to above? The SIGACT News version is (a) behind a paywall, and (b) in black and white. The version on my website is (a) free access, and (b) uses color. You decide.

Tuesday, March 10, 2026

Tony Hoare (1934-2026)

Turing Award winner and former Oxford professor Tony Hoare passed away last Thursday at the age of 92. Hoare is famous for quicksort, ALGOL, Hoare logic and so much more. Jim Miles gives his personal reflections.

Jill Hoare, Tony Hoare, Jim Miles. Cambridge, 7 September 2021

Last Thursday (5th March 2026), Tony Hoare passed away, at the age of 92. He made many important contributions to Computer Science, which go well beyond just the one for which most Maths/CompSci undergraduates might know his name: the quicksort algorithm. His achievements in the field are covered comprehensively across easy-to-find books and articles, and I am sure will be addressed in detail as obituaries are published over the coming weeks. I was invited in this entry to remember the Tony that I knew, so here I will be writing about his personality from the occasions that I met him.

I visited Tony Hoare several times in the past 5 years, as we both live in Cambridge (UK) and it turned out that my family knew his. As a Mathematics graduate, I was very keen to meet and learn about his life from the great man himself. I was further prompted by a post on this blog which mentioned Tony a few times and summarised a relevant portion of his work. I took a print out of that entry the first time I visited him to help break the ice - it is the green sheet of paper in the picture above.

Tony read the entry and smiled, clearly recalling very well the material of his that it referenced, and then elaborating a bit, explaining how vastly programs had scaled up in a rather short space of time and how they typically require different methods than many of those he had been developing in the early days.

I was aware that Tony had studied Classics and Philosophy at university so I was keen to learn how one thing had led to another in the development of his career. He explained that after completing his degree he had been intensively trained in Russian on the Joint Services School for Linguists programme and was also personally very interested in statistics as well as the emerging and exciting world of computers. This meant that after his National Service (which was essentially the JSSL) he took on a job 'demonstrating' a type of early computer, in particular globally, and especially in the Soviet Union. He described the place of these demonstrations as 'fairs' but I suppose we might now call them 'expos'. In a sense, this seemed like a very modest description of his job, when in fact - reading up on Tony's career - he was also involved in the development of code for these devices, but perhaps that's a historical quirk of the period: being a demonstrator of these machines meant really knowing them inside and out to the point of acting on the dev team (AND, one might deduce, being fluent in Russian!).

Tony would tell these stories with a clarity and warmth that made it clear that certainly he was still entirely 'all there' mentally, and that his memory was pinpoint sharp, even if there were some physical health issues, typical for anyone who makes it so far into their 80s (and, as we now know, beyond!).

A story that I was determined to hear from the source was the legendary quicksort 'wager'. The story goes that Tony told his boss at Elliott Brothers Ltd that he knew a faster sorting algorithm than the one that he had just implemented for the company. He was told 'I bet you sixpence you don't!'. Lo and behold, quicksort WAS faster. I asked Tony to tell this story pretty much every time we met, because I enjoyed it so much and it always put a smile on both of our faces. To his credit, Tony never tired of telling me this story 'right from the top'. I had hoped to visit again in the past year and record him telling it so that there was a record, but unfortunately this did not happen. However, I discover that it is indeed recorded elsewhere. One detail I might be able to add is that I asked Tony if indeed the wager was paid out or if it had merely been a figure of speech. He confirmed that indeed he WAS paid the wager (!). A detail of this story that I find particularly reflective of Tony's humble personality is that he went ahead and implemented the slower algorithm he was asked to, while he believed quicksort to be faster, and before chiming in with this belief. It speaks to a professionalism that Tony always carried.

About 50% of our meetings were spent talking about these matters relating to his career, while the rest varied across a vast range of topics. In particular, I wanted to ask him about a story that I had heard from a relative, that Tony - whilst working at Microsoft in Cambridge - would like to slip out some afternoons and watch films at the local Arts Picturehouse. This had come about because on one occasion a current film in question was brought up in conversation and it transpired Tony had seen it, much to the bemusement of some present. The jig was up - Tony admitted that, yes, sometimes he would nip out on an afternoon and visit the cinema. When I met Tony and gently questioned him on this anecdote he confirmed that indeed this was one of his pleasures and his position at Microsoft more than accommodated it.

On the topic of films, I wanted to follow up with Tony a quote that I have seen online attributed to him about Hollywood portrayal of geniuses, often especially in relation to Good Will Hunting. A typical example is: "Hollywood's idea of genius is Good Will Hunting: someone who can solve any problem instantly. In reality, geniuses struggle with a single problem for years". Tony agreed with the idea that cinema often misrepresents how ability in abstract fields such as mathematics is learned over countless hours of thought, rather than - as the movies like to make out - imparted, unexplained, to people of 'genius'. However, he was unsure where exactly he had said this or how/why it had gotten onto the internet, and he agreed that online quotes on the subject, attributed to him, may well be erroneous.

One final note I would like to share from these meetings with Tony is perhaps the most intriguing of what he said, but also the one he delivered with the greatest outright confidence. In a discussion about the developments of computers in the future - whether we are reaching limits of Moore's Law, whether Quantum Computers will be required to reinvigorate progress, and other rather shallow and obvious hardware talking points raised by me in an effort to spark Tony's interest - he said 'Well, of course, nothing we have even comes close to what the government has access to. They will always be years ahead of what you can imagine'. When pressed on this, in particular whether he believed such technology to be on the scale of solving the large prime factorisation that the world's cryptographic protocols are based on, he was cagey and shrugged enigmatically. One wonders what he had seen, or perhaps he was engaging in a bit of knowing trolling; Tony had a fantastic sense of humour and was certainly capable of leading me down the garden path with irony and satire before I realised a joke was being made.

I will greatly miss this humour, patience, and sharpness of mind, as I miss everything else about Tony.

RIP Tony Hoare (11 January 1934 - 5 March 2026)

Sunday, March 08, 2026

How does AI do on Baseball-Brothers-Pitchers

In my graduate Ramsey Theory class I taught Kruskal's tree theorem (KTT) which was proven by Joe Kruskal in his PhD thesis in 1960.  (Should that be in a graduate Ramsey Theory class? There are not enough people teaching such a course to get a debate going.) A simpler proof was discovered (invented?) by Nash-Williams in 1963.

The theorem is that the set of trees under the homeomorphism ordering is a well quasi order.

But this blog post is not about well quasi orderings. It's about baseball brothers and AI.

The Kruskals are one of the best math families of all time. See my post on math families. The Bernoullis are another great math family. What makes both families so great is that they had at least THREE great math people. Most have two.

Having taught the KTT and talked briefly about math families, I was curious how ChatGPT would do on the better-defined question of largest number of wins by a pair of brothers in baseball.  So I asked my students to look that up and include which tools they used,  as a HW problem  (worth 0 points but they had to do it). 

I wrote up 9 pages on what the answer is (there are some issues) and what the students' answers were. See my write up.

In case those 9 pages are tl;dr, here are the main takeaways

1) 7 of the answers given were just WRONG no matter how you look at it.

2) 7 gave the Clarksons, who are problematic either because they played in the 1880's, so they don't count as modern baseball, or because there were three of them, or both. Even so, one could argue these answers are correct.

3) 1 got it right. (That's one got it right, not I, Bill G, got it right. It can be hard to distinguish the numeral one, the capital letter I, and the lowercase letter L. I blogged about L vs I here in the context of Weird AI vs Weird AL.)


4) There were 13 different answers, which I find amazing.

As usual, when I study some odd issue, I learn a few other things of interest, at least to me, which may also have life lessons, though YMMV. 

a) Around 85% of pitchers in the Hall of Fame have won over 200 games. Dizzy Dean (his brother was also a pitcher, which is why I was looking at this) got into the Hall of Fame with only 150 wins. Why? For 6 years he was the most dominant pitcher in the game. In addition, (1) there was some sympathy for him since his career was cut short by an injury he got in an All-Star Game, and (2) he had a colorful personality and people liked him. The four Hall of Fame pitchers with the fewest wins are: Candy Cummings (W-L 145-94) [he is said to have invented the curveball], Dizzy Dean (W-L 150-83), Addie Joss (W-L 160-97) [he only played 9 years, less than the 10 needed for Hall of Fame eligibility, but he died young, so he was given an exception], and Sandy Koufax (W-L 165-87) [he was dominant, like Dean, for a short time]. Sandy is the only one of the four who is still alive. (W-L means Win-Loss: for example, Candy won 145 games and lost 94.)


b) Modern baseball starts in 1900, but this is arbitrary. In 1904 Jack Chesbro won 41 games, which would never happen in today's game, so even early post-1900 baseball differs from baseball now. But you have to draw the line someplace. History is hard because there are fuzzy boundaries.

c) I had not known about the Clarkson brothers. 

d) My interest in this subject goes back to 1973. In that year I heard the following during a baseball game and, ever since I began blogging, I have wondered how I could fit it into a blog post:


An old baseball trick question is now gone!  Just last week [in 1973] if you asked

What pair of brothers in baseball won the most games?

the answer was Christy and Henry Mathewson. Christy played for 16 seasons and had a W-L record of 373-188, while his brother Henry played in 3 games and had a W-L record of 0-1. So their total number of wins was 373, which was the record.

But this last week [in 1973] the Perry brothers, Gaylord and Jim, got 374 wins between them.  Hence the question

What pair of brothers in baseball won the most games?

is no longer a trick question since both Perrys are fine pitchers [Gaylord made it into the Hall of Fame; Jim didn't, but Jim was still quite good.]

As you will see in my writeup, the Perrys' record was broken by the Niekros.

e) Christy and Henry are a trick answer because Henry wasn't much of a player, with his 0-1 W-L record. Are there other pairs like that? Greg (355 wins) and Mike (39 wins) Maddux might seem that way, but they are not. While Mike had only 39 wins, he had a 14-year career as a relief pitcher. Such pitchers can be valuable to a team, but because of their role they do not get many wins.



So, the usual question: Why did AI get this one wrong? Lance will say that the paid version wouldn't have, and he may be right. David in Tokyo might say that ChatGPT is a random word generator. Others tell me that ChatGPT does not understand the questions it is asked (my proofreader thinks this is obvious). I'll add that the pitcher-brother question has more ambiguity than I thought: do you count pitchers before 1900? Do you count a brother who won 0 games? What about 3 brothers (not the restaurant)?
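If you pin those choices down, the question itself becomes mechanical. Here is a minimal sketch in Python, assuming a hypothetical file pitchers.csv with columns name, family, first_year, and wins (the file, the columns, and the defaults are made up for illustration); the ambiguities above become parameters:

```python
# A sketch: given a (hypothetical) table of pitchers with win totals and a
# family identifier, find the brother pair with the most combined wins.
import csv
from collections import defaultdict
from itertools import combinations

def best_brother_pair(filename="pitchers.csv", min_year=1900, min_wins_each=1):
    families = defaultdict(list)
    with open(filename) as f:
        for row in csv.DictReader(f):
            # The ambiguities become filters: era cutoff and minimum wins.
            if int(row["first_year"]) >= min_year and int(row["wins"]) >= min_wins_each:
                families[row["family"]].append((row["name"], int(row["wins"])))

    best = None
    for brothers in families.values():
        # Look only at PAIRS, even in families with three or more pitchers.
        for (n1, w1), (n2, w2) in combinations(brothers, 2):
            if best is None or w1 + w2 > best[0]:
                best = (w1 + w2, n1, n2)
    return best

# Setting min_year=0 or min_wins_each=0 gives the other readings of the question.
```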

Wednesday, March 04, 2026

The Purpose of Proofs

In discussions of AI and mathematics, the conversation often turns to mathematical proofs, such as the First Proof challenge. So let's look at the role of proofs in mathematics.

Without a proof, you don't even know whether a theorem is true or false. It's not even a theorem until you have a proof, just a conjecture or hypothesis. You might have some intuition, but you don't know how hard a proof is until you find one. Even then, that only gives you an upper bound on the hardness, as someone might find a simpler alternative proof in the future.

Proofs give an understanding of why a theorem is true. A proof can look like a piece of art, especially if it works in a new or clever way, maybe even a proof from the book, like Kelly's proof of the Sylvester–Gallai theorem. A mathematician often has their most satisfying experiences finding a proof of a theorem that they or others had struggled to prove earlier.

Conjectures drive the need for proofs, but often it goes in the other direction. A failed proof leads to a new conjecture and maybe a different theorem altogether. My time-space tradeoffs came out of a failed attempt to show NP different from L. Too often I see papers whose theorems are really just whatever the best proof the authors could find happens to achieve.

Proofs are supposedly objective: either it is a valid proof or it isn't, which is the biggest reason AI tackles proofs, since we can grade them as right or wrong. However, most academic papers have high-level proofs, and a referee needs to make a subjective decision about whether a proof has enough detail to count as legitimate. If a proof has minor fixable errors, then strictly speaking it isn't a proof, but we usually count it as one.

Now we have proof systems like Lean where we can make proofs truly objective. There's a large mathlib project to formalize much of mathematics in Lean, and talk of a cslib as well. 
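For a flavor of what that looks like, here are two toy statements I wrote for illustration (they are not taken from mathlib; the second uses the library lemma Nat.add_comm). A Lean proof is just code that the system checks mechanically:

```lean
-- Toy examples of mechanically checked statements.
theorem two_plus_two : 2 + 2 = 4 := rfl

theorem my_add_comm (a b : Nat) : a + b = b + a := Nat.add_comm a b
```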

Lean excites me less. I'm not sure what we gain by formalizing proofs besides certainty. Will Fermat's Last Theorem be any more true once we formalize it? One could argue Lean also helps AI reason about mathematics in a grounded way, though AI could just as easily get lost in the logical weeds.

I worry that the heavy task of converting proofs to Lean, even with AI help, takes away from time spent finding and proving new theorems, and that we lose the beauty of the proofs. If there were a fully Lean-verified proof of P ≠ NP that you couldn't understand, would you find that satisfying?

Sunday, March 01, 2026

Goodhart's law: Ken Jennings and Types of Knowledge

Goodhart's law: When a measure becomes a target, it stops being a measure.

I was watching the show Masterminds where Ken Jennings is one of the Masterminds. Here is what happened: 

Brooke Burns (the host): The only vice president in the 20th century with initials H.H. was Hubert Humphrey. Who was the only vice president in the 19th century with initials H.H.?

Bill Gasarch shouting at the screen: Hannibal Hamlin, Lincoln's VP for his first term! This is really obscure so this may be a rare case where I get a question right that Ken Jennings does not know!

Ken Jennings: Hannibal Hamlin. 

Bill Gasarch: Darn! However, I suspect he studied lists of presidents and vice presidents for the purpose of doing well on Jeopardy and now other shows. My knowledge is more natural in that I read books on presidents and vice presidents (best book, maybe the only book, on vice presidents: Bland Ambition). My knowledge is different from his, and I tend to think mine is more legitimate, though it would be hard to make that statement rigorous. If quiz shows asked follow-up questions, that might help alleviate this problem, if it is indeed a problem.

Misc: 

Ken Jennings is a Mormon, so he does not drink and did not know about drinks. However, before going on Jeopardy he studied drinks for the show.

Ken Jennings does know novelty songs, and that is legit, since novelty songs come up rarely on Jeopardy, so I doubt he studied them in his preparation for going on Jeopardy.

a) He knew that Shel Silverstein wrote A Boy Named Sue, which was sung by Johnny Cash.

b) He knew that Johnny Cash did a cover of Shel Silverstein's I'm being swallowed by a boa constrictor.

c) During his streak there was a category on novelty songs. He got 4 out of the 5 correct, and the one he got wrong I could tell he really knew. I got them all right, and faster, but Ken Jennings gets my respect for his legit knowledge of novelty songs. See here for the questions and answers. 

Goodhart's law (maybe): Jeopardy is supposed to be about what people know. Is it supposed to be about what people study? Does studying for it help you with anything other than quiz shows?

When I watch a quiz show and get a question right there are levels of legitimacy:

1) I know the area naturally. 

2) I saw the question and answer on a different show, and I then looked it up and now know more about it. So this is now legit.

3) I saw the question and answer on a different show and just know it as stimulus-response with very little understanding. Example: Kelly Clarkson was the first winner on American Idol, but I have never seen the show and only vaguely know how it works.

4) I saw the question and answer on the show I am watching, the exact episode, and I am watching a repeat. Sometimes I know stuff I don't normally know and then say OH, I've seen this episode before. 

Back to my point:

Is Ken Jennings memorizing lists a less-legit form of knowledge? If so, how to make that statement rigorous?

Is this what AI does? See also the Chinese Room, since that's also about a device that gets the right answers but perhaps for the "wrong" reason.



Wednesday, February 25, 2026

A Probability Challenge

Last week I had the pleasure of meeting Alex Bellos in Oxford. Among other things, Bellos writes the Guardian Monday puzzle column. He gave me a copy of his latest book, Puzzle Me Twice, a collection of puzzles where the obvious answer is not correct. I got more right than wrong, but I hated being wrong. Here is one of those puzzles, Sistery Mystery (page 28), which is a variation of a puzzle from Rob Eastaway.

Puzzle 1: Suppose the probability of a girl is 51% independently and uniformly over all children. In expectation, who has more sisters, a boy or a girl?

Go ahead and try to solve this before reading further.

In any family with both boys and girls, each boy will have one more sister than each girl. For example, in a family with four girls, each boy has four sisters and each girl only three. Thus boys have more sisters on average.

Wrong; it's exactly the same. To see this, consider Alex, the sixth child of a ten-child family. The number of Alex's sisters is independent of Alex's gender. This is a pretty robust result: it doesn't depend on the probability of a child being a girl, on whether we allow non-binary children, or on whether the distributions are identical (say the probability of a girl is higher for later kids in a family). All you need is independence.

So what's wrong with my initial intuition that in every family boys have more sisters than girls? Eastaway suggests this gets balanced by the families of a single gender, but for large families those are rare. Instead it's a variation of Simpson's paradox: the naive argument doesn't account for the fact that girls are overrepresented in girl-heavy families. Consider a family of two boys and eight girls. Each of the two boys has eight sisters, but four times as many girls have seven sisters each, so the girls contribute more to the expected value than the boys do.
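A quick simulation backs this up (a minimal sketch; the family-size distribution below is arbitrary, since the conclusion doesn't depend on it):

```python
# Puzzle 1 simulation: average number of sisters for a random boy vs. a random
# girl, when each child is a girl with probability 0.51 independently.
import random

random.seed(0)
boy_sisters, girl_sisters = [], []

for _ in range(200_000):
    size = random.randint(1, 8)                       # arbitrary family size
    kids = ['G' if random.random() < 0.51 else 'B' for _ in range(size)]
    girls = kids.count('G')
    for kid in kids:
        sisters = girls - 1 if kid == 'G' else girls  # girls don't count themselves
        (girl_sisters if kid == 'G' else boy_sisters).append(sisters)

print(sum(boy_sisters) / len(boy_sisters))    # the two averages agree
print(sum(girl_sisters) / len(girl_sisters))
```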

If you lose independence the solution may not hold, for example if we have identical twins. 

I'll leave you with one more puzzle.

Puzzle 2: Suppose you are in a country where each family has children until they get their first boy. In this country, do boys or girls have more sisters on average?

Answer below.

In Puzzle 2 we lose independence: if Alex is a girl, she's more likely to be in a family with many girls. Indeed, if boys and girls have equal probability, then when you work out the infinite sums, in expectation a boy will have one sister and a girl will have two.
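Here is one way to set up those sums (assuming, as intended, that every family eventually gets its boy, so each family has exactly one boy and some number \(G\) of girls with \(\Pr[G=k]=2^{-(k+1)}\)):

\[
E[\text{sisters of a random boy}] = \sum_{k \ge 0} k \cdot 2^{-(k+1)} = 1,
\]

while a random girl is more likely to come from a girl-heavy family (a family with \(k\) girls contributes \(k\) girls to the population), so

\[
E[\text{sisters of a random girl}] = \frac{\sum_{k \ge 1} k(k-1)\, 2^{-(k+1)}}{\sum_{k \ge 1} k\, 2^{-(k+1)}}
= \frac{E[G^2]-E[G]}{E[G]} = \frac{3-1}{1} = 2,
\]

using \(E[G]=1\) and \(E[G^2]=3\) for this distribution.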

Sunday, February 22, 2026

ChatGPT gets an easy math problem wrong (I got it right). How is that possible?

A commenter on this post asked for me (or anyone) to solve the problem without AI:

A,B,C,D,E are digits (the poster said A could be 0 but I took A to be nonzero)  such that

ABCDE + BCDE + CDE + DE + E = 20320.

(CLARIFICATION ADDED LATER: We allow two letters to map to the same digit.) 

I solved it completely by hand. You can try it yourself or look at my solution which is here. I found seven solutions. 
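For anyone who wants to double-check the count, a brute force over all 90,000 digit assignments takes a few lines of Python (a minimal sketch; this is not the by-hand argument in my write-up):

```python
# Brute-force check of ABCDE + BCDE + CDE + DE + E = 20320 over all digit
# assignments with A nonzero (repeated digits allowed, per the clarification).
solutions = []
for A in range(1, 10):
    for B in range(10):
        for C in range(10):
            for D in range(10):
                for E in range(10):
                    abcde = 10000*A + 1000*B + 100*C + 10*D + E
                    bcde  =           1000*B + 100*C + 10*D + E
                    cde   =                    100*C + 10*D + E
                    de    =                             10*D + E
                    if abcde + bcde + cde + de + E == 20320:
                        solutions.append((A, B, C, D, E))

print(len(solutions))   # prints 7
print(solutions)
```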

I THEN asked ChatGPT to give me all solutions to see if I missed any. 
 
I had it backwards. ChatGPT missed some solutions.  The entire exchange between chatty and me is here.

I asked it how it could get it wrong and how I can trust it. It responded to that and follow-up questions intelligently. 

Note that the problem is NOT a Putnam problem or anything of the sort. But I've read that AI can solve Putnam problems. So, without an ax to grind, I am curious: how come ChatGPT got the ABCDE problem wrong?

Speculative answers

1) The statement AI has solved IMO problems refers to an AI that was trained for such competition problems, not the free ChatGPT. For more issues with the AI-IMO results, see Terry Tao's comments here.

2) ChatGPT is really good when the answer to the question is on the web someplace or can even be reconstructed from what's on the web. But if a problem, even an easy one, is new to the web, it can hallucinate (it didn't do that on my problem but it did on muffin problems) or miss some cases (it did that on my problem). 

3) It's only human. (It pretty much says this.)

4) The next version or even the paid version is better! Lance ran it on his paid-for ChatGPT, and it wrote a program to brute force the problem (along the lines of the sketch above) and got all 7 solutions.

5) I said ChatGPT got the problem wrong. If a student had submitted the solution it would get lots of partial credit since the solution took the right approach and only missed a few cases. So should I judge ChatGPT more harshly than a student? Yes. 

The question still stands: How come ChatGPT could not do this well-defined, simple math problem?




Wednesday, February 18, 2026

Joe Halpern (1953-2026)

Computer Science Professor Joseph Halpern passed away on Friday after a long battle with cancer. He was a leader in mathematical reasoning about knowledge. His paper with Yoram Moses, Knowledge and Common Knowledge in a Distributed Environment, received both the 1997 Gödel Prize and the 2009 Dijkstra Prize. Halpern also co-authored a comprehensive book on the topic.

Halpern helped create a model of knowledge representation consisting of a set of states of the world, where each person has a partition of the states into cells, and two states lie in the same cell if that person can't distinguish between them. You can use this system to define knowledge and common knowledge, and to model problems like muddy children. It also serves as a great framework for temporal logic.
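A minimal sketch of that partition model (the states, the partition, and the fact below are made up for illustration):

```python
# Partition model of knowledge: an agent knows a fact at a state iff the fact
# holds at every state in the agent's partition cell containing that state.

states = {"s1", "s2", "s3", "s4"}

# Alice cannot distinguish s1 from s2, nor s3 from s4.
alice_partition = [{"s1", "s2"}, {"s3", "s4"}]

def cell(partition, state):
    """Return the cell of the partition containing the given state."""
    return next(block for block in partition if state in block)

def knows(partition, fact, state):
    """The agent knows `fact` at `state` iff `fact` holds at every state the
    agent considers possible, i.e., every state in its cell."""
    return all(fact(s) for s in cell(partition, state))

fact = lambda s: s in {"s1", "s2"}         # the fact "we are in s1 or s2"

print(knows(alice_partition, fact, "s1"))  # True: the fact holds on all of {s1, s2}
print(knows(alice_partition, fact, "s3"))  # False
```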

Halpern led the creation of the Computing Research Repository (CoRR), now the computer science part of arXiv, and would later moderate CS papers for arXiv.

Joe Halpern was the driving force behind the Theoretical Aspects of Rationality and Knowledge (TARK) conference, which attracts philosophers, economists, computer scientists and others to discuss what it means to know stuff. I had two papers in TARK 2009 in Stanford. But my favorite TARK memory came from a debate at the 1998 TARK conference at Northwestern. 

Consider the centipede game, where two players alternate turns: each can either play right (R/r), or defect (D/d) to end the game immediately, with payouts in the diagram below.

The game is solved by backward induction: working backwards from the last move, in each subgame the player whose turn it is does better by defecting.
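Here is a minimal sketch of that backward induction (the payoffs are illustrative; they are not the ones from the diagram):

```python
# Backward induction on a centipede-style game. At node i the mover either
# Defects (the game ends with payoffs_if_defect[i]) or plays Right (play
# passes on); if nobody ever defects, the game ends with payoffs_at_end.
# Payoffs are (player 1, player 2) and are made up for this example.
payoffs_if_defect = [(1, 0), (0, 2), (3, 1), (2, 4)]
payoffs_at_end = (5, 3)

def solve(i):
    """Return the subgame-perfect outcome of the subgame starting at node i."""
    if i == len(payoffs_if_defect):
        return payoffs_at_end
    mover = i % 2                      # player 1 moves at even nodes, player 2 at odd
    continue_payoffs = solve(i + 1)    # outcome if the mover plays Right
    defect_payoffs = payoffs_if_defect[i]
    # The mover compares their own payoff under the two choices.
    return defect_payoffs if defect_payoffs[mover] >= continue_payoffs[mover] else continue_payoffs

print(solve(0))  # with these payoffs, player 1 defects immediately: (1, 0)
```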

The debate asked the following. Player 1 needs to think about the backward induction of the future moves, considering the case where player 2 played right in its first move. But this is an irrational move, so why should you assume player 2 is being rational when playing its second move later on?

Someone said such reasoning is fine, like when we assume that the square root of two is rational in order to get a contradiction. The counterargument: the square root of two does not "choose" to be irrational.

Thank you Joe for helping us think about knowledge and giving us the forums to do so.

Sunday, February 15, 2026

Assigning Open Problems in Class

I sometimes assign open problems as extra credit problems. Some thoughts:

1) Do you tell the students the problems are open?

YES- it would be unfair to have a student unknowingly work on something they almost surely won't solve.

NO- Some open problems are open because people are scared to work on them. Having said that, I think P vs NP is beyond the "one smart person" phase, or even the "if they don't know it's hard maybe they can solve it" phase.

NO- See page 301 of this interview with George Dantzig, where he talks about mistaking an open problem for homework and ... solving it.

CAVEAT---There are OPEN PROBLEMS!!! and there are open problems???  If I make up a problem, think about it for 30 minutes, and can't solve it, it's open but might not be hard. See next point. 

 I tell the students:

This is a problem I made up but could not solve. It may be that I am missing just one idea or combination of ideas so it is quite possible you will solve it even though I could not. Of course, it could be that it really is hard. 

A friend of mine who is not in academia thought that telling the students that I came up with a problem I could not solve, but maybe they can, is a terrible idea. He said that if a student solves it, they will think less of me. I think he's clearly wrong. If I am enthusiastic about their solution and give NO indication that I was close to solving it (even if I was), then there is no way they would think less of me.

Is there any reason why telling the students I could not solve it but they might be able to is a bad idea? 

2) Should extra credit count towards the grade? (We ignore the far more serious problems with grades caused by whatever currently seems to make them obsolete: calculators, Cliff Notes, cheating, encyclopedias, Wikipedia, the Internet, ChatGPT, other AI, your plastic pal who's fun to be with.)

No- if they count towards the grade then they are not extra credit. 

I tell the students they DO NOT count for the grade but they DO count for a letter I may write them.

What do you do? 

Thursday, February 12, 2026

The Future of Mathematics and Mathematicians

A reader worried about the future.

I am writing this email as a young aspiring researcher/scientist. We live in a period of uncertainty and I have a lot of doubts about the decisions I should make. I've always been interested in mathematics and physics and I believe that a career in this area would be a fulfilling one for me. However, with the development of AI I'm starting to have some worries about my future. It is difficult to understand what is really happening. It feels like everyday these models are improving and sooner rather than later they will render us useless.

I know that the most important objective of the study of mathematics/science is understanding and that AI will not take that pleasure away from us. But still I feel that it will be removing something fundamental to the discipline. If we have a machine that is significantly better than us at solving problems doesn't science lose a part of its magic? The struggle that we face when trying to tackle the genuinely difficult questions is perhaps the most fascinating part of doing research.

I would very much like to read about your opinion on this subject. Will mathematicians/scientists/researchers still have a role to play in all of this? Will science still be an interesting subject to pursue?

There are two questions here, the future of mathematics and the future of mathematicians. Let's take them separately.

It looks like the same letter went to Martin Hairer and was highlighted in a recent NYT article about the state of AI doing math and the too-early First Proof project. According to Hairer, "I believe that mathematics is actually quite ‘safe', I haven’t seen any plausible example of an L.L.M. coming up with a genuinely new idea and/or concept."

I don't disagree with Hairer, but the state of the art can change quickly. A few months ago I would have said that AI had yet to prove a new theorem; that's no longer true.

It's too early to tell just how good AI will get at mathematics. Right now it is like an early-career graduate student: good at literature search, formalizing proofs, writing paper drafts, exploring known concepts, and simple mathematical discussions. But no matter how good it gets, mathematics is a field of infinite concepts; there will always be more to explore, as Gödel showed. I do hope AI gets strong at finding new concepts and novel proofs of theorems, and that I get to see, while I'm still here, new math that might not have happened otherwise.

For mathematicians the future looks more complicated. AI may never come up with new ideas and Hairer might be right. Or AI could become so good at theorem proving that if you give a conjecture to AI and it can't resolve it, you might as well not try. The true answer is likely in-between and we'll get there slower rather than faster. 

People go into mathematics for different reasons. Some find joy in seeing new and exciting ideas in math. Some like to create good questions and models to help us make sense of mathematical ideas. Some enjoy the struggle of proving new theorems, working against an unseen force that seems to spoil your proofs until you finally break through, and impressing their peers with their prowess. Some take pleasure in education, exciting others about the importance and excitement of mathematics. These will all evolve with advances in AI, and the mathematician's role will evolve with them.

My advice: Embrace mathematics research but be ready to pivot as AI evolves. We could be at the precipice of an incredible time for mathematical advances. How can you not be there to see it? And if not, then math needs you even more.

Sunday, February 08, 2026

I used to think historians in the future will have too much to work with. I could be wrong

(I thought I had already posted this, but the blogger system we use says I didn't. Apologies if I did. Most likely I posted something similar. When you blog for X years, you forget what you've already blogged about.)

Historians who study ancient Greece often have to work with fragments of text or just a few pottery shards. Nowadays we preserve so much that historians 1000 years from now will have an easier time. Indeed, they may have too much to look at, and will have to sort through news, fake news, opinions, and satire to figure out what was true.

The above is what I used to think. But I could be wrong. 

1) When technology changes stuff is lost. E.g., floppy disks.

2) (This is the inspiration for this blog post) Harry Lewis gave a talk in Zurich on 

The Birth of Binary: Leibniz and the origins of computer arithmetic

On Dec. 8, 2022 at 1:15PM-3:30PM Zurich time. I didn't watch it live (too early in the morning, east coast time) but it was taped and I watched a recording later. Yeah!

His blog about it (see here) had a pointer to the video, and my blog about it (see here) had a pointer to both the video and to his blog.

A while back  I was writing a blog post where I wanted to point to the video. My link didn't work. His link didn't work. I emailed him asking where it was. IT IS LOST FOREVER! Future Historians will not know about Leibniz and binary! Or they might--- he has a book on the topic that I reviewed here. But what if the book goes out of print and the only information on this topic is my review of his book? 

3) Entire journals can vanish. I blogged about that here.

4) I am happy that the link to the Wikipedia entry on Link Rot (see here) has not rotted.

5) I did a post on what tends to NOT be recorded and hence may be lost forever here.

6) (This is a bigger topic than my one point here.) People tend to OWN less than they used to.


DVDs- don't bother buying! Whatever you want is on streaming. (I recently watched, for the first time, Buffy the Vampire Slayer, one episode a day, on the treadmill, and it was great!)

CDs- don't bother buying! Use Spotify. I do that and it's awesome- I have found novelty songs I didn't know about, including a song by The Doubleclicks which I thought was about Buffy: here. I emailed them about it and they responded with: Hello! Buffy, hunger games, divergent, Harry Potter, you name it.

JOURNALS- don't bother buying them, it's all on arXiv. (Very true in TCS, might be less true in other fields.)

CONFERENCES: Not sure. I think very few have paper proceedings. At one time they gave out memory sticks with all the papers on them, so that IS ownership, though it depends on technology that might go away. Not sure what they do now.

This may make it easier to lose things since nobody has a physical copy. 

7) Counterargument: Even given the points above, far more today IS being preserved than used to be. See my blog post on that here. But will that be true in the long run? 

8) I began by saying that I used to think future historians will have too much to look at and will have to sort through lots of stuff (using quicksort?) to figure out what's true. Then I said they may lose a lot. Oddly enough, both might be true: of the stuff they DO have, they will have a hard time figuring out what's true. (E.g., was Pope Leo's ugrad thesis on Rado's Theorem for Non-Linear Equations? No. See my blog about that falsehood getting out to the world here. Spoiler alert: it was my fault.)

QUESTIONS:

1) Am I right--- will the future lose lots of stuff?

2) If so, what can we do about this? It's not clear who the we is in that last sentence.