Tuesday, March 10, 2026

Tony Hoare (1934-2026)

Turing Award winner and former Oxford professor Tony Hoare passed away last Thursday at the age of 92. Hoare is famous for quicksort, ALGOL, Hoare logic and so much more. Jim Miles gives his personal reflections.

Jill Hoare, Tony Hoare, Jim Miles. Cambridge, 7 September 2021

Last Thursday (5th March 2026), Tony Hoare passed away at the age of 92. He made many important contributions to Computer Science, which go well beyond the one for which most Maths/CompSci undergraduates might know his name: the quicksort algorithm. His achievements in the field are covered comprehensively in easy-to-find books and articles, and I am sure they will be addressed in detail as obituaries are published over the coming weeks. I was invited to use this entry to remember the Tony that I knew, so here I will write about his personality as I saw it on the occasions that I met him.

I visited Tony Hoare several times over the past 5 years, as we both lived in Cambridge (UK) and it turned out that my family knew his. As a Mathematics graduate, I was very keen to meet the great man and learn about his life from him directly. I was further prompted by a post on this blog which mentioned Tony a few times and summarised a relevant portion of his work. I took a printout of that entry the first time I visited him to help break the ice - it is the green sheet of paper in the picture above.

Tony read the entry and smiled, clearly recalling the material of his that it referenced, and then elaborated a bit, explaining how vastly programs had scaled up in a rather short space of time and how they typically require different methods from many of those he had been developing in the early days.

I was aware that Tony had studied Classics and Philosophy at university, so I was keen to learn how one thing had led to another in the development of his career. He explained that after completing his degree he had been intensively trained in Russian on the Joint Services School for Linguists programme, and was also personally very interested in statistics as well as the emerging and exciting world of computers. This meant that after his National Service (which was essentially the JSSL) he took a job 'demonstrating' a type of early computer internationally, and especially in the Soviet Union. He described the venues for these demonstrations as 'fairs', though I suppose we might now call them 'expos'. In a sense this seemed like a very modest description of his job, when in fact - reading up on Tony's career - he was also involved in developing code for these devices. But perhaps that's a historical quirk of the period: demonstrating these machines meant really knowing them inside and out, to the point of serving on the dev team (AND, one might deduce, being fluent in Russian!).

Tony would tell these stories with a clarity and warmth that made it clear he was still entirely 'all there' mentally and that his memory was pinpoint sharp, even if there were some physical health issues, typical for anyone who makes it so far into their 80s (and, as we now know, beyond!).

A story that I was determined to hear from the source was the legendary quicksort 'wager'. The story goes that Tony told his boss at Elliott Brothers Ltd that he knew a faster sorting algorithm than the one he had just implemented for the company. He was told 'I bet you sixpence you don't!'. Lo and behold, quicksort WAS faster. I asked Tony to tell this story pretty much every time we met, because I enjoyed it so much and it always put a smile on both of our faces. To his credit, Tony never tired of telling it 'right from the top'. I had hoped to visit again in the past year and record him telling it so that there was a record, but unfortunately this did not happen. However, I have since discovered that it is indeed recorded elsewhere. One detail I might be able to add: I asked Tony whether the wager was actually paid out or whether it had merely been a figure of speech. He confirmed that he WAS paid the wager (!). A detail of this story that I find particularly reflective of Tony's humble personality is that he went ahead and implemented the slower algorithm he had been asked to, even though he believed quicksort to be faster, before chiming in with that belief. It speaks to a professionalism that Tony always carried.
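
Quicksort itself fits in a few lines. Hoare's original version partitioned the array in place; the Python sketch below is my illustration of the idea, not his implementation:

```python
def quicksort(xs):
    # Pick a pivot, partition the remaining elements around it,
    # then recurse on each side. (Not in place, unlike Hoare's original.)
    if len(xs) <= 1:
        return xs
    pivot = xs[0]
    smaller = [x for x in xs[1:] if x < pivot]
    larger = [x for x in xs[1:] if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```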

About 50% of our meetings were spent talking about these matters relating to his career, while the rest ranged across a vast array of topics. In particular, I wanted to ask him about a story I had heard from a relative: that Tony - whilst working at Microsoft in Cambridge - liked to slip out some afternoons and watch films at the local Arts Picturehouse. This had come to light because on one occasion a then-current film was brought up in conversation and it transpired Tony had seen it, much to the bemusement of some present. The jig was up - Tony admitted that, yes, sometimes he would nip out of an afternoon and visit the cinema. When I gently questioned him on this anecdote he confirmed that this was indeed one of his pleasures, and that his position at Microsoft more than accommodated it.

On the topic of films, I wanted to follow up with Tony on a quote I have seen online attributed to him about Hollywood's portrayal of geniuses, often in relation to Good Will Hunting. A typical version is: "Hollywood's idea of genius is Good Will Hunting: someone who can solve any problem instantly. In reality, geniuses struggle with a single problem for years". Tony agreed with the idea that cinema often misrepresents how ability in abstract fields such as mathematics is built over countless hours of thought, rather than - as the movies like to make out - imparted, unexplained, to people of 'genius'. However, he was unsure where exactly he had said this or how/why it had gotten onto the internet, and he agreed that online quotes on the subject attributed to him may well be erroneous.

One final note I would like to share from these meetings with Tony is perhaps the most intriguing of the things he said, but also the one he delivered with the greatest outright confidence. In a discussion about the development of computers in the future - whether we are reaching the limits of Moore's Law, whether Quantum Computers will be required to reinvigorate progress, and other rather shallow and obvious hardware talking points raised by me in an effort to spark Tony's interest - he said 'Well, of course, nothing we have even comes close to what the government has access to. They will always be years ahead of what you can imagine'. When pressed on this, in particular whether he believed such technology to be on the scale of factoring the large numbers that the world's cryptographic protocols rely on, he was cagey and shrugged enigmatically. One wonders what he had seen, or perhaps he was engaging in a bit of knowing trolling; Tony had a fantastic sense of humour and was certainly capable of leading me down the garden path with irony and satire before I realised a joke was being made.

I will greatly miss that humour, patience, and sharpness of mind, as I will miss everything else about Tony.

RIP Tony Hoare (11 January 1934 - 5 March 2026)

Sunday, March 08, 2026

How does AI do on Baseball-Brothers-Pitchers?

In my graduate Ramsey Theory class I taught Kruskal's tree theorem (KTT), which was proven by Joe Kruskal in his PhD thesis in 1960. (Should that be in a graduate Ramsey Theory class? There are not enough people teaching such a course to get a debate going.) A simpler proof was discovered (invented?) by Nash-Williams in 1963.

The theorem is that the set of finite trees under the homeomorphic-embedding ordering is a well quasi order.

But this blog post is not about well quasi orderings. It's about baseball brothers and AI.

The Kruskals are one of the best math families of all time. See my post on math families. The Bernoullis are another great math family. What makes both families so great is that they had at least THREE great math people. Most great math families have only two.

Having taught the KTT and talked briefly about math families, I was curious how ChatGPT would do on the better-defined question of the largest number of wins by a pair of brothers in baseball. So I asked my students to look that up, and to include which tools they used, as a HW problem (worth 0 points, but they had to do it).

I wrote up 9 pages on what the answer is (there are some issues) and what the students' answers were. See my write up.

In case those 9 pages are tl;dr, here are the main takeaways

1) 7 of the answers given were just WRONG no matter how you look at it.

2) 7 gave either the Clarksons, who played in the 1880s and so don't count as modern baseball, or a trio of brothers rather than a pair, or both. Even so, one could argue these answers are correct.

3) 1 got it right. (That's 'one got it right', not 'I, bill g, got it right'. It can be hard to distinguish the numeral one, the capital letter I, and the lowercase letter L. I blogged about L vs I here in the context of Weird AI vs Weird Al.)


4) There were 13 different answers, which I find amazing.

As usual, when I study some odd issue, I learn a few other things of interest, at least to me, which may also have life lessons, though YMMV. 

a) Around 85% of pitchers in the Hall of Fame have won over 200 games. Dizzy Dean (his brother was also a pitcher, which is why I was looking at this) got into the Hall of Fame with only 150 wins. Why? For 6 years he was the most dominant pitcher in the game. In addition, (1) there was some sympathy for him since his career was cut short by an injury he got in an All-Star game, and (2) he had a colorful personality and people liked him. The four HOF pitchers with the fewest wins are: Candy Cummings (W-L 145-94) [he is said to have invented the curveball], Dizzy Dean (W-L 150-83), Addie Joss (W-L 160-97) [he played only 9 years, which is less than the 10 needed for HOF eligibility, but he died young so he was given an exception], and Sandy Koufax (W-L 165-87) [he was dominant, like Dean, for a short time]. Sandy is the only one who is still alive. (W-L means Win-Loss. For example, Candy won 145 games and lost 94.)


b) Modern baseball starts in 1900. But this is arbitrary. In 1904 Jack Chesbro won 40 games, which would never happen today. But you have to draw the line someplace. History is hard because there are fuzzy boundaries.

c) I had not known about the Clarkson brothers. 

d) My interest in this subject goes back to 1973. In that year I heard the following during a baseball game and, ever since I began blogging, I have wondered how I could fit it into a blog post:


An old baseball trick question is now gone!  Just last week [in 1973] if you asked

What pair of brothers in baseball won the most games?

the answer was Christy and Henry Mathewson. Christy played for 16 seasons and had a W-L record of 373-188, while his brother Henry played in 3 games and had a W-L record of 0-1. So their total number of wins is 373 which was the record.

But this last week [in 1973] the Perry brothers, Gaylord and Jim, got 374 wins between them.  Hence the question

What pair of brothers in baseball won the most games?

is no longer a trick question since both Perrys are fine pitchers [Gaylord made it into the Hall of Fame; Jim didn't, but Jim was still quite good.]

As you will see in my writeup, the Perrys' record was broken by the Niekros.

e) Christy and Henry are a trick answer because Henry wasn't much of a player with his 0-1 W-L record. Are there other pairs like that? Greg (355 wins) and Mike (39 wins) Maddux might seem that way but they are not. While Mike only had 39 wins he had a 14-year career as a relief pitcher. Such pitchers can be valuable to the team, but because of their role they do not get many wins.



SO, the usual question: Why did AI get this one wrong? Lance will say that the paid version wouldn't have, and he may be right. David in Tokyo might say that ChatGPT is a random word generator. Others tell me that ChatGPT does not understand the questions it is asked (my proofreader thinks this is obvious). I'll add that the pitcher-brother question has more ambiguity than I thought: do you count pitchers before 1900? Do you count a brother who won 0 games? What about 3 brothers (not the restaurant)?

Wednesday, March 04, 2026

The Purpose of Proofs

In discussions of AI and Mathematics, the conversation often turns to mathematical proofs, such as the First Proof challenge. So let's look at the role of proofs in mathematics.

Without a proof, you don't even know whether a theorem is true or false. It's not even a theorem until you have a proof, just a conjecture or hypothesis. You might have some intuition, but you don't know the hardness of a proof until you find it. Even then, that only gives you an upper bound on hardness, as someone might find a simpler alternative proof in the future.

Proofs give an understanding of why a theorem is true. A proof can look like a piece of art, especially if it works in a new or clever way, maybe even a proof from the book like Kelly's proof of the Sylvester–Gallai theorem. A mathematician often has their most satisfying experiences finding a proof of a theorem that they or others had struggled to prove earlier.

Conjectures drive the need for proofs, but often it goes the other direction: a failed proof leads to a new conjecture and maybe a different theorem altogether. My time-space tradeoffs came out of a failed attempt to show NP different from L. Too often I see papers with theorems that are really just whatever the best proof the authors could find achieves.

Proofs are supposedly objective: either a proof is valid or it isn't, which is the biggest reason AI tackles proofs, as we can grade them as right or wrong. Most academic papers, however, have high-level proofs, and a referee needs to make a subjective decision about whether a proof has enough detail to count as legitimate. If a proof has minor fixable errors, then strictly speaking it isn't a proof, but we usually count it as one.

Now we have proof systems like Lean where we can make proofs truly objective. There's a large mathlib project to formalize much of mathematics in Lean, and talk of a cslib as well. 
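
For a flavor of what formalization looks like, here are two tiny Lean 4 examples (standard-library facts, nothing deep); the kernel accepts them only if the proof terms actually check:

```lean
-- `rfl` works because both sides compute to the same value.
example : 2 + 2 = 4 := rfl

-- A proof term invoking a lemma from the core library.
example (m n : Nat) : m + n = n + m := Nat.add_comm m n
```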

Lean excites me less; I'm not sure what we gain by formalizing proofs besides certainty. Will Fermat's Last Theorem be any more true once we formalize it? One could argue Lean also helps AI reason about mathematics in a grounded way, though could AI just get lost in the logical weeds?

I worry that the heavy task of converting proofs to Lean, even with AI help, takes away from time spent finding and proving new theorems, and that we lose the beauty of the proofs. If there were a fully Lean-verified proof of P ≠ NP that you couldn't understand, would you find it satisfying?

Sunday, March 01, 2026

Goodhart's law: Ken Jennings and Types of Knowledge

Goodhart's law: When a measure becomes a target, it stops being a measure.

I was watching the show Masterminds where Ken Jennings is one of the Masterminds. Here is what happened: 

Brook Burns (the host): The only vice president in the 20th century with initials H.H. was Hubert Humphrey. Who was the only vice president in the 19th century with initials H.H.?

Bill Gasarch shouting at the screen: Hannibal Hamlin, Lincoln's VP for his first term! This is really obscure so this may be a rare case where I get a question right that Ken Jennings does not know!

Ken Jennings: Hannibal Hamlin. 

Bill Gasarch: Darn! However, I suspect he studied lists of presidents and vice presidents for the purpose of doing well on Jeopardy and now other shows. My knowledge is more natural, in that I read books on presidents and vice presidents (best book, maybe the only book, on Vice Presidents: Bland Ambition). My knowledge is different from his. I tend to think my knowledge is more legitimate, though it would be hard to make that statement rigorous. If quiz shows asked follow-up questions, that might help alleviate this problem, if it is indeed a problem.

Misc: 

Ken Jennings is a Mormon so he does not drink or know about drinks. However, before going on Jeopardy he studied drinks for the show.

Ken Jennings does know novelty songs, and that is legit, since novelty songs come up rarely on Jeopardy, so I doubt he studied them in his preparation for going on Jeopardy.

a) He knew that Shel Silverstein wrote A boy named Sue which was sung by Johnny Cash

b) He knew that Johnny Cash did a cover of Shel Silverstein's I'm being swallowed by a boa constrictor.

c) During his streak there was a category on novelty songs. He got 4 out of the 5 correct, and the one he got wrong I could tell he really knew. I got them all right, and faster, but Ken Jennings gets my respect for his legit knowledge of novelty songs. See here for the questions and answers. 

Goodhart's law (maybe): Jeopardy is supposed to be about what people know. Is it supposed to be about what people study? Does studying for it help you with things other than quiz shows?

When I watch a quiz show and get a question right there are levels of legitimacy:

1) I know the area naturally. 

2) I saw the question and answer on a different show, and I then looked it up and know more about it. So this is now legit.

3) I saw the question and answer on a different show and just know it as stimulus-response with very little understanding. Example: Kelly Clarkson was the first winner on American Idol, but I have never seen the show and only vaguely know how it works.

4) I saw the question and answer on the show I am watching, the exact episode, and I am watching a repeat. Sometimes I know stuff I don't normally know and then say OH, I've seen this episode before. 

Back to my point:

Is Ken Jennings memorizing lists a less-legit form of knowledge? If so, how to make that statement rigorous?

Is this what AI does? See also the Chinese Room, since that's also about a device that gets the right answers but perhaps for the 'wrong' reason.



Wednesday, February 25, 2026

A Probability Challenge

Last week I had the pleasure of meeting Alex Bellos in Oxford. Among other things, Bellos writes the Guardian Monday puzzle column. He gave me a copy of his latest book, Puzzle Me Twice, full of puzzles where the obvious answer is not correct. I got more right than wrong, but I hated being wrong. Here is one of those puzzles, Sistery Mystery (page 28), which is a variation of a puzzle from Rob Eastaway.

Puzzle 1: Suppose the probability of a girl is 51%, independently and identically for every child. In expectation, who has more sisters, a boy or a girl?

Go ahead and try to solve this before reading further.

In any family with both boys and girls, each boy will have one more sister than each girl. For example in a family with four girls, each boy has four sisters and each girl only has three. Thus boys have more sisters on average.

Wrong, it's exactly the same. To see this consider Alex, the sixth child of a ten-child family. The number of Alex's sisters is independent of Alex's gender. This is a pretty robust result, it doesn't depend on the probability of a child being a girl, or if we allow non-binary children, or if the distributions aren't identical, say the probability of a girl is higher for later kids in a family. All you need is independence.

So what's wrong with my initial intuition that in every family boys have more sisters than girls? Eastaway suggests this gets balanced by the families of a single gender, but that happens rarely for large families. Instead it's a variation of Simpson's paradox: the naive argument doesn't account for the fact that girls are overrepresented in girl-heavy families. Consider a family of two boys and eight girls. Each of the two boys has eight sisters, but there are four times as many girls, each with seven sisters, so the girls contribute more to the expected value than the boys do.
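
A quick Monte Carlo check of the 'exactly the same' claim. The fixed family size of four is my assumption, purely for the sketch; the puzzle doesn't specify one:

```python
import random

def avg_sisters(num_families=200_000, kids=4, p_girl=0.51, seed=0):
    # Simulate fixed-size families with independent genders and tally
    # the number of sisters separately for boys and for girls.
    rng = random.Random(seed)
    boy_total = boy_count = girl_total = girl_count = 0
    for _ in range(num_families):
        genders = ['G' if rng.random() < p_girl else 'B' for _ in range(kids)]
        g = genders.count('G')
        for child in genders:
            if child == 'B':
                boy_count += 1
                boy_total += g          # a boy's sisters: all the girls
            else:
                girl_count += 1
                girl_total += g - 1     # a girl's sisters: the other girls
    return boy_total / boy_count, girl_total / girl_count

boys, girls = avg_sisters()
print(boys, girls)  # both close to 3 * 0.51 = 1.53
```

Both averages come out near 3 x 0.51 = 1.53, as the independence argument predicts.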

If you lose independence the solution may not hold, for example if we have identical twins. 

I'll leave you with one more puzzle.

Puzzle 2: Suppose you are in a country where each family has children until they get their first boy. In this country, do boys or girls have more sisters on average?

Answer below.

In Puzzle 2 we lose independence: if Alex is a girl, she's more likely to be in a family with many girls. Indeed, if boys and girls have equal probability, when you work out the infinite sums, in expectation a boy will have one sister and a girl will have two.
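
A Monte Carlo check of Puzzle 2, under the equal-probability assumption:

```python
import random

def avg_sisters_stopping(num_families=300_000, seed=0):
    # Each family has children until its first boy (p = 1/2 per child),
    # so every family is g girls followed by exactly one boy.
    rng = random.Random(seed)
    boy_total = boy_count = girl_total = girl_count = 0
    for _ in range(num_families):
        g = 0
        while rng.random() < 0.5:   # the child is a girl; keep going
            g += 1
        boy_count += 1
        boy_total += g              # the one boy has g sisters
        girl_count += g
        girl_total += g * (g - 1)   # each of the g girls has g - 1 sisters
    return boy_total / boy_count, girl_total / girl_count

boys, girls = avg_sisters_stopping()
print(boys, girls)  # close to 1 and 2 respectively
```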

Sunday, February 22, 2026

ChatGPT gets an easy math problem wrong (I got it right). How is that possible?

A commenter on this post asked for me (or anyone) to solve the problem without AI:

A,B,C,D,E are digits (the poster said A could be 0 but I took A to be nonzero)  such that

ABCDE + BCDE + CDE + DE + E = 20320.

(CLARIFICATION ADDED LATER: We allow two letters to map to the same digit.) 

I solved it completely by hand. You can try it yourself or look at my solution which is here. I found seven solutions. 
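
For those who want to check, note that the sum collapses to 10000A + 2000B + 300C + 40D + 5E, so a few lines of brute force (my own script, not ChatGPT's program) confirm the count:

```python
def digit_solutions(target=20320):
    # ABCDE + BCDE + CDE + DE + E collapses to
    # 10000A + 2000B + 300C + 40D + 5E; A is nonzero, repeats allowed.
    return [(a, b, c, d, e)
            for a in range(1, 10)
            for b in range(10)
            for c in range(10)
            for d in range(10)
            for e in range(10)
            if 10000*a + 2000*b + 300*c + 40*d + 5*e == target]

print(len(digit_solutions()))  # 7
```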

I THEN asked ChatGPT to give me all solutions to see if I missed any. 
 
I had it backwards. ChatGPT missed some solutions.  The entire exchange between chatty and me is here.

I asked it how it could get it wrong and how I can trust it. It responded to that and follow-up questions intelligently. 

Note that the problem is NOT a Putnam problem or anything of the sort. But I've read that AI can solve Putnam problems. SO, without an ax to grind, I am curious: how come ChatGPT got the ABCDE problem wrong?

Speculative answers

1) The statement that AI has solved IMO problems refers to an AI that was trained for such competitions, not the free ChatGPT. For more issues with the AI-IMO results see Terry Tao's comments here.

2) ChatGPT is really good when the answer to the question is on the web someplace or can even be reconstructed from what's on the web. But if a problem, even an easy one, is new to the web, it can hallucinate (it didn't do that on my problem but it did on muffin problems) or miss some cases (it did that on my problem). 

3) It's only human. (It pretty much says this.)

4) The next version, or even the paid version, is better! Lance ran it on his paid-for ChatGPT and it wrote a program to brute-force the problem and got all 7 solutions.

5) I said ChatGPT got the problem wrong. If a student had submitted that solution, it would get lots of partial credit, since it took the right approach and only missed a few cases. So should I judge ChatGPT more harshly than a student? Yes.

The question still stands: How come ChatGPT could not do this well-defined, simple math problem?




Wednesday, February 18, 2026

Joe Halpern (1953-2026)

Computer Science Professor Joseph Halpern passed away on Friday after a long battle with cancer. He was a leader in mathematical reasoning about knowledge. His paper with Yoram Moses, Knowledge and Common Knowledge in a Distributed Environment, received both the 1997 Gödel Prize and the 2009 Dijkstra Prize. Halpern also co-authored a comprehensive book on the topic.

Halpern helped create a model of knowledge representation in which the world is a set of states, and each person has a partition of those states into cells, where two states lie in the same cell if that person can't distinguish between them. You can use this system to define knowledge and common knowledge, and model problems like the muddy children puzzle. It also serves as a great framework for temporal logic.
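
The partition model is small enough to sketch in code; the states and partitions below are made up purely for illustration:

```python
# Toy version of the partition model of knowledge.
# Three states of the world; Alice can't tell 1 from 2, Bob can't tell 2 from 3.
partitions = {
    'alice': [{1, 2}, {3}],
    'bob':   [{1}, {2, 3}],
}

def cell(agent, state):
    # The set of states the agent considers possible when the true state is `state`.
    return next(c for c in partitions[agent] if state in c)

def knows(agent, event, state):
    # An agent knows an event iff it holds in every state they consider possible.
    return cell(agent, state) <= event

event = {1, 2}                       # "the true state is 1 or 2"
print(knows('alice', event, 1))      # True: Alice's cell {1, 2} lies inside the event
print(knows('bob', event, 2))        # False: Bob still considers state 3 possible
```

Iterating `knows` across all agents (everyone knows, everyone knows that everyone knows, ...) is how common knowledge is defined in this framework.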

Halpern led the creation of the Computing Research Repository (CoRR), a forerunner of arXiv, and would later moderate CS papers for arXiv. 

Joe Halpern was the driving force behind the Theoretical Aspects of Rationality and Knowledge (TARK) conference, which attracts philosophers, economists, computer scientists and others to discuss what it means to know stuff. I had two papers in TARK 2009 in Stanford. But my favorite TARK memory came from a debate at the 1998 TARK conference at Northwestern. 

Consider the centipede game, where two players alternate turns where each can either play to the right (R/r), or defect (D/d) to end the game immediately, with payouts in the diagram below.

The game is solved by backward induction, working out that in each subgame the player does better defecting.
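
Backward induction is easy to mechanize. The payoffs below are hypothetical, since the post's diagram isn't reproduced here; the recursion is what matters:

```python
# Backward induction on a small centipede game.
# payoffs[i] = (player 1's payoff, player 2's payoff) if the mover defects
# at node i; end_payoff is the result if everyone plays right to the end.
# These numbers are made up for illustration but have the centipede shape.
payoffs = [(1, 0), (0, 2), (3, 1), (2, 4)]
end_payoff = (5, 3)

def backward_induction():
    # Work backwards: at each node the mover compares defecting now
    # with the outcome of continued (rational) play.
    outcome = end_payoff
    plan = []
    for i in reversed(range(len(payoffs))):
        mover = i % 2          # player 1 moves at even nodes, player 2 at odd
        if payoffs[i][mover] >= outcome[mover]:
            outcome = payoffs[i]
            plan.append((i, 'defect'))
        else:
            plan.append((i, 'right'))
    return outcome, list(reversed(plan))

outcome, plan = backward_induction()
print(outcome, plan)  # defect at every node; the game ends immediately at (1, 0)
```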

The debate asked the following. For backward induction, Player 1 needs to reason about future moves, including the case where Player 2 plays right on its first move. But that is an irrational move, so why should you assume Player 2 is being rational when it plays its second move later on?

Someone said such reasoning is fine, like when we assume that the square root of two is rational in order to get a contradiction. The counterargument: the square root of two does not "choose" to be irrational.

Thank you Joe for helping us think about knowledge and giving us the forums to do so.

Sunday, February 15, 2026

Assigning Open Problems in Class

I sometimes assign open problems as extra credit problems. Some thoughts:

1) Do you tell the students the problems are open?

YES- it would be unfair for a student to work on something they almost surely won't get.

NO- Some Open Problems are open because people are scared to work on them. Having said that, I think P vs NP is beyond the 'one smart person' phase, or even the 'if they don't know it's hard, maybe they can solve it' phase.

NO- See Page 301 of this interview with George Dantzig where he talks about his mistaking an open problem for a homework and ... solving it.

CAVEAT---There are OPEN PROBLEMS!!! and there are open problems???  If I make up a problem, think about it for 30 minutes, and can't solve it, it's open but might not be hard. See next point. 

 I tell the students:

This is a problem I made up but could not solve. It may be that I am missing just one idea or combination of ideas so it is quite possible you will solve it even though I could not. Of course, it could be that it really is hard. 

A friend of mine who is not in academia thought that telling the students I came up with a problem I could not solve, but maybe they can, is a terrible idea. He said that if a student solves it, they will think worse of me. I think he's clearly wrong. If I am enthused about their solution and give NO indication that I was close to solving it (even if I was), then there is no way they would think less of me.

Is there any reason why telling the students I could not solve it but they might be able to is a bad idea? 

2) Should extra credit count towards the grade? (We ignore that there are far more serious problems with grades, given whatever seems to make them obsolete: calculators, Cliff Notes, cheating, encyclopedias, Wikipedia, the Internet, ChatGPT, other AI, your plastic pal who's fun to be with.)

No- if they count towards the grade then they are not extra credit. 

I tell the students they DO NOT count for the grade but they DO count for a letter I may write them.

What do you do? 

Thursday, February 12, 2026

The Future of Mathematics and Mathematicians

A reader worried about the future.

I am writing this email as a young aspiring researcher/scientist. We live in a period of uncertainty and I have a lot of doubts about the decisions I should make. I've always been interested in mathematics and physics and I believe that a career in this area would be a fulfilling one for me. However, with the development of AI I'm starting to have some worries about my future. It is difficult to understand what is really happening. It feels like every day these models are improving and sooner rather than later they will render us useless.

I know that the most important objective of the study of mathematics/science is understanding and that AI will not take that pleasure away from us. But still I feel that it will be removing something fundamental to the discipline. If we have a machine that is significantly better than us at solving problems doesn't science lose a part of its magic? The struggle that we face when trying to tackle the genuinely difficult questions is perhaps the most fascinating part of doing research.

I would very much like to read about your opinion on this subject. Will mathematicians/scientists/researchers still have a role to play in all of this? Will science still be an interesting subject to pursue?

There are two questions here, the future of mathematics and the future of mathematicians. Let's take them separately.

It looks like the same letter went to Martin Hairer and was highlighted in a recent NYT article about the state of AI doing math and the too-early First Proof project. According to Hairer, "I believe that mathematics is actually quite 'safe'; I haven't seen any plausible example of an L.L.M. coming up with a genuinely new idea and/or concept."

I don't disagree with Hairer, but the state of the art can change quickly. A few months ago I would have said that AI had yet to prove a new theorem; that is no longer true.

It's too early to tell just how good AI will get at mathematics. Right now it is like an early-career graduate student: good at literature search, formalizing proofs, writing paper drafts, exploring known concepts, and simple mathematical discussions. But no matter how good it gets, mathematics is a field of infinite concepts; there will always be more to explore, as Gödel showed. I do hope AI gets strong at finding new concepts and novel proofs of theorems, and that I get to see new math, which might otherwise not have happened, while I'm still here to see it.

For mathematicians the future looks more complicated. AI may never come up with new ideas, and Hairer might be right. Or AI could become so good at theorem proving that if you give a conjecture to AI and it can't resolve it, you might as well not try. The true answer is likely in between, and we'll get there slower rather than faster.

People go into mathematics for different reasons. Some find joy in seeing new and exciting ideas in math. Some like to create good questions and models to help us make sense of mathematical ideas. Some enjoy the struggle of proving new theorems, working against an unseen force that seems to spoil your proofs until you finally break through, and impressing their peers with their prowess. Some take pleasure in education, exciting others about the importance and excitement of mathematics. These will all evolve with advances in AI, and the mathematician's role will evolve with them.

My advice: Embrace mathematics research but be ready to pivot as AI evolves. We could be at the precipice of an incredible time for mathematical advances. How can you not be there to see it? And if not, then math needs you even more.

Sunday, February 08, 2026

I used to think historians in the future will have too much to work with. I could be wrong

(I thought I had already posted this, but the blogger system we use says I didn't. Apologies if I did. Most likely I posted something similar. When you blog for X years you forget what you've already blogged on.)

Historians who study ancient Greece often have to work with fragments of text or just a few pottery shards. Nowadays we preserve so much that historians 1000 years from now will have an easier time. Indeed, they may have too much to look at, and will have to sort through news, fake news, opinions, and satires to figure out what was true.

The above is what I used to think. But I could be wrong. 

1) When technology changes stuff is lost. E.g., floppy disks.

2) (This is the inspiration for this blog post) Harry Lewis gave a talk in Zurich on 

The Birth of Binary: Leibniz and the origins of computer arithmetic

On Dec. 8, 2022 at 1:15PM-3:30PM Zurich time. I didn't watch it live (too early in the morning, east coast time) but it was taped and I watched a recording later. Yeah!

His blog about it (see here) had a pointer to the video, and my blog about it (see here) had a pointer to both the video and to his blog.

A while back I was writing a blog post where I wanted to point to the video. My link didn't work. His link didn't work. I emailed him asking where it was. IT IS LOST FOREVER! Future historians will not know about Leibniz and binary! Or they might--- he has a book on the topic, which I reviewed here. But what if the book goes out of print and the only information on this topic is my review of his book?

3) Entire journals can vanish. I blogged about that here.

4) I am happy that the link to the Wikipedia entry on Link Rot (see here) has not rotted.

5) I did a post on what tends to NOT be recorded and hence may be lost forever here.

6) (This is a bigger topic than my one point here.) People tend to OWN less than they used to. 


DVDs: don't bother buying! Whatever you want is on streaming. (I recently watched, for the first time, Buffy the Vampire Slayer, one episode a day, on the treadmill, and it was great!)

CDs: don't bother buying! Use Spotify. I do that and it's awesome: I have found novelty songs I didn't know about! Including a song by The Doubleclicks which I thought was about Buffy: here. I emailed them about that and they responded with: Hello! Buffy, hunger games, divergent, Harry Potter, you name it.

JOURNALS: don't bother buying them, it's all on arXiv (very true in TCS, might be less true in other fields). 

CONFERENCES: Not sure. I think very few have paper proceedings. At one time they gave out memory sticks with all the papers on them, so that IS ownership, though it depends on technology that might go away. Not sure what they do now. 

This may make it easier to lose things since nobody has a physical copy. 

7) Counterargument: Even given the points above, far more today IS being preserved than used to be. See my blog post on that here. But will that be true in the long run? 

8) I began by saying that I used to think future historians will have too much to look at and will have to sort through lots of stuff (using quicksort?) to figure out what's true. Then I said they may lose a lot. Oddly enough, both might be true: of the stuff they DO have, they will have a hard time figuring out what's true. (E.g., was Pope Leo's ugrad thesis on Rado's Theorem for Non-Linear Equations? No. See my blog about that falsehood getting out to the world here. Spoiler alert: it was my fault.)

QUESTIONS:

1) Am I right--- will the future lose lots of stuff?

2) If so, what can we do about this? Not clear who ``we'' is in that last sentence. 



Wednesday, February 04, 2026

Sampling the Oxford CS Library

Wandering around the maze known as the Computer Science building at Oxford, I found the computer science library. Rarely these days do you see a library (and a librarian) devoted to computer science. The librarian found their copy of The Golden Ticket and asked me to inscribe and sign it, just like at Dagstuhl, perhaps the only other active CS library I know of.

It brought back memories of the early 90s when I would often head to the Math/CS library at the University of Chicago to track down some conference or journal paper. Now we just click and download, but you miss finding something else interesting in the proceedings or the stacks in general.

I had time to kill so I wandered around the library finding memories in the stacks including the 1987 STOC Proceedings, home to my first conference paper, The complexity of perfect zero-knowledge. The paper might be best known for my upper bound protocol which is republished here in its entirety. 

That's how I wrote it nearly four decades ago: without proof, just an intuition for why it works. Those were the days. I did work out the full covariance argument in the journal version, though I missed other bugs in the proof.

The upper bound requires the verifier to have a random sample of the distribution unknown to the prover. Rahul Santhanam, who is hosting my visit to Oxford, asked if the converse was known. Goldreich, Vadhan and Wigderson, in the appendix of their Laconic Prover paper, show a sampling protocol based on the upper bound on the size of a set, though the sample is not completely unknown to the prover. Neat to revisit questions from my first conference paper. 

Oxford CS Librarian Aza Ballard-Whyte

Sunday, February 01, 2026

Before the ChatGPT-HW debate there were other ``If students use X to do their HW'' debates

Lance and I had a blog-debate about What to do about students using ChatGPT to do their Homework.

Some commenters pointed out that we've been here before. I will now list past technologies that looked like they were a problem for student assignments and ponder what happened. 

If students can consult diagrams in their text then they will lose the ability to I DON'T KNOW AND I DON'T CARE. I did a post about this titled In the 1960's students protested the Vietnam war!/In 1830 students protested...Math? I suspect that students eventually got to consult their texts. Actually, the entire topic of the geometry of conic sections seems to have gone away.

If students learn how to read then they will lose the ability to listen to their elders tell stories and also lose the ability to memorize. I've heard this was a concern, though I don't really know if it was. In any case people are probably worse at memorizing than they used to be, but the plus of having books and reading far outweighs the negative of worse memories. 

If students use spellcheck they will forget how to spell. I think people are sloppier with first drafts than they used to be since they know that spellcheck will catch their spelling errors. And even before ChatGPT there were programs to check grammar as well. My spellcheck wants me to replace ChatGPT with catgut. This points to the need to use spellcheck carefully, which foreshadows having to use ChatGPT carefully. My spellcheck does think that spellcheck is spelled correctly.

If students have calculators they will forget that 7*8 equals... hmm, I forgot. I think we ask far fewer questions that depend on calculation than we used to. Do kids in grade school still memorize times-tables? If so, then up to what number? In my blog post on Numbers That Look Prime But Aren't, I casually mentioned that students learn up to 12x12, but I do not know if that's true. 

SO, for those of you who have kids in grade school, leave a comment on whether they 

a) Memorize Times Tables.

b) Learn an algorithm for multiplication (O(n^2) or O(n^{1.58}) or O(n log n)). I used Wikipedia for the pointer to the O(n^{1.58}) algorithm; the entry describes the algorithm very well. I also used Wikipedia for the O(n log n) algorithm; that entry just says that there is a galactic algorithm (one that needs very large n to be worth using) and gives neither the algorithm nor a pointer to a paper that has it.

c) Are allowed calculators on exams.

d) Some combination of the above or something else. 
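For item (b): the O(n^{1.58}) algorithm is Karatsuba's divide-and-conquer multiplication (the exponent is log_2 3, about 1.585). Here is a minimal sketch in Python, included only to show where the savings come from (three recursive multiplications instead of four), not as a claim about what schools teach:

```python
def karatsuba(x, y):
    """Multiply nonnegative integers x and y using Karatsuba's method,
    which does 3 recursive multiplications instead of the schoolbook 4,
    giving O(n^{log2 3}) ~ O(n^{1.58}) digit operations."""
    if x < 10 or y < 10:                       # base case: a small factor
        return x * y
    n = max(x.bit_length(), y.bit_length()) // 2
    hi_x, lo_x = x >> n, x & ((1 << n) - 1)    # x = hi_x * 2^n + lo_x
    hi_y, lo_y = y >> n, y & ((1 << n) - 1)    # y = hi_y * 2^n + lo_y
    a = karatsuba(hi_x, hi_y)
    b = karatsuba(lo_x, lo_y)
    # The trick: the middle term comes from one multiplication, not two.
    c = karatsuba(hi_x + lo_x, hi_y + lo_y) - a - b
    return (a << (2 * n)) + (c << n) + b

print(karatsuba(123456, 789012) == 123456 * 789012)  # True
```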


If students use Encyclopedias they will not be using primary sources. Over time encyclopedias became seen as primary sources. My proofreader relayed the following to me: When I was in fourth grade we weren't supposed to use encyclopedias for our reports if the library had suitable books. So the trick was to find a topic that the library did not have suitable books on. My proofreader is only a few years older than me, and lived in the same state, but I was allowed to use encyclopedias. 


If students use Wikipedia they will not be using primary sources. I don't hear this debated anymore but I am not sure how the issue was resolved, or if it was resolved. If someone who has kids in school knows, please leave a comment. 

Annie and Lance Fortnow had a blog entry about the Wikipedia issue. 

I reviewed a book titled Should you Believe Wikipedia? by Amy Bruckman. The review is here. Spoiler alert: She thinks yes but I am more skeptical.

I once needed a list of ER-complete problems and asked an expert if there was a survey. He said that the best source was the Wikipedia page. For other examples of Wikipedia being the only source  see this blog post.

A similar issue is referring to papers on arXiv that have not been refereed. That might be the topic for a future blog post. 

If the first programming language is a high-level language then students will not learn assembly code and stuff about how computers really work. This has happened. I think students do not know as much about low-level code as they used to. Is that a problem? This type of concern arises whenever a higher-level language becomes available. Students using ChatGPT to write code for them is another example of this issue, though it also has other problems. 

If students learn typing too early they will not learn cursive. I am an early example of this---my handwriting was bad (still is) so I eagerly took a typing class in my school in 5th grade (the class was 13 girls whose parents wanted them to be secretaries, and me) and worked really hard at it and began handing in typed book reports.  

The only letters I know how to do in cursive are

 W I L L I A M   G A S A R C H  

and only  in that order.  

ANYWAY, people have lost the ability to write in cursive, or even write in print neatly.  Drew Faust, a former history professor at Harvard (she retired in 2018) has pointed out that students have a hard time reading cursive in her article Gen Z Never Learned to Read Cursive.

I ask non-rhetorically: is losing the ability to read or write cursive a problem? 

Takeaways: 

1) (From the prior blog on ChatGPT) Grading has been broken for a long time. ChatGPT just makes that more obvious.

2) When a new technology comes along we may have to rethink education. 








Wednesday, January 28, 2026

The Fighting Temeraire (Re)visited

The Fighting Temeraire by JWM Turner

A year ago I wrote about an experiment I ran to learn about the modern period of art from ChatGPT. Chatty picked four paintings to discuss and I wrote about Joseph Mallord William Turner's The Fighting Temeraire. To review, the Temeraire fought in the Battle of Trafalgar, but in this painting it's being towed by a steamboat up the Thames to be broken down for parts. I liked the painting because it captured the change in technology from the great sailing ships to boats moving without sails, and how technology can move us from the beautiful to the practical, with parallels to what we see today.

I wrote the post based on high-resolution images, but there is nothing like seeing a painting in person. So during a trip into London, we made a pilgrimage to Room 40 of the National Gallery to see the Temeraire up close. The National Gallery is across the street from Trafalgar Square and a short walk from the Thames, up which the Temeraire traveled on its last trip.



I had a new experience with the painting. I could see the brush strokes, and details I missed before, like the people on the steamboat and how its wheels pushed it along the water. More than that, I felt the emotion of seeing this last trip of a great ship. A reminder that no matter how digital our world gets, seeing art in its original form brings the artist's true intentions to mind.

In the same room hangs another JMW Turner masterpiece, Rain, Steam, and Speed - The Great Western Railway.


Rain, Steam, and Speed - The Great Western Railway by Turner

Turner painted The Great Western Railway in 1844, less than a decade after the train line started running. Like the Temeraire, it captures a change in technology: a big fast train moving quickly toward the viewer through a quiet countryside. On the right side of the painting, a bit hard to see even in person, is a man with a horse-drawn plough, and on the left a small boat on the river.

Coincidentally I took the Great Western Railway into London that day, but it's not the same railroad company.

Turner captured a new time in history, where man could travel faster than by horse on land and by wind on the sea, in the early days of the industrial revolution, a reminder of the technological changes we see today. But also the importance of the museum, of seeing something in person that no digital experience can replicate and a location where you can focus on art, undistracted by anything else, except other art.

Two hundred years from now will someone go to a museum to see art that captures the early days of the AI revolution? And will it be generated by a human?

Saturday, January 24, 2026

Online Talks on Accessible Theorems!

Bogdan Grechuk has written a book, Landscapes of 21st Century Mathematics, that came out in 2021. A revised version is coming out soon. The theme is that he takes theorems whose statements can be understood and describes them in 5–10 pages. No proofs, but lots of context, and, most importantly, if you read all 800 pages you will know about many areas of mathematics and where to look things up. He is organizing a series of online seminars with accessible talks about great recent theorems, featuring world-renowned mathematicians:
  • January 28, 2026. Prof. Boaz Klartag. “Sphere packing.” join the talk here.
  • February 11, 2026. Prof. Avi Wigderson. “Expander graphs.” join the talk here.
  • March 4, 2026. Prof. Assaf Naor. “Sparsest cut.”
  • April 2026. Prof. Harald Helfgott. Date and topic to be decided.
  • May 20, 2026. Prof. Ben Green. “The polynomial Freiman–Ruzsa conjecture.”
More details can be found on the seminar webpage and the blog devoted to great accessible theorems across mathematics.

Thursday, January 22, 2026

Community

I once had a provost who felt that academic departments hindered the university as they tended to silo the faculty. He would argue we should eliminate departments and that would increase cross-disciplinary work. That went nowhere of course.

He missed a critical point. While departments play an official role in hosting academic programs, they importantly serve as the main community for the faculty within the department. You see your colleagues in the department in meetings, at seminars and just walking by them in the hallways. They are your peers and the ones who hired you and judge your promotion and tenure cases. You will argue with them but you all share a common mission to make your department as strong as possible, so you can attract even better colleagues.

At Oxford, most faculty have a position at one of the 36 colleges, and faculty community comes from the college instead of the department. Many faculty walk from their department offices to have lunch in their college. The colleges are interdisciplinary; there are at most a couple of computer scientists in each college. My visiting fellowship is with Magdalen College, not the CS department, though I plan to spend some time there.

Does the Oxford system lead to more interdisciplinary research, or less research in the field? Not sure, and it's hard to run a controlled experiment.

We all have two communities, one where we work, but also our academic community, the people who do research in the same areas, the people in our field that we collaborate with and see at conferences and workshops. The ones we try to impress with our own research so they may accept our papers, give us grants, nominate us for awards and write good letters for us. That's the real challenge of interdisciplinary research--you hope to impress researchers in both fields, but really you just impress those in the intersection.

In computer science our academic communities are pretty small, usually the conferences where we publish regularly. I was having a conversation with another computer scientist at Oxford. I asked him which conference and he said ICALP B. I said I've published in ICALP A, completely different!

Monday, January 19, 2026

What to do about students using ChatGPT to do their homework?

Students are using ChatGPT to do their HW. Here are things I've heard and some of my thoughts on the issue (Lance also added some comments). I have no strong opinions on the issue. Some of what I say here applies to any AI or, for that matter, old-fashioned cheating by having your friend do the homework for you or by going to the web for the answer (Is ChatGPT  going to the web for the answer but with a much better search tool?)

1) Ban the use of ChatGPT.  That might be impossible. 

2) Allow them to use ChatGPT but they must take what it outputs and put it in their own words. This should work in an advanced course where the students want to be there and are mature, but might be problematic in, say, Freshman calculus. This is also a problem if you want to see if they can come up with the answer on their own. Lance: How do you know if the words come from AI or the student?

3) Assign problems that ChatGPT does not do well on. I can do this in Ramsey Theory and (to some extent) in Honors Discrete Math, but (a) over time it will get harder, and (b) this may be hard to do in a standard course like calculus. Lance: Pretty much impossible for most undergrad courses. Bill: Slight disagreement- computer science is a fast-moving field where recent papers can be used to create problem sets before ChatGPT has digested the paper. Sometimes. 

4) For some assignments, have them use ChatGPT and hand in both what ChatGPT output and the student's rewriting of it. If ChatGPT made a mistake they should also report that. Lance: Many would just have AI rewrite the AI. Bill: Alas, true.

5) Revise the curriculum so that the course is about HOW to use ChatGPT intelligently including cases where it is wrong and types of problems where it is wrong (do we know that?). Lance: And that changes as models change. Bill: True but not necessarily  bad. 

6) Make sure that a large percent of the grade is based on in-class exams. I don't like this but it may become needed. Lance: I just grade homework for completion instead of correctness so that students don't feel they need to use AI to keep up with their classmates. Bill: I like the idea, but they may still use AI just because they are lazy. 

7) The in-class exams will have variants of the homework so if a student used ChatGPT this will test if they actually learned the material (I do this anyway and warn the students on day one that I will do this to discourage cheating of any kind.) Lance: This works until the students wear AI glasses. Bill: I can't tell if you are kidding. More seriously, how far off is that? Lance: Not far.

8) Abolish Grades. Lance thinks yes and blogged about it here. Abstractly I wish students were mature enough to not need grades as motivation. But even putting that aside, we do need some way to tell if students are good enough for a job or for grad school. Lance: We do need that, but grades aren't playing that role well anymore. Bill: What to do about this is a topic for another blog post. 

9) In a small class have the students do the homework by meeting with them and having them tell you the answer, and you can ask follow up questions. Perhaps have them do this in groups. Lance: Panos Ipeirotis used AI to automate oral questioning. Bill: Looks great, though I'll wait till they work out the bugs.

So---readers- what have been your experiences? 

Wednesday, January 14, 2026

Rational Functions Solved!

It's not every day that one of my open problems is solved, especially one that I asked about over three decades ago. Matt Kovacs-Deak, Daochen Wang and Rain Zimin Yang just posted a paper showing that if you have a Boolean function \(f\) and two polynomials \(p\) and \(q\) of degree at most \(d\) such that \(f(x)=p(x)/q(x)\) for every \(x\) of length \(n\) then \(f\) has decision tree complexity at most \(2d^4\).

Noam Nisan and Mario Szegedy had a beautiful paper in the early 90s showing that if a low-degree polynomial approximates a Boolean function, then the decision tree complexity (the number of bits you need to query from the input to determine the output) is bounded by a polynomial in the degree. For rational functions that isn't true: the majority function can be approximated by the ratio of two low-degree polynomials. This was the main tool used by Beigel, Reingold and Spielman when they showed that the complexity class PP is closed under intersection. 

Motivated by the complexity class C\(_=\)P\(\cap\)co-C\(_=\)P, I asked Mario if a low-degree rational function exactly computed a Boolean function, then would that function have low decision tree complexity. Mario thought about it, said it was an interesting problem, and added it as an open question to the journal version of his paper with Nisan.

I gave the problem to every student who asked me for an open combinatorial problem, as well as to every reasoning AI model I could get my hands on. Iyer et al. posted a paper that examined this problem and showed it holds for symmetric functions, but the general case remained open until now.

The main insight of Kovacs-Deak, Wang and Yang is to consider the minimum block sensitivity, which is usually an uninteresting property of a function. The majority function for example has large maximum and average block sensitivity but low min block sensitivity. Nevertheless, their proof bounded the min block sensitivity by twice the degree of the rational function and used it to get a degree reduction of the rational function by querying a small number of bits of the input. Nice.
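To make the min vs. max distinction concrete: for majority on three bits, the all-ones input has block sensitivity 1 (only blocks of two or more bits flip the answer, and no two such blocks are disjoint), while a balanced input like 110 has block sensitivity 2. A brute-force check, my own illustration and not code from the paper:

```python
from itertools import combinations, product

def maj3(x):
    """Majority of three bits (x is a tuple of 0/1)."""
    return int(sum(x) >= 2)

def flip(x, block):
    """Flip the bits of x at the positions listed in block."""
    return tuple(b ^ (i in block) for i, b in enumerate(x))

def bs_at(f, x, n):
    """Block sensitivity of f at x: the max number of pairwise disjoint
    blocks B with f(x^B) != f(x). Exponential-time, fine for tiny n."""
    sensitive = [frozenset(b) for r in range(1, n + 1)
                 for b in combinations(range(n), r)
                 if f(flip(x, b)) != f(x)]
    best = 0
    def search(count, used, rest):
        nonlocal best
        best = max(best, count)
        for i, blk in enumerate(rest):
            if not (blk & used):               # keep blocks disjoint
                search(count + 1, used | blk, rest[i + 1:])
    search(0, frozenset(), sensitive)
    return best

n = 3
values = [bs_at(maj3, x, n) for x in product((0, 1), repeat=n)]
print(min(values), max(values))  # min block sensitivity 1, max 2
```

So majority already shows the gap the paper exploits: its minimum block sensitivity stays low even where other sensitivity measures are large.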

I always joked that if I told people the complexity application of the problem they would lose interest. The question came out of work with Steve Fenner, Stuart Kurtz and Lide Li on An Oracle Builder's Toolkit. C\(_=\)P is a generalization of co-NP where we ask if a gap-P function (the difference of two #P functions) is zero. So C\(_=\)P\(\cap\)co-C\(_=\)P is a generalization of NP\(\cap\)co-NP. This new theorem implies the technical result that if P = PSPACE unrelativized then P = C\(_=\)P\(\cap\)co-C\(_=\)P relative to a generic oracle. This implies an oracle where P = C\(_=\)P\(\cap\)co-C\(_=\)P and the PH is infinite and another where C\(_=\)P\(\cap\)co-C\(_=\)P has no complete sets.

If you are more of a quantum person, C\(_=\)P\(\cap\)co-C\(_=\)P = PostEQP while PostBQP = PP, and this work shows quite a difference between these classes. PP has complete sets and if P = PP then the polynomial-time hierarchy and even the counting hierarchy collapses to P.

Sunday, January 11, 2026

Is `smells like' commutative?

 
1) Smells Like... Something

In many TV shows having to do with murder (and there are plenty of them), I’ve heard the following exchange:

        His breath smells like bitter almonds. So he was poisoned with cyanide

They’re either saying

        bitter almonds smell like cyanide

or

        cyanide smells like bitter almonds.

If you say X smells like Y, you mean that X is the new smell and Y is the familiar one.  However, on these shows, people seem to smell cyanide a lot,
yet I’ve never seen them smell or taste bitter almonds.  That's good since bitter almonds can be lethal (see here). So there should be mystery stories where bitter almonds are used and the cops say

             His breath smells like cyanide. So he was poisoned with bitter almonds.

I don’t know what either one smells like.

2) Rotten Eggs

In real life: My Darling grew up in Pittsburgh when it was still a steel-mill city.
She said she often smelled something that

        smelled like rotten eggs.

It was sulfur. But in telling me this, she assumes I’ve smelled rotten eggs.
I haven’t. But I have smelled other things that I was told smell like rotten eggs.

I think the phrase

        smells like rotten eggs

is often used by people who’ve never actually smelled rotten eggs.

3) Cardboard and Matzoh

A  blog post by Scott (see here), and my post about his post (see here), brought up the question:

        Does matzoh taste like cardboard?

I doubt any of us have actually tasted cardboard.


My proofreader once accidentally did, while eating takeout from a paper container. He says
(1) it doesn’t taste like matzoh, and
(2) it doesn’t taste like food — which matzoh does.


4) Dog Food

I’ve heard the cliché insult:

        Your cooking is so bad that it tastes like dog food.

I’ve never eaten dog food.  Maybe it tastes good.

5) When X Smells Like Y

If someone says X smells like Y, then:

a) If people know what Y smells like but not X, that’s informative.
b) If people know what X smells like but not Y, that’s not informative.
c) If I hear that X smells like rotten eggs and Y smells like rotten eggs, then I know X and Y smell the same —
even though I don’t know what rotten eggs smell like.
Oh wait — I do. They smell like X or Y!

6) How do the following fit into this discussion?:

a) The Nirvana song Smells Like Teen Spirit, video here.
b) The Weird Al song Smells Like Nirvana, video here.


Friday, January 09, 2026

Computational Depth

I'm posting from Oxford University where I will be spending the "Hilary Term" (through late March) as a visiting fellow at Magdalen College. If you are relatively local, reach out if you'd like to connect.

I plan to get back into research after thirteen years of administration, working primarily with Rahul Santhanam and his group. I haven't had a significant sabbatical or leave since Amsterdam 30 years ago, which is what comes from changing jobs too often.

Today I'd like to talk about a 2006 paper on a topic I first thought about in Amsterdam, one that will likely play a role in this visit: Computational Depth: Concept and Applications by Luis Antunes, Dieter van Melkebeek, Vinod Variyam and myself. 

In Amsterdam, I was hosted by Paul Vitányi and Harry Buhrman at CWI, and naturally worked on Kolmogorov complexity, the algorithmic study of randomness, as well as various problems in computational complexity.

Very simple strings don't have much information. Completely random strings have maximal Kolmogorov complexity but are not particularly useful either, as we can create our own random strings by flipping coins. Is there some way to measure useful information? 

In particular I had this question: If NP reduces to a sparse set then it has polynomial-size circuits. If NP reduces to a random set then it has polynomial-size circuits. Is this just coincidence or is there a broader principle here?

Charlie Bennett developed a notion of logical depth with a similar philosophy, but it is a difficult definition to work with. Luis Antunes, a student from Porto, came to work with me at NEC Research in New Jersey when I was there around the turn of the century. We developed the simpler notion of computational depth as the difference of two Kolmogorov measures, such as time-bounded Kolmogorov complexity minus traditional Kolmogorov complexity. This would be small for both very easy strings and fully random strings. With then NEC postdoc (and now Nebraska professor) Vinod Variyam, we found a connection to finding SAT witnesses, and with DIMACS and IAS postdoc Dieter van Melkebeek (now a Wisconsin professor), we came up with the notion of shallow sets, which generalize both random and sparse sets and, as hoped, have the property that if NP reduces to a shallow set then it has polynomial-size circuits. Luis would title his PhD thesis "Useless Information".
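Kolmogorov complexity is uncomputable, so computational depth cannot be computed exactly. Purely as a toy illustration (my own sketch, not anything from the paper), one can mimic the time-bounded-minus-unbounded difference using a fast, weak compressor as a stand-in for time-bounded complexity and a slower, stronger one for plain complexity:

```python
import random
import zlib

def crude_depth(s: bytes) -> int:
    """Toy proxy for computational depth: a cheap description length minus a
    best-effort description length. Real Kolmogorov complexity is uncomputable;
    zlib levels 1 and 9 are only crude stand-ins for the two measures."""
    fast = len(zlib.compress(s, 1))   # stand-in for time-bounded complexity
    slow = len(zlib.compress(s, 9))   # stand-in for unrestricted complexity
    return fast - slow

# Very easy string: both compressors do well, so the difference is small.
simple = b"a" * 10_000
# Random string: neither compressor does well, so the difference is again small.
rand = bytes(random.getrandbits(8) for _ in range(10_000))

for name, s in [("simple", simple), ("random", rand)]:
    print(name, crude_depth(s))
```

Both the compressor choice and the interpretation are loose analogies; zlib says nothing about true Kolmogorov complexity, but the shape of the definition, small depth at both extremes, comes through.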

Luis and I would later find a nice connection of computational depth to average-case complexity. 

Computational Depth had a resurgence of popularity with the rise of meta-complexity, with 50 of the original paper's 90 citations coming since 2020.

So I hope to find new applications of computational depth working with Rahul who's at the center of meta-complexity. Maybe computational complexity can help us understand machine learning better based on my frozen concentrate analogy, where the learning process removes the randomness, leaving structured information behind. 

Monday, January 05, 2026

AI and Research Papers

2026 will be a year of AI disruption across all of academia. Let's start by talking about how AI is changing how we write research papers. Not the research itself (that's another post), just the dissemination thereof.

Technology has changed research distribution since the invention of paper, and the printing press, typewriters, journals and conferences have all had dramatic effects. We've seen further major changes just within my research career, from authors doing their own formatting (TeX/LaTeX), to instant distribution of papers (email/web), to collaborative writing (email/shared repositories/Overleaf).

How AI affects research writing follows a trend not unlike other AI applications. 

The Slop Phase

This is where AI does more harm than good. It is useful for finding grammar mistakes and making fancy LaTeX tables, but if you use it for writing it may make stuff up, create citations out of thin air and put phrases like "as a large language model" in the text. Things have gotten much better with the latest models (I'm currently a fan of Gemini 3), but we haven't fully left the slop phase, and anything AI-written needs close verification.

The Verification Phase

2026 will be the year we go from humans having to look over AI-generated content to AI having to look over human-generated content. Already you should give your paper, ideally in LaTeX, to the best model you have access to with a prompt asking for a critical review, or use a specialized service. You don't have to follow what the AI says, but you should be open to improving your paper. Google provided an optional tool for those preparing STOC 2026 submissions that many found valuable. In 2026 some conferences and journals will start requiring submitted papers to go through an AI review as well as a human review (for now), especially those overwhelmed by submissions.

At some point, we may require authors, with AI help, to have their theorems verified by theorem provers like Lean or Rocq. It will be a while, especially in computational complexity, where it is hard to properly state a theorem in formal logic, let alone its proof. 

The Collaborative Phase

Not just AI critiquing papers, but researchers and AI working together in paper writing. We already see pieces of this, having AI fill in details of arguments, give a first draft of an introduction, help create diagrams, find relevant citations, etc. Right now anything the AI writes requires a careful lookover by the researcher. As these models improve, we'll see more and more generated by the AI and researchers starting to trust the output, maybe more than they should. This will lead to...

The Generation Phase

Here the AI writes the LaTeX from scratch based on a proof sketch given by the researchers. As an experiment, I "vibe-coded" a minor research paper and it went better than expected. 

The researcher will go and ask for updates and touch up the LaTeX at the end, but at some point the human won't ever touch the LaTeX file, just a prompt or a markdown file that generates it. 

By the end of the 1980s, researchers just trusted the LaTeX for formatting, fixing only syntax errors and overfull boxes. Most researchers (but not all) stopped fiddling with the look of the paper. At some point, and at higher stakes, we'll do the same with the paper content as well.

Friday, January 02, 2026

The Betty White Award for 2025

The Betty White Award goes to people who die at the end of the year--- too late to be on those articles with titles like 

                                                            people we lost this year.

The  point of the award is that news outlets and blogs should WAIT until Jan before having any articles that look back on the prior year. (I tell Lance we should have our end-of-the-year post in January just in case someone solves P vs NP the last week of December. [Lance: I'm not worried])

I made up the award in 2023. More recently, I looked at the web to see who would have won it in past years. My chart is here. As I was writing this post I wondered if there already was a Betty White Award for something else. The response is here. Short version: ChatGPT says that there is an informal award called that for reading cue cards really well on Saturday Night Live. As opposed to my award which is formal(?).

This year's list of winners so far has only one person:

Brigitte Bardot died on Dec. 28, 2025, at the age of 91. She was an actress, singer, model, and activist (she was for animal rights and against interracial marriage---see her Wikipedia page for other views she had). As with most celebrities who die when they are over 90, I suspect many people either don't know who she was or thought she was already dead. For more on her see her Wikipedia page here. Or type her name into chatty, though be warned that you may get some false statements. 

1) This year I am taking NOMINATIONS for additions to the list.  The number of winners can be more than 1. The max in a year has been 4 so far.  SO 

If you know of someone who

a) died between Dec. 20 and Dec. 31

b) is famous

then leave a comment.  DEADLINE: Jan. 8 at noon East Coast time. On Jan. 8 I will update this blog post with more winners if need be. 

I've already looked at Wikipedia's list of who died in 2025, here. If there is either (1)  someone on there who is more famous than I thought (that happened last year with Manmohan Singh) or (2)  there is someone not on the Wikipedia list who is famous to our community, let me know.

2) As noted above, the point of the award is to point out that publications should WAIT until Jan to publish those lists. I am happy to say that Entertainment Tonight has found a good way to handle this---they had an article titled Celebrity deaths 2025: Remembering those who've died this year, but they are updating it (at least the online version). The updated version includes Bardot! See here. Paper publications won't be able to do that. This may not matter---will there still be paper publications at the end of 2026? I blogged on paper-free stuff here.