Sunday, October 13, 2019

The Sheldon Conjecture (too late for Problems with a Point)

Chapter 5 of Problems with a Point (by Gasarch and Kruskal) is about how mathematical objects get their names. If it was an e-book that I could edit and add to (is this a good idea or not? later on that) then I would have added the following.

The Sheldon Conjecture

Background: Nobel Laureate Sheldon Cooper has said that 73 is the best number because

a) 73 is prime.

b) 73 is the 21st prime and note that 7*3=21.

c) The mirror of 73, namely 37, is prime.

d) 37 is the 12th prime, and 12 is the mirror of 21.

Sheldon never quite said its the only such number; that was conjectured by Jessie Byrnes, Chris Spicer, and Alyssa Turnquist here. They called it Sheldon's Conjecture probably since Sheldon Cooper should have conjectured it

Why didn't Sheldon make Sheldon's conjecture? This kind of question has been asked before:

Could Euler have conjectured the prime number theorem

Why didn't Hilbert (or others) pursue Ramsey Theory?

(readers are encouraged to give other examples)

I doubt we'll be asking this about Sheldon Cooper since he is a fictional character.

I am delighted that

a) There is a Sheldon's Conjecture.

b) It has been solved by Pomerance and Spicer, see here

Actually (b) might not be so delightful--- once a conjecture is proven its stops being called by the name of the conjecturer. If you don't believe me just ask Vazsonyi or Baudet. If you don't know who they are then (1) see here and (2) that proves my point. So perhaps I wish it had not been solved so The Sheldon Conjecture would live on as a name.

Another issue this brings up: Lets say that Problems with a Point was an online book that I was able to edit easily. Then I might add material on The Sheldon Conjecture. And while I am at it, I would add The Monty Hall Paradox to the chapter on how theorems get there names. Plus, I would fix some typos and references. Perhaps update some reference. Now lets say that all books were online and the authors could modify them. Would this be good or bad?

1) Good- The book would get better and better as errors got removed.

2) Good- The book would get to include material that is appropriate but came out after it was published.

3) Good- The book would get to include material that is appropriate but the authors forgot to include the first time around.

4) Bad- For referencing the book or for book reviews of the book, you are looking at different objects. The current system has First Edition, Second Edition, etc, so you can point to which one you are looking at. The easily-edited books would have more of a continuous update process so harder to point to things.

5) Bad- When Clyde and I emailed the final version to the publisher we were almost done. When we got the galleys and commented on them we were DONE DONE! For typos and errors maybe I want to fix them online, but entire new sections--- when we're done we are DONE.

6) Bad- at what point is it (i) a corrected version of the old book, (ii) a new edition of the old book, (iii) an entirely new book? Life is complicated enough.

I would prob like a system where you can fix errors but can't add new material. Not sure if that's really a clean distinction.

Thursday, October 10, 2019

William Kruskal's 100th birthday

Today, Oct 10, 2019 is William Kruskal's 100th birthday (he's dead, so no cake. Oh well.) William Kruskal was a great statistician. To honor him we have a guest post by his nephew Clyde Kruskal. We also note that the Kruskal Family is one of the top two math families of all time (see here). William is a reason why the other two Kruskal brothers went into mathematics: As a much older sibling (6 years older than Martin and 8 years older than Joseph), he encouraged their early mathematical development.

Here are some pictures of William Kruskal and of the Kruskal Family: here

Guest Post by Clyde Kruskal

I was asked to blog about my uncle, the statistician, William H. Kruskal, on the centennial of his birth. We called him Uncle Bill. He is best known for co-inventing the Kruskal-Wallis test.

There are two stories that I know about Bill's childhood, which must have been family lore:

(1) As a young child, Bill was a prolific reader. His reading comprehension outstripped his conversational English. One morning, having just read the word ``schedule'' in a book, and obviously having never heard it pronounced, he sat down to breakfast and asked:
"What is the ske·DU·le for today?"

(2) My grandparents once had Bill take an occupational assessment test. The tester said that Bill was a very bright child, and should become a traffic engineer to solve the problems with traffic congestion. (This would have been the 1920s!) As you probably know, Uncle Bill did not succeed in ending traffic congestion. Oh well.

Recently there has been a controversy over whether to ask about citizenship in the 2020 census. In the late 1900s there was a different controversy: whether to adjust the known undercount statistically. In general, Democrats wanted to adjust the count and Republicans did not (presumably because Democratic states tended to have a larger undercount). A national committee was set up to study the issue, with four statisticians in favor and four against. I was surprised to learn that Uncle Bill was on the commission as one of those against adjustment, since, I thought his political views were more closely aligned with those of the Democrats. He was very principled, basing his views only on statistical arguments. I saw him give a talk on the subject, which seemed quite convincing (but, then again, I did not see the other side). They ended up not adjusting.

For more on William Kruskal, in general, and his view on adjusting the census, in particular, see the pointers at the end of this post.

I have more to say. I just hope that I am on the ske·DU·le to blog about Uncle Bill at the bicentennial of his birth.

The William Kruskal Legacy: 1919-2005 by Fienberg, Stigler, and Tanur

A short biography of William Kruskal by J.J. O'Connor and E.F. Robertson

William Kruskal: Mentor and Friend by Judith Tanur

William Kruskal: My Scholarly and Scientific Model by Stephen Fienberg

A conversation with William Kruskal by Sandy Zabell

Testimony for house subcommittee on census and population for 1990 (see page 140)

Monday, October 07, 2019

What comes first theory or practice? Its Complicated!

Having majored in pure math I had the impression that usually the theory comes first and then someone works out something to work in practice. While this is true sometimes it is often NOT true and this will not surprise any of my blog readers.  Even so, I want to tell you about some times it surprised me. This says more about my ignorance than about math or applications or whatnot.

1) Quantum

a) Factoring was proven to be in BQP way before actual quantum computers could do this in reasonable time (we're still waiting).

b) Quantum Crypto- This really is out there. I do not know what came first, the theory or the practice. Or if they were in tandem.

c) (this one is the inspiration for the post)  When I first heard the term Quantum Supremacy I thought it meant the desire for a THEOREM that problem A is in BQP but is provably not in P.  For example, if someone proved factoring is not in P (unlikely this will be proven, and hey- maybe  factoring is in P). Perhaps some contrived problem like those constructed by diagonalization (my spell checker thinks that's not a word. Having worked in computability theory, I think it is. Darn- my spellchecker thinks computability is not word.) Hence when I heard that Google had a paper proving Quantum Supremacy (I do not recall if I actually heard the word  proven) I assumed that there was some theoretical breakthrough. I was surprised and not in the slightest disappointed to find out it involved actual quantum computers.

Question: When the term Quantum Supremacy was first coined, did they mean theoretical, or IRL, or both?

2) Ramsey Theory

a) For Ramsey's Theorem and Van Der waerden's theorem and Rado's theorem and others I could name, first a theorem showed a upper bound on a number, then later computers and perhaps some math got better bounds on that number.

b) Consider the following statement:

For all c there exists P such that for all c-colorings of {1,...,P} there exists x,y,z the same color such that x2 +y2 = z2.

Ronald Graham conjectured the c=2 case and offered $100 in the 1980's. (I do not know if he had any comment on the general case.)  I assumed that it would be proven with ginormous bounds on the P(c) function, and then perhaps some reasonable bound would be found by clever programming and some math. (see here for the Wikipedia Entry about the problem, which also has pointers to other material).

Instead the c=2 case was proven with an exact bound, P(2)=7825, by a computer program, in 2016. The proof is 200 terabytes. So my prediction was incorrect.

As for the result

PRO: We know the result is true for c=2 and we even know the exact bound. Wow! and for Ramsey Theory its unusual to have exact bounds!

CON: It would be good to have a human-readable proof. This is NOT an anti-technology statement. For one thing, a human-readable proof might help us get the result for c=3 and beyond.

3) This item is a cheat in that I knew the empirical results first. However, I will tell you what I am sure I would have thought (and been wrong) had I not know them.

Given k, does the equation

x3 +y3 +z3 = k

have a solution in Z? I would have thought that some hard number theory would determine
for which k it has a solution (with a proof that does not give the actual solutions)  and for then a computer programs would try to find the solutions. Instead (1) some values of k are ruled out by simple mod considerations, and (2) as for the rest, computers have found solutions for some of them. Lipton-Regan (here) and Gasarch (here) have blogged about the k=33 case. Lipton-Regan also comment on the more recent k=42 case.

Thursday, October 03, 2019

Quantum Supremacy: A Guest Post by Abhinav Deshpande

I am delighted to introduce you to Abhinav Deshpande, who is a graduate student at the University of Maryland, studying Quantum Computing. This will be a guest post on the rumors of the recent Google breakthrough on Quantum Supremacy. For other blog posts on this exciting rumor, see Scott Aaronson's postScott Aaronson's second post on itJohn Preskill's quanta articleFortnow's post,
and there may be others.

Guest post by Abhinav:

I (Abhinav) thank Bill Fefferman for help with this post, and Bill Gasarch for inviting me to do a guest post.

The quest towards quantum computational supremacy

September saw some huge news in the area of quantum computing, with rumours that the Google AI Lab has achieved a milestone known as 'quantum computational supremacy', also termed 'quantum supremacy' or 'quantum advantage' by some authors. Today, we examine what this term means, the most promising approach towards achieving this milestone, and the best complexity-theoretic evidence we have so far against classical simulability of quantum mechanics. We will not be commenting on details of the purported paper since there is no official announcement or claim from the authors so far.

What it means

First off, the field of quantum computational supremacy arose from trying to formally understand the differences in the power of classical and quantum computers. A complexity theorist would view this goal as trying to give evidence to separate the complexity classes BPP and BQP. However, it turns out that one can gain more traction from considering the sampling analogues of these classes, SampBPP and SampBQP.  These are classes of distributions that can be efficiently sampled on classical and quantum computers, respectively. Given a quantum circuit U on n qubits, one may define an associated probability distribution over 2^n outcomes as follows: apply U to the fiducial initial state |000...0> and measure the resulting state in the computational basis. This produces a distribution D_U.

A suitable way to define the task of simulating the quantum circuit is as follows:

Input: Description of a quantum circuit U acting on n qubits.

Output: A sample from the probability distribution D_U obtained by measuring U|000...0> in the computational basis.

One of the early works in this field was that of Terhal and DiVincenzo, which first considered the complexity of sampling from a distribution (weak simulation) as opposed to that of calculating the exact probability of a certain outcome (strong simulation). Weak simulation is arguably the more natural notion of simulating a quantum system, since in general, we cannot feasibly compute the probability of a certain outcome even if we can simulate the quantum circuit. Subsequent works by Aaronson and Arkhipov, and by Bremner, Jozsa, and Shepherd established that if there is a classically efficient weak simulator for different classes of quantum circuits, the polynomial hierarchy collapses to the third level.

So far, we have only considered the question of exactly sampling from the distribution D_U. However, any realistic experiment is necessarily noisy, and a more natural problem is to sample from a distribution that is not exactly D_U but from any distribution D_O that is ε-close in a suitable distance measure, say the variation distance.

The aforementioned work by Aaronson and Arkhipov was the first to consider this problem, and they made progress towards showing that a special class of quantum circuits (linear optical circuits) is classically hard to approximately simulate in the sense above. The task of sampling from the output of linear optical circuits is known as boson sampling. At the   time, it was the best available way to show that quantum computers  may solve some problems that are far beyond the reach of classical computers.

Even granting that the PH doesn't collapse, one still needs to make an additional conjecture to establish that boson sampling is not classically simulable.  The conjecture is that additively approximating the output probabilities of a random linear optical quantum circuit is #P-hard.  The reason this may be true is that output probabilities of random linear optical quantum circuits are Permanents of a Gaussian random matrix, and the Permanent is as hard to compute on a random matrix as it is on a worst-case matrix. Therefore, the only missing link is to go from average-case hardness of exact computation to average-case hardness of an additive estimation. In addition, if we make a second conjecture known as the "anti-concentration" conjecture, we can show that this additive estimation is non-trivial: it suffices to give us a good multiplicative estimation with high probability.

So that's what quantum computational supremacy is about: we have a computational task that is efficiently solvable with quantum computers, but which would collapse the polynomial hierarchy if done by a classical computer (assuming certain other conjectures are true). One may substitute "collapse of the polynomial hierarchy" with stronger conjectures and incur a corresponding tradeoff in the likelihood of the conjecture being true.

Random circuit sampling

In 2016, Boixo et al. proposed to replace the class of quantum circuits for which some hardness results were known (commuting circuits and boson sampling) by random circuits of sufficient depth on a 2D grid of qubits having nearest-neighbour interactions. Concretely, the proposed experiment would be to apply random unitaries from a specified set on n qubits arranged on a 2D grid for sufficient depth, and then sample from the resulting distribution. The two-qubit unitaries in the set are restricted to act between nearest neighbours, respecting the geometric This task is called random circuit sampling (RCS).

At the time, the level of evidence for the hardness of this scheme was not yet the same as the linear optical scheme. However, given the theoretical and experimental interest in the idea of demonstrating a quantum speedup over classical computers, subsequent works by Bouland, Fefferman, Nirkhe and Vazirani, and Harrow and Mehraban bridged this gap (the relevant work by Aaronson and Chen will be discussed in the following section). Harrow and Mehraban proved anticoncentration for random circuits. In particular, they showed that a 2-dimensional grid of n qubits achieve anticoncentration in depth O(\sqrt{n}), improving upon earlier results with higher depth due to Brandao, Harrow and Horodecki. Bouland et al. proved the same supporting evidence for RCS as that for boson sampling, namely a worst-to-average-case reduction for exactly computing most output probabilities, even without the permanent structure possessed by linear optical quantum circuits.


So far, we have not discussed the elephant in the room: of verifying that the output distribution supported on 2^n outcomes. It turns out that there are concrete lower bounds such as those due to Valiant and Valiant, showing that verifying whether an empirical distribution is close to a target distribution is impossible if one has few samples.

Boixo et al. proposed a way of certifying the fidelity of the purported simulation. Their key observation was to note that if their experimental system is well modelled by a noise model called global depolarising noise, estimating the output fidelity is possible with relatively few outcomes. Under global depolarising noise with fidelity f, the noisy distribution takes the form D_N = f D_U + (1-f) I, where I is the uniform distribution over the 2^n outcomes. Together with another empirical observation about the statistics of output probabilities of the ideal distribution D_U, they argued that computing the following cross-entropy score would serve as a good estimator of the fidelity:

f ~ H(I, D_U) - H(D_exp, D_U), where H(D_A,D_B) is the cross-entropy between the two distributions: H(D_A, D_B) = -\sum_i p_A log (p_B).

The proposal here was to experimentally collect several samples from D_exp, classically compute using brute-force the probabilities of these outcomes in the distribution D_U, and estimate the cross-entropy using this information. If the test outputs a high score for a computation on sufficiently many qubits and depth, the claim is that quantum supremacy has been achieved.

Aaronson and Chen gave alternative form of evidence for the hardness of scoring well on a test that aims to certify quantum supremacy similar to the manner above. This sidesteps the issue of whether a test similar to the one above does indeed certify the fidelity. The specific problem considered was "Heavy Output Generation" (HOG), the problem of outputting strings that have higher than median probability in the output distribution. Aaronson and Chen linked the hardness of HOG to a closely related problem called "QUATH", and conjectured that QUATH is hard for classical computers.

Open questions

Assuming the Google team has performed the impressive feat of both running the experiment outlined before and classically computing the probabilities of the relevant outcomes to see a high score on their cross-entropy test, I discuss the remaining positions a skeptic might take regarding the claim about quantum supremacy.

"The current evidence of classical hardness of random circuit sampling is not sufficient to conclude that the task is hard". Assuming that the skeptic believes that the polynomial hierarchy does not collapse, a remaining possibility is that there is no worst-to-average-case reduction for the problem of *approximating* most output probabilities, which kills the proof technique of Aaronson and Arkhipov to show hardness of approximate sampling.

"The cross-entropy proposal does not certify the fidelity." Boixo et al. gave numerical evidence and other arguments for this statement, based on the observation that the noise is of the global depolarising form. A skeptic may argue that the assumption of global depolarising noise is a strong one.

"The QUATH problem is not classically hard." In order to give evidence for the hardness of QUATH, Aaronson and Chen examined the best existing algorithms for this problem and also gave a new algorithm that nevertheless do not solve QUATH with the required parameters.

It would be great if the community could work towards strengthening the evidence we already have for this task to be hard, either phrased as a sampling experiment or together with the verification test.

Finally, I think this is an exciting time for quantum computing and to witness this landmark event. It may not be the first probe of an experiment that is "hard" to classically simulate, since there are many quantum experiments that are beyond the reach of current classical simulations, but the inherent programmability and control present in the experimental system is what enables the tools of complexity theory to be applied to the problem. A thought that fascinates me is the idea that we may be exploring quantum mechanics in a regime never probed this carefully before, the "high complexity regime" of quantum mechanics. One imagines there are important lessons in physics here.

Monday, September 30, 2019

Richard Guy is 103 years old today

Richard Guy is a mathematician. He co-authored the classic book Winning Ways for your Mathematical Plays with Elywn Berlekamp and John Conway.

On Sept 30 (today) he turned 102. According to this list he is the oldest living mathematician, and he would need to live to 110 to be the oldest mathematician ever.

I have met him twice. He was at the Gathering for Gardner Conference as a young 98-year old. I told him that his book Winning Ways had a great influence on me. He asked it if was positive or negative. I later saw him at a Math conference where he went to my talk on The Muffin Problem. So he is still active.

His Wikipedia site says that he says he regards himself as an Amateur Mathematician. While it is awkward to disagree with how someone sees himself, I'll point out that he is an author or co-author of 11 books, has many papers, and has solved Erdos Problems. He has taught some but I couldn't really find out what his source of income is or was. This takes us back to the word `amateur' which has several meanings:

Amateur: Someone who does X for the love of X (Amor is Love in Spanish), and not for money. This could be true of Richard Guy. This notion of amateur may be lost on my younger readers since this it used to be a thing to NOT take money since it somehow soils what you do. In those days Olympic athletes could not have played professionally beforehand. We can't imagine that now.

Amateur: Someone who dabbles in something but is not really that good. This could NOT be true of Richard Guy.

Aside from games he has also worked in Number Theory. His book Unsolved Problems in Number Theory has inspired many (including me).

So happy birthday Richard Guy!

He is the also the oldest living person we have honored on this blog. Second oldest was Katherine Johnson, see who is still alive.

ADDED LATER- some people emailed me if Richard Guy was still actively doing mathematics. Here is a recent paper of his: here

Thursday, September 26, 2019

Quantum Supremacy

By now you've probably heard the rumors of Google achieving quantum supremacy. I don't have inside information outside of Scott's blog post but it looks like the news should be embargoed until the release of a Science or Nature paper. These things usually happen on a Tuesday and you'd think they would avoid the news of the Nobel Prize announcements October 7-14.

Since for now the Google paper doesn't officially exist, we live in an era of Classical Dominance. Any problem that can be solved on a quantum computer today, can be solved just as fast or faster on a traditional computer. Quantum Supremacy, despite its lofty name, is just the negation of Classical Dominance, that there is some problem that a current quantum machine can solve that all our regular machines would require a considerably longer time to solve. This isn't a formal mathematical or scientific definition, so one can debate when or if we cross this threshold and I'm sure people will.

Quantum Supremacy might not even be a monotone concept. Future classical algorithms might solve the problem quickly, leading us back to Classical Dominance but leaving open the possibility of returning to Quantum Supremacy with another problem.

Quantum Supremacy is a long way from Quantum Usefulness, where quantum machines can solve problems we care about faster that traditional machines. Quantum computing will truly reach its potential when we can run general quantum algorithms like Shor's algorithm to factor products of large primes. We'll probably never see Quantum Dominance where classical transistors go the way of vacuum tubes.

Nevertheless, quantum supremacy is an important step and whether or not you think Google has gotten there, I'm sure it's an incredible achievement of science and engineering.

Monday, September 23, 2019

Applicants to Grad School are too good. Here is why this might be a problem.

Sitting around with three faculty we had the following conversation

ALICE: When I applied to grad school in 1980 they saw a strong math major (that is, I had good grades in hard math courses) but very little programming or any sort of computer science. That kind of person would NOT be admitted today since there are plenty of strong math majors who ALSO have the Comp Sci chops.

BOB: When I applied to grad school I was a comp sci major but my grades were not that good- A's in system courses, B's and even some C's in math. But I did a Security project that lead to a paper that got into a (minor) systems workshop. Two of my letters bragged a lot about that. (How do I know that? Don't ask.) So I got into grad school in 1989. That kind of person would NOT be admitted today since there are plenty of people who have papers in minor conferences whose grades ARE good in stuff other than their area.

CAROL: In 1975 I was an English major at Harvard. The summer between my junior and senior year I took a programming course over the summer and did very well and liked it. I then took some math. Then I worked in industry at a computer scientist for 5 years. Then I applied to grad school and they liked my unusual background. Plus I did well on the GRE's. Letters from my boss at work helped me, I don't think they would count letters from industry now. They took a chance on me, and it paid off (I got a PhD) but I don't think they would let someone like me in now since they don't have to take a chance. They can admit people who have done research, have solid backgrounds, and hence are not taking a chance. The irony is that some of those don't finish anyway.

1) Are Alice, Bob, and Carol right that they wouldn't be admitted to grad school now? I think they are with a caveat- they might end up in a lower tier grad school then they did end up in. Also, Alice and Bob I am more certain would not end up in the top tier grad schools they did since they can be compared DIRECTLY to other applicants,
where as Carol might be more orthogonal.

2) I have a sense (backed up my no data) that we are accepting fewer unusual cases than we used to (not just UMCP but across the country) because too many of the applicants are the standard very-good-on-paper applicants. Even the on-paper is not quite fair- they ARE very good for real.

3) Assume we are taking less unusual cases. Is that bad? I think so as people with different backgrounds (Carol especially) add to the diversity of trains of thought in a program, and that is surely a good thing. If EVERY students is a strong comp sci major who has done some research, there is a blandness to that.

4) What to do about this? First off, determine if this is really a problem. If it is then perhaps when looking at grad school applicants have some sensitivity to this issue.

5) For grad school admissions I am speculating. For REU admissions (I have run an REU program for the last 7 years and do all of the admissions myself) I can speak with more experience. The students who apply have gotten better over time and this IS cause for celebration; however, it has made taking unusual cases harder.

Monday, September 16, 2019

this paper from 2015 cracks Diffie-Hellman. What to tell the students?

I am teaching cryptography this semester for the second time (I taught it in Fall 2019) and will soon tell the students about the paper from 2015:
Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice. There are 14 authors.

The upshot is that as Diffie-Hellman was implemented in 2015, many cases were crackable. In summary (and probably too simple):

DH in a 512-bit group can be cracked by the authors

DH in a 1024-bit group they speculate can be cracked with nation-state resources.

Is this a big deal? If YES then what is being done, and if NOT then why not?

I have come up with some statements that I DO NOT KNOW if they are true, but I am ASKING you, to shed some light on the BIG DEAL or NO BIG DEAL question. (Note- Idea for a game show: BIG DEAL or NO BIG DEAL where contestants are asked if a news story is a BIG DEAL or not.)

So, please comment on the following question:

1) Since 2015 the people who use DH have upped their game and are now using bigger parameters. (I doubt this is true)

2) DH is mostly not used on things that hackers are not interested in, so this is not a big deal.

3) The expertise required to crack DH via this paper is rather difficult, so hackers don't have the skills.

4) This paper is not a problem for a bad reason: Hackers don't need to use the number field sieve DL algorithm when all they need to do is (1) guess that the pin numer is 1234 or the year the user was born (or close to it), (2) put on a uniform from Geek-Squad or some such organization and claim they are here to help, (3) exploit a known security flaw that the company has not bothered fixing.

5) The 14 authors have mysteriously disappeared. (I doubt this is true.)

(Misc: My spell checker thinks that Diffie and crackable are not words, but Hellman is.)