Monday, May 20, 2019

Notorious L.A.H or Notorious LAH? OR You always need one more proofread

I noticed a while back that even on the nth proofread of a document there are still corrections. So I decided to keep track of how many corrections there are in a paper I was working on. I chose a non-technical paper so that errors-in-the-math would not be the issue.  I chose

                                           Guest Column: The Third P =? NP Poll (see here)

that appeared in Lane Hemaspaandra's SIGACT News Complexity Column.

I kept track of the following:

1) Number of corrections. Anything that I changed. Could be style, a new thought, need not be (though could be) an error.

2) Errors. These are things that really need to be corrected, like having `think' instead of `thing'.

Corrections vs  Errors, an Example:

If I refer to Lane Hemaspaandra as Notorious L.A.N, that is a correction and an error, as he is Notorious L.A.H.

If I refer to Lane Hemaspaandra as Notorious L.A.H and decide to change it to LAH, that is a correction that is not an error.

I didn't keep track of serious errors vs typos, but after the first 3 proofreads there were no more serious errors (sort of: you'll see). The most serious was a fonts-gone-wild thing where half the paper was in boldface.

Here is a history of the number of corrections:

1) Lane proofread the first draft. κ corrections, where κ is some cardinal between the cardinality of N and the cardinality of 2^N. Its value depends on which model of set theory you are in. (My spellchecker thinks that cardinality is not a word. I checked and I am spelling it correctly, but perhaps it's one of those things where I stare at it too much and keep misreading it.)

Henceforth I omit the word proofread, as it is understood.


2) Bill G:  81 corrections, 29 of which were errors.

3) Clyde: 64 corrections,  of which 17 were errors.

4) Bill G: 40 corrections, of which 21 were errors. (I had added a new section, causing more errors.)

5) Clyde: 30 corrections of which 10 were errors.

6) Bill G: 24 corrections of which 6 were errors.

7) Clyde: 18 corrections of which 8 were errors.

8) David Sekora (a CS grad student at Maryland who at one time wanted to be an English major): 15 corrections, of which 15 were errors. Really! Typos, dagnabbit! (Spell check thinks that dagnabbit is spelled wrong. Um, in that case, what is the correct spelling?)

9) Nathan Grammel (a CS grad student at Maryland): 6 corrections, of which 3 were errors.

10) Bill G, proofreading backwards, a paragraph at a time: 29 corrections of which 5 were errors.

11) Justin Hontz, an ugrad who TAs for me: 10 corrections of which 7 were errors.

12) Karthik Abinav, a grad student in theory at Maryland: 2 corrections both of which were errors. Was this the end or are there still issues?

13) Josh Twitty, an ugrad who TAs for me: 0 corrections. YEAH!

14) Dan Smolyak, an ugrad CS and Econ major: 4 corrections, all 4 errors. Error sounds too strong. For example, one of them was to replace ?. with ?  Yes, it's an error, but not that important. It DOES point to his carefulness as a proofreader.

15) Clyde Kruskal: 20 corrections, 10 of which were errors. To call them errors seems wrong when he corrects `Group theory' to `Group Theory'. None of these corrections were caused by prior comments. I think all of the errors were in the paper early on, undetected until now!

16)  Backwards Bill G again:  28 corrections,  14 of which were errors. Again, the errors were minor. Most of the errors were relatively recent. As an example, if I list out topics in math like:

a) Group Theory, Set Theory, and  Ramsey Theory

then I am supposed to use capital letters, but if I say in prose

Lance Fortnow thinks that the techniques used will be group theory, set theory, and Ramsey theory

then only the R in Ramsey theory is in caps. Makes me glad I'm in math.

17) Lane got the penultimate proofread. Lane found 75 (yes, 75, WOW) corrections, of which 66 (yes, 66, WOW) were errors. Many of these were spacing and LaTeX things that I would never have noticed (indeed, I didn't notice), that most readers would not have noticed (hmmm, how do I know that?), and that only an editor could catch (hmmm, when I've edited the book review column, and now the open problems column, I never found more than 10 errors). So when all is said and done: KUDOS to Lane! My point was that you can never get all the errors out. On that I am correct. I wonder if there are still errors? Yeah, but at most 10. However, I said that BEFORE giving it to Lane.

18) Stephen Fenner, the editor of SIGACT News, got the FINAL proofread. He found that I spelled his name wrong. How many errors are left? I would bet at most 10. I would bet that I would lose that bet.
------------------------------------------------------------------------------------------------------------

Why after multiple proofreadings are there still errors? (My spell check thinks proofreadings is not a word. Maybe my spell check is worried that if people get documents proofread a lot then they won't be needed anymore. This blog post refutes that thought.)

1) An error can occur from a correction. This caused a massive problem with another paper. Lane's next column will be by me and co-authors, on The Muffin Problem. We had all kinds of problems with the colors and sizes: Massive Magenta Muffins or Miniature Magenta Muffins? Sizes gone wild! Again, kudos to my proofreaders and to Lane for catching this rather important error.

2) If some passage is added late in the process it will surely have errors.

3) An error correction may clear away the brush so you can see other errors.

4) With LaTeX (or Word for some) we have the ability to get things perfect. So there is no cost to keeping on perfecting things. This leads to many corrections that are not errors.

5) I know of an adviser who would first say change A to B, and later change B back to A. (None of that happened with the paper discussed above).

Are errors inevitable? Clyde Kruskal tells me that his father, Martin Kruskal, as a teenager, read Courant and Robbins's book What is Mathematics? and found some errors in it. Martin's mother didn't believe him and marched him over to Courant's house:

MARTIN'S MOTHER: Martin claims to have found errors in your book.

COURANT:  (laughs) There are errors in every book.

Courant was so impressed that ten (or so) years later he became Martin's PhD adviser.


Thursday, May 16, 2019

Getting Through the Clutter

My daughter showed up at her doctor's office the other day and found it closed. She complained that she had received all these alerts, texts, emails, and voicemails reminding her of the appointment, and then they weren't there when she showed up. She had stopped reading the alerts, the last of which said the appointment needed to be rescheduled.

We all get too many alerts. I just assume many of the emails I send to faculty never get read or remembered. I get many requests to mention conferences, workshops, and other theory happenings on this blog because nobody knows how to get the word out through the clutter. If we followed through on all these requests, this blog would become clutter itself.

Back in 2009, Nikhil Devanur and I wrote a paper trying to capture this phenomenon using complexity. Building on Levin's notion of universal search, we defined the unawareness of a string x in an environment E as the amount of time needed to generate x from a context c given an oracle for E. Levin showed that one can have a universal algorithm, only a constant factor slower to generate x than any other algorithm. One should think of E as the sum total of all our knowledge, including search engines on the Internet. Technically we should have used the term "attention" instead of "awareness".
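Roughly, and in the spirit of Levin's Kt measure (this is a sketch of the flavor of the definition, not necessarily the exact one from the paper):

$$\mathrm{unaware}_E(x \mid c) \;=\; \min\left\{\, |p| + \log t \;:\; U^E(p, c) \text{ outputs } x \text{ within } t \text{ steps} \,\right\}$$

where U is a universal machine with oracle access to the environment E, and p ranges over programs run on the context c.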

One example is using a calendar as part of your environment E that reminds you of an event x on a date c. We use calendars, contact lists, email searches and many other ways to keep strings we need to remember. Advertisers try to alter your E to get the unawareness of x down.

This was one of those papers that didn't go far because we didn't have great applications for the definition. But it follows a good philosophy: when life seems complex, model it.

Monday, May 13, 2019

Ronald Graham's other large number. Well, it was large in 1964 anyway.

Graham's number (see here) was at one time the largest number to appear in a math proof.

a) GN was an upper bound on a problem in Ramsey theory. There are now better upper bounds; see here. These better upper bounds are still large (Hales-Jewett large), but that's already much smaller than the original GN.

b) Other proofs now have numbers even larger than GN, for example Harvey Friedman's work on the finite versions of Kruskal's Tree Theorem. (There may be other cases; if you know some then let me know in the comments.)

Since my dept recently moved buildings I found old papers that I had not looked at in years. One of them was

Old and New Problems and Results in Combinatorial Number Theory

by Erdos and Graham

(see here)

So I began reading it and came across a result of Graham from 1964 that used large numbers. Nowhere near as large as GN, but I found it interesting that Graham was involved with large numbers way back then.

Here is the problem:

A Lucas Sequence is a sequence that obeys

a(n) = a(n-1) + a(n-2).

Clearly such a sequence is determined by a(0) and a(1).

QUESTION: Do there exist a(0) and a(1), relatively prime, such that the sequence contains only composite numbers?

By ingenuity and some computing power, Graham found YES. For how he got the numbers, see here. The numbers are of course in the paper, and how he got them is interesting, but I present them here anyway. Hope I don't make a typo:

a(0) = 1786772701928802632268715130455793

a(1) = 1059683225053915111058164141686995
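With a modern computer it's easy to spot-check this claim. Here is a quick sketch in Python (not from the paper; it assumes the sympy package for primality testing):

```python
# Sketch: check that Graham's starting values are relatively prime and
# that the first few hundred terms of the sequence are all composite.
from math import gcd
from sympy import isprime

a0 = 1786772701928802632268715130455793
a1 = 1059683225053915111058164141686995

print("relatively prime:", gcd(a0, a1) == 1)

a, b = a0, a1
for n in range(300):
    if isprime(a):               # if Graham is right, this never fires
        print("term", n, "is prime!")
    a, b = b, a + b              # a(n) = a(n-1) + a(n-2)
print("checked 300 terms")
```

Of course a finite check proves nothing about the infinite sequence; the point of Graham's proof is to handle every term at once.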

The paper Old and New... says it's open whether there is a smaller pair of numbers; I do not know if that is still open. If you know, let us all know in the comments!

These numbers seem small today since we have modern computers that can store and manipulate them easily. Were they considered large numbers in 1964? They were never called Graham Numbers, which is probably just as well, since that honor lay ahead.


Thursday, May 09, 2019

Multiple Provers

Just over thirty years ago, on May 5, 1989, I defended my PhD thesis Complexity-Theoretic Aspects of Interactive Proof Systems. It starts off with a parable for interactive proof systems.
Victor, a venture capitalist, had everything a man could desire: money, women and power. But he felt something missing. He decided he lacked knowledge. So Victor packed up his bags and headed to the Himalayas in search of ultimate truths. The natives pointed Victor to a tall mountain and mentioned rumors of a great man full of wisdom. Victor, who smartly brought some climbing equipment, tackled the mountain until he reached a small cave near the summit. There Victor found the great Pulu, grand guru of all that is known. Victor inquired about some ultimate truths and Pulu responded, "I will teach you, but you must not trust my words." Victor agreed and found he learned much, even though he had to verify all the sayings of the great Pulu. Still Victor lacked complete happiness, and he asked if he could learn knowledge beyond what he could learn in this manner. The grand guru replied, "You may ask and I will answer." Victor pondered this idea for a minute and said, "Since you know all that is known, why can you not predict my questions?" A silence reigned over the mountain for a short while until the guru finally spoke: "You must use other implements, symbols of your past life." Victor thought for a while, reached into his backpack, and brought out some spare change he had unwittingly carried with him. Even the great Pulu cannot predict the flip of a coin. Victor started flipping the coins to ask the guru and wondered, "What can I learn now?"
Without the coins, one gets the complexity class NP. My thesis didn't answer the last question, but by the end of the year Shamir, building on work of Lund, Fortnow, Karloff and Nisan, showed this class IP was equal to PSPACE, the problems we can solve in a polynomial amount of memory.

Part of my thesis explored the class MIP, where we had multiple Pulus (provers) on different mountain tops, unable to communicate. The news was disappointing: we failed to get a PSPACE upper bound for MIP, only NEXP (nondeterministic exponential time), and our proof that two provers suffice relied on a bad assumption about how errors get reduced when you run multiple protocols in parallel. Later Babai, Lund and I showed MIP = NEXP, and Ran Raz showed parallel repetition does reduce the error sufficiently.

Back in the 80's we didn't even imagine the possibility that the Pulus had shared entangled quantum bits. Does the entanglement allow the provers to cheat, or does it allow them to prove more things? Turns out much more: a new result by Anand Natarajan and John Wright shows that MIP* (MIP with classical communication, a classical verifier, and two provers sharing previously entangled quantum bits) can verify everything in NEEXP, nondeterministic double exponential time. And this is only a lower bound for MIP*; possibly one can do even more.

Neat to see that my three-decade-old thesis explored ideas that people are still thinking about today.

Monday, May 06, 2019

Thoughts on the recent Jeopardy streak (SPOILERS)

James Holzhauer has won 22 consecutive games of Jeopardy and has made around 1.6 million dollars. Nice work if you can get it. Here are some thoughts on this.

1) Before James H the record for number of consecutive games was, and still is, Ken Jennings's 74 in a row; second place was 20. I was surprised that Ken was that much better than the competition.

2) Before James H the record for amount of money won in normal play (not extra from, say, the Tournament of Champions, or losing to a computer) was around $400,000. I was surprised that Ken was that much better than the competition.

3) James is obliterating the records for most money won in a single game. He holds the top 12 single-game totals. This is due to his betting A LOT on the Daily Doubles and Final Jeopardy, as well as, of course, answering so many questions right.

4) One reason players on Jeopardy don't have long streaks is fatigue. They actually tape 5 games a day, two days a week. James H is getting a break since he now has two weeks off while they run the Teachers Tournament. This could work either way: he gets a break, or he loses being in-the-zone.

5) James's strategy is:

a) Begin with the harder (and more lucrative) questions.

b) Bet A LOT on the Daily Doubles (which are more likely to be among the more lucrative questions) and almost always go into Final Jeopardy with more than twice your opponent's score. (He failed to do this only once.)

c) Bet A LOT on Final Jeopardy, though not so much that if you miss you lose the game; a sketch of the arithmetic follows this list. I think he's gotten every Final Jeopardy question right.
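To make (c) concrete, here is a sketch of the arithmetic (the function and the example numbers are mine, just for illustration): with more than twice your opponent's score going into Final Jeopardy, there is a largest bet that cannot cost you the game, even if you miss and the opponent doubles up.

```python
# Sketch: largest Final Jeopardy bet that cannot cost you the game,
# assuming you lead with more than twice the opponent's score.
def max_safe_bet(my_score: int, opp_score: int) -> int:
    # Worst case: I miss (losing my bet) and the opponent doubles up.
    # I still win outright as long as my_score - bet > 2 * opp_score.
    return my_score - 2 * opp_score - 1

# Example: a $40,000 leader facing a $19,000 opponent can risk $1,999.
print(max_safe_bet(40_000, 19_000))  # -> 1999
```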

For more on his strategy, see this article by Oliver Roeder on Nate Silver's blog: here

6) I tend to think of this as a high-risk, high-reward strategy, and thus it is unlikely he will beat Ken Jennings, but every time he wins that thought seems sillier and sillier. While we are here, how likely is it that someone will beat Ken Jennings? In an article written before all of this, Ben Morrison on Nate Silver's blog wrote that it was quite likely SOMEONE would break Ken J's record; see here.

7) OKAY, how does James H compare to Ken J? According to Oliver Roeder on Nate Silver's blog, here, they are similar in terms of the percentage of questions answered right, but James H bets so much more (bets better?), which is why he is getting so much money. I'll be curious to see a head-to-head contest at some point. But to the issue at hand: they don't give James H that good a chance to break Ken J's record.

8) Jeop used to have a 5-game limit. Maybe that was a good idea: it's not that interesting seeing the same person with the same strategy win 22 in a row. Also, the short talk with Alex T: James is running out of interesting things to say. I wonder what Alex did with Ken J after 50 games.
``So Ken, I hear you're good at Jeopardy''

9) Misc: Ken J was the inspiration for IBM to do Watson.

10) Will future players use James's strategy? Note that you have to be REALLY GOOD in the first place for it to help you. Maybe a modified version where you go for the lucrative questions and bet a lot on Daily Doubles (more than people have done in the past) when it's an area you know really well (I'll take Ramsey Theory for $2000).

11) I used to DVR Jeopardy and watch it later, and didn't mind if I was a few episodes behind. Now I have to stay on top of it so that articles like those pointed to above don't give me a spoiler.

12) My prediction: he will beat Ken Jennings for money but not for number of games. I have no real confidence in these predictions.

Thursday, May 02, 2019

The Next Chapter


I've accepted a position as Dean of the College of Science at the Illinois Institute of Technology in Chicago, starting in August. It's an exciting opportunity to really build up the sciences and computing in the city where I have spent the bulk of my academic career and that I grew to love.

I had a fantastic time at Georgia Tech over the last seven years working with an incredible faculty, staff and students in the School of Computer Science. This is a special place and I enjoyed watching the school, the institute and the City of Atlanta grow and evolve.

After I tweeted the news yesterday, Bill Cook reminded me that
Illinois Tech was the long-time home of Karl Menger, the first person to pose the problem of determining the complexity of the TSP. Now you can settle it!
I wouldn't bet on my settling the complexity of traveling salesman even if I didn't have a college to run. But it reminds us that wherever you go in life, P and NP will be right there waiting for you.

Sunday, April 28, 2019

x^3 + y^3 + z^3 = 33 has a solution in Z. And it's big!

Consider the following problem:

Given k, a natural number, determine if there exist INTEGERS x, y, z such that x^3 + y^3 + z^3 = k.

It is not obvious that this problem is decidable. (I think it is, but I have not been able to find an exact statement to that effect; however, if it were known to be undecidable I would know that, hence I suspect it is decidable. If you know a ref, give it in the comments.)


If k ≡ 4, 5 (mod 9) then a mod argument easily shows there is no solution: every cube is ≡ 0, 1, or 8 (mod 9), and no sum of three such residues is ≡ 4 or 5 (mod 9). (A sketch of this computation appears after the list.) Huisman showed that if k ≤ 1000, k ≡ 1, 2, 3, 6, 7, 8 (mod 9), and k is NOT one of

33, 42, 114, 165, 390, 579, 627, 633, 732, 795, 906, 921, 975

then there is a solution with max(|x|, |y|, |z|) ≤ 10^15. For those on the list it was unknown.
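For the skeptical, here is a quick sketch of that mod 9 computation:

```python
# Sketch: cubes mod 9 land in {0, 1, 8}, and no sum of three such
# residues is congruent to 4 or 5 mod 9.
cubes = {pow(n, 3, 9) for n in range(9)}    # {0, 1, 8}
sums = {(a + b + c) % 9 for a in cubes for b in cubes for c in cubes}
print(sorted(sums))   # [0, 1, 2, 3, 6, 7, 8]: 4 and 5 never occur
```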

Recently Booker (not Cory Booker, the candidate for prez, but Andrew Booker, who I assume is a mathematician or computer scientist and is not running for prez) showed that

x^3 + y^3 + z^3 = 33

DOES have a solution in INTEGERS. It is

x = 8,866,128,975,287,528

y = -8,778,405,442,862,239

z = -2,736,111,468,807,040
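Python's arbitrary-precision integers make this easy to verify; a quick sketch:

```python
# Sketch: verify Booker's triple really sums to 33.
x = 8866128975287528
y = -8778405442862239
z = -2736111468807040
print(x**3 + y**3 + z**3)  # -> 33
```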


Does that make us more likely or less likely to think that

x^3 + y^3 + z^3 = 42

has a solution? How about k = 114, and the others on the list?

Rather than say what I think is true (I have no idea) here is what I HOPE is true: that the resolution of these problems leads to some mathematics of interest.





Thursday, April 25, 2019

Geo-Centric Complexity

We had an interesting discussion during Dagstuhl last month about the US-centric view of theory. It's bad enough that all talks and papers in an international venue are in English, but we also have
  • Manhattan Distance. How are foreigners supposed to know about the structure of streets in New York? What's wrong with grid distance?
  • Las Vegas Algorithms. I found this one a little unfair; after all, Monte Carlo algorithms came first. Still, today might not Macau algorithms make sense?
  • Arthur-Merlin Games. A British reference by a Hungarian living in the US (László Babai, who also coined the term Las Vegas algorithms). Still, the Chinese might not know the fables. Glad the Europeans don't remember the Pat and Vanna terminology I used in my first STOC talk.
  • Alice and Bob. The famous pair of cryptographers, but how generically American can you get? Why not Amare and Bhati?
I'm of two minds here. We shouldn't alienate or confuse those who didn't grow up in an Anglo-American culture. On the other hand, I hate to have to try and make all terminology culturally neutral; you'd just end up with technical and ugly names, like P and NP.