Tuesday, January 16, 2018

Donald Knuth Turns 80 years and 6 days

Celebrating Donald Knuth's 80th birthday, or his 80 years + 6 days birthday, seems odd. Should we use powers of 2? Hmm- too few, just 32 and 64 really. And having a 32-year celebration for someone is unusual. How about numbers that end in 0 in base 8? 64 would be 100, 72 would be 110, 80 would be 120 so AH- we would be celebrating! So let's Celebrate!
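The base 8 arithmetic is easy to check. Here is a minimal Python sketch (the variable name `octal_birthdays` is made up for illustration) that lists the ages up to 100 whose base 8 representation ends in 0, which are just the multiples of 8:

```python
# Ages from 1 to 100 whose base 8 representation ends in the digit 0.
# A number ends in 0 in base 8 exactly when it is divisible by 8.
octal_birthdays = {age: format(age, "o") for age in range(1, 101) if age % 8 == 0}

for age, octal in octal_birthdays.items():
    print(f"{age} in base 10 is {octal} in base 8")
```

That gives a celebration every 8 years, which is rather more generous than the powers of 2.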

LANCE: One of us should blog about Don Knuth turning 80.

BILL: How about both of us, sep posts?

Lance had his post here. Now it's my turn.

Donald Knuth was the first person (roughly- history is not as definite as mathematics) to use mathematics to analyze algorithms. This may seem like an obvious thing to do but that's easy to say in retrospect. And he never lost focus: He may have had to use some hard math to analyze algorithms but his goal always was to analyze algorithms. He didn't get fascinated with some obscure aspect of math like, say, surreal numbers, and go write a book on it. Oh wait, he did (see here). But for the most part, everything he did was aimed at faster algorithms and faster typesetting.

In an early draft of a book review of Companion to the Papers of Donald Knuth I wrote at the end of it

 Donald Knuth has been called the father of Theoretical Computer Science and hence his thoughts are worth knowing

I usually email my reviews to the author so they can make sure I didn't say anything stupid or in some very rare cases have a rebuttal. In Knuth's case I postal mailed it (ask your grandparents what that is) since he does not use email. He mailed back the hardcopy with some corrections in pencil. One of them was to cross out

father of theoretical computer science

and replace it with

father of algorithmic analysis.

I made the correction. That is how he views himself and it would be odd to argue the point.

As a tribute to him I have gathered up all of the book reviews  in my column of books by him (between 10 and 12 depending on how you count) and books clearly inspired by him (1 book- A=B) into one file which I point to  here

Wednesday, January 10, 2018

Donald Knuth Turns Eighty

We've kept this blog going long enough that we start repeating our celebrations. Ten years ago Bill celebrated Don Knuth's 70th Birthday and today Donald Knuth turns 80. While he celebrates in Piteå, Sweden with its less than 4.5 hours of daylight, we wish him a happy birthday from stateside.

Looking back in this blog, in 2015 I wrote about the history of the history of computing including this wonderful talk by Knuth, Let's Not Dumb Down the History of Computer Science.

Donald Knuth is known for many things: important algorithms, the brilliant book series The Art of Computer Programming, TeX, the many awards he's won and the award named after him. In my favorite story, back in the 70's when Knuth started working on Volume 4 (still in progress), he wanted a good name for those hard problems that Cook and Karp found. His candidates "Herculean", "Formidable" and "Arduous" didn't get much traction so he ran a contest which had some interesting suggestions before reluctantly going with the name "NP-complete" that came out of a group from MIT.

In his SIGACT News article that described the contest, Donald Knuth was so sure that these hard problems would remain hard that he put everything on the line, offering anyone who could prove P = NP a live turkey. May Don Knuth and his turkey continue to celebrate many more birthdays.

Sunday, January 07, 2018

A new largest prime found!

A new largest KNOWN prime has been discovered and it's 23 million digits long.

Nate Silver's website had an article about it (written by Oliver Roeder) here

An article about why people do this is  here

Lance posted about finding large primes in 2006 here

I'll just make some random comments

1) The prime is 2^77,232,917-1

2) The prime is not much bigger than the previous champion.

3) More generally, the graph (in Oliver Roeder's article) shows that from 1600 to about 1951 there was slow progress but since then there has been LOTS of progress. See the table in this article. I had wanted to say every year a new prime was found but, alas, not that simple a pattern. Even so, lots of new records.

4) I'll list the reasons given for why people do this and my comments on them.

a) Tradition! (Note to self- write a novelty song to the tune of Fiddler-on-the-roof's Tradition about why people work on various mathematical things)

b) For the by-product of the quest. This one does make sense and I am sure has driven and helped test out some research. Reminds me of the spinoffs from going to the moon (see here). Contrary to what I heard as a kid, the orange-powder-drink Tang was not a spinoff. But there were many others. Of course- would these spinoffs have happened anyway? Hard to know.

c) People collect rare and Beautiful items. I don't really see this one. Finding a large prime doesn't make it yours, it belongs to the ages! And I don't think people get their names on the prime either. The only prime that has someone's name on it is the famous Grothendieck prime which is 57. Oh well. There are sets of primes with people's names on them: Mersenne primes, Gaussian primes (which are a subset of the Gaussian integers so maybe shouldn't count), Eisenstein primes, and Sophie Germain primes. If you know of any other primes or sets of primes named after someone, leave a comment please.

d) For the glory! Maybe, but given how briefly people hold the record, fame is fleeting.

e) To test the hardware. This one I didn't know! I'll quote the article as to why primes are good for this:

Why are prime programs used this way? They are intensely CPU and bus bound. They are relatively short, give an easily checked answer (when run on a known prime they should output true after their billions of calculations). They can easily be run in the background while other "more important" tasks run, and they are usually easy to stop and restart.

f) To learn more about their distribution. The prime number theorem was conjectured from data. We have so many primes now that I wonder if a few more really help formulate conjectures.

g) For the money. The first person to get a ten-million digit prime gets $100,000. The first person to get a one billion digit prime gets $250,000. Wow! Except that the article must be a bit old since the $100,000 prize was claimed in 2009 (see here). Still, there's that one billion digit prize out there!

5) Mersenne primes are of the form 2^n-1. It is known that n must be prime for 2^n-1 to be prime (this is not hard: if n=ab then 2^a-1 divides 2^n-1). There are much faster primality testing algorithms for Mersenne numbers than for arbitrary numbers. But see next item.

6) While writing this blog post I looked up non-Mersenne primes. It seems like the largest known one is

10223*2^31172165 + 1 and was discovered in 2016.

But of more interest- there is no Wikipedia page on non-Mersenne primes, and there are some outdated pages that don't have the most recent information. As the kids say, it's not a thing.

8) I'll add one more reason why people work on this, but it's more of a tautology: People work on finding large primes because they can! By contrast, finding VDW numbers is hard and not likely to see much progress.

9) I think that the most recent advances have come from computing power and not number theory. (If this is incorrect let me know with a polite comment.)

10) Have there been spinoffs in either number theory OR computing power lately?

11) I wonder if there will come a point where progress gets hard again and the graph of largest known primes flattens out. I tend to think yes, but hard to say when.
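Two of the comments above are easy to check by machine. What follows is a minimal Python sketch, not anything from the articles: the function names are made up for illustration, and the fast test shown is the standard Lucas-Lehmer test, which is an assumption about what "much faster primality testing" refers to since the post does not name a test.

```python
import math

def mersenne_digits(p):
    """Number of decimal digits of 2**p - 1, without building the number."""
    # 2**p - 1 has as many digits as 2**p, since 2**p is never a power of 10.
    return math.floor(p * math.log10(2)) + 1

def is_small_prime(n):
    """Trial division; only meant for small exponents."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def lucas_lehmer(p):
    """For a prime exponent p, return True exactly when 2**p - 1 is prime."""
    if p == 2:
        return True  # 2**2 - 1 = 3; the recurrence below needs p odd
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# Digit count of the new record prime, 2^77,232,917-1.
digits = mersenne_digits(77232917)

# Exponents n below 130 for which 2^n-1 is prime. Only prime n need be
# tried, since a composite n always gives a composite 2^n-1.
exponents = [n for n in range(2, 130) if is_small_prime(n) and lucas_lehmer(n)]

print(digits)
print(exponents)
```

The digit count comes out a bit over 23.2 million, matching the "23 million digits" above. Note that n being prime is necessary but not sufficient: n=11 is prime, yet 2^11-1 = 2047 = 23*89.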

Friday, January 05, 2018

Which of these Math acronyms are well known?

The last time I taught Grad Ramsey Theory there were very good math grads and ugrads in it. They used some acronyms - some I knew, some I didn't know (but know now). I am sure some are well known and some are not. I don't know which is which. Here is the list and comments

WLOG- Without Loss of Generality. This one I know and it seems well known-- When Googled the first page is all this definition. (Maybe I shouldn't use the term "Googled"- I've heard that brand names don't like it when they become generic terms like "Googled". Kleenex, Photoshop, Xerox had this fate. There is a word for it- genericide.)

ITOT- It turns out that. An Applied Math Prof used this in a course I had in 1977. I have not seen it used since then. I don't even use it. 

BWOC- By Way of Contradiction. I thought this was better known. When I googled it I got to this page which tells me it stands for:

Big Wolf on Campus (TV Show)

By Weight of Cement (Oil and Gas industry)

Big Women on Campus (GOOD- the counterpart to BMOC)

By Way of  Contradiction (Given the other items on the list I'm surprised it made the cut)

Bob Wayne's Oil company

FTSOC- For the Sake of Contradiction. NOT, as Google told me, Fuck this Shit O'Clock (reminds me of when I use FML for Formula and the class tells me it means Fuck My Life)

WTS- This was a new one for me. It took a while to get it from context but it was Want to Show. Google gives Women in Transportation.  

NTS- Need to show. Also a radio station and the National Traffic System.

I think the only one of these that is standard is WLOG.  The rest I think could be useful. But I ask you- are any of these standard? Are there ones that are standard that I missed? Of course, the great thing about standards is that there are so many of them.

Wednesday, December 27, 2017

Complexity Year in Review 2017

Theorem of the year goes to the resolution of the dichotomy conjecture. I wrote about the conjecture in February and while the Feder et al. paper didn't hold up, two later papers seem to resolve the conjecture.

A Proof of CSP Dichotomy Conjecture by Dmitriy Zhuk

I checked with experts in the field and at least one of these papers and more likely both ought to be correct.

Runners up include two matching papers I posted about last month, Svensson and Tarnawski who give a quasi-NC algorithm for general graph matching and Anari and Vazirani who give an NC algorithm for matching on planar graphs. We also had the nice quasi-polynomial time algorithm for parity games by Calude, Jain, Khoussainov, Li and Stephan that I posted on last March.

In last year's review we said "2016 will go down as a watershed year for machine learning" yet somehow it paled against 2017 with breakthroughs in chess, poker, astronomy not to mention continuous advances in machine translation, autonomous vehicles and everything else. Maybe next year ML can write the 2018 year in review.

We had an awesome eclipse that reminded us of the wonders of the world and almost made me forget about US politics. Computing keeps growing and how do we find the resources to train people from pre-school through college and throughout their lives? How much should we worry about the dominance of a handful of computing companies?

2018 is just full of questions: What will the Internet look like post-net neutrality? How will the new tax code play out? Where will Amazon put HQ2? What will machine learning do next? What can quantum computers with 50 qubits accomplish? Will bitcoin move to $300K or 30 cents? And what great advances in complexity await us?

Thursday, December 21, 2017

Having Faith in Complexity

I believe P ≠ NP as much as anyone else. Nevertheless, should we worry about the trust we put in complexity?

You don't need the full power of P = NP to break cryptography. I don't worry about quantum computers breaking RSA and related protocols. It won't sneak up on us--when (or if) quantum computing gets anywhere close to factoring large numbers, we'll figure out a strategy to change our protocols and to protect the information we already have. However if someone comes up with an algorithm tomorrow that cracks AES, we'll have a crisis on our hands as AES is so widely used that the algorithm is embedded into computer chips. Perhaps we can mitigate the damage before the algorithm spreads or at least take our information off-line until we develop new solutions.

But what about blockchains, the technology that underlies cryptocurrencies such as Bitcoin? A blockchain consists of a series of transactions collected into a sequence of blocks, where each block consists of a hash of the previous block with the transactions themselves hashed and often encrypted with public-key cryptography. One would hope that breaking the cryptography would be caught quickly and we'd still have a legit record of transactions saved somewhere. The transactions themselves might be compromised especially if anonymity was built into the system.
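To make the "hash of the previous block" structure concrete, here is a toy Python sketch. It is not Bitcoin's actual block format: the function names, the JSON serialization and the choice of SHA-256 from Python's standard `hashlib` are all assumptions made for illustration, and there is no mining, signing or encryption.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block (an assumption)

def block_hash(prev_hash, transactions):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps({"prev": prev_hash, "txs": transactions}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def build_chain(batches):
    """Turn a list of transaction batches into a list of linked blocks."""
    chain, prev = [], GENESIS
    for txs in batches:
        h = block_hash(prev, txs)
        chain.append({"prev": prev, "txs": list(txs), "hash": h})
        prev = h
    return chain

def verify_chain(chain):
    """Check that every block links to its predecessor and matches its own hash."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["hash"] != block_hash(block["prev"], block["txs"]):
            return False
        prev = block["hash"]
    return True

chain = build_chain([["alice pays bob 5"], ["bob pays carol 2"], ["carol pays dave 1"]])
ok_before = verify_chain(chain)

# Tamper with an old transaction; the stored hashes no longer match.
chain[0]["txs"][0] = "alice pays bob 500"
ok_after = verify_chain(chain)

print(ok_before, ok_after)
```

Rewriting history undetected would mean finding a different batch of transactions with the same hash, and that is exactly the kind of thing we trust to be computationally hard.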

Bitcoin itself, as I write this, has a total market cap of over $250 billion based fully on cryptography. The cryptography will probably hold up; Bitcoin investors have more to worry about from bad implementations or the pop of a bubble of unrealistic expectations. But as we watch so many exciting advances in computing tackling challenges that we would never have expected to get solved, should we continue to build our economy on the hope that other advances won't happen?

Sunday, December 17, 2017

Monkey First!

The following story is not true nor has anyone claimed it's true, but it has a point:

A company gets a contract to do the following: train a monkey to sit on a 10-foot pedestal and recite some passages of Shakespeare. After a week they announce they have made progress! They invite their investors to see what progress they have made! They unveil a curtain and there is... a 10-foot pedestal.

This story was in an article about how Google does moonshots-- that is, high-risk, high-reward, innovative work. The article is here. (How the Atlantic makes money when they have stuff online is a mystery to me. Perhaps they do it in a very innovative way.) The point is that it's BAD to have tangible results (like the pedestal) that are not getting at the heart of the problem. So Google has various incentives to do the important stuff. Their slogan is MONKEY FIRST.

This also applies to our research.  The following sequence of events is common:

1) Prove some scattered results.

2) Pedestal or Monkey? You could write up what you have, polish it, write up some nice LaTeX macros to make the writing of the paper easier OR you could try to find the unifying principle that would be hard, and might not work, but if it works that would be, as the kids say, Jawesome (Jaw-dropping awesome). The sad answer is that which you do might depend on when the next conference deadline is.

More generally there is a tension between safe do-able research (Pedestal) and high-risk, high-reward research (Monkey). Is our incentive structure set up to encourage high-risk high-reward? The Tenure system is supposed to do it and it DOES in some cases, but not as much as it could since there are other factors (salary, promotion to full prof, grants).

Does the system encourage high-risk high-reward? Should it? Could we do better? What are your experiences? I have no answers (especially to the question of what are your experiences) so I welcome your comments.

Wednesday, December 13, 2017

Our AI future: The Good and the Ugly

I don’t directly work in machine learning but one cannot deny the progress it has made and the effect it has on society. Who would have thought even a few years ago that ML would have basically solved face and voice recognition and would translate nearly as well as humans?

The Neural Information Processing Systems conference held last week in Long Beach, California, sold out its 7500 registration slots in 12 days. NIPS, not long ago just another academic conference, has become a major machine learning job market where newly minted Ph.D.s earn north of $300,000 and top-ranked senior academics command multimillion-dollar, multiyear contracts.

AlphaZero, an offshoot of Google’s Go programs, learned chess given only the rules in just four hours (on 5000 tensor processing units) and easily beats the best human-designed chess programs. Check out this match against Stockfish.

It's part of a trend: machine learning often works better when humans just get out of the way.

The advances in machine learning and automation have a dark side. Earlier this week I attended the CRA Summit on Technology and Jobs, one of a series of meetings organized by Moshe Vardi on how AI and other computing technology will affect the future job market. When we talk about ethics in computer science we usually talk about freedom of information, privacy and fairness but this may be the biggest challenge of them all.

The most stark statistic: Contrary to what certain politicians may tell you, manufacturing output in the United States has never been higher, but manufacturing jobs have declined dramatically due to automation.

The changes have hit hardest for white middle-class less educated males. While this group usually doesn’t get much attention from academics, they have been hit hard, often taking less rewarding jobs or dropping out of the job market entirely. We're seeing many young people living with their parents and spending their days playing video games, and we see a spike in suicides and drug use. Drug overdose is now the leading cause of death of men under 50.

There are no easy solutions. Universal basic income won’t meet the psychological need a job fills in being a part of something bigger than oneself. In the end we'll need to rethink the educate-work-retire cycle towards more life-long learning and find rewarding jobs that go around automation. This all starts by having a government that recognizes these real challenges.