Monday, December 30, 2013

2013 Complexity Year in Review

The complexity result of the year goes to The Matching Polytope has Exponential Extension Complexity by Thomas Rothvoss. Last year's paper of the year showed that the Traveling Salesman Problem cannot have a subexponential-size linear program formulation. If one could show that every problem in P has a short polynomial-size LP formulation then we would have a separation of P and NP. Rothvoss' paper shoots down that approach by giving an exponential lower bound for the polynomial-time computable matching problem. This story is reminiscent of the exponential monotone circuit lower bounds first for clique then matching in the 1980's.

If you expand to all of mathematics, one cannot ignore Yitang Zhang's work showing the liminf of the difference between consecutive primes is a constant. Dick and Ken have other great results for the year.

A big year for theoretical computer science. Silvio Micali and Shafi Goldwasser received the ACM Turing Award. The P v NP problem makes a prominent appearance on a major US television series. We are seeing the rise of currencies based on complexity. Large-scale algorithms, cryptography and privacy play center stage in the Snowden revelations on the National Security Agency.

Generally good news on the funding and jobs front in the US. After a year of sequestration and a government shutdown, looks like some stability for science funding now that congress has actually passed a budget. Plenty of theorists got academic jobs last spring and given the number of ads, this year's CS job market should be quite robust as well.

A year for books. Of course my own Golden Ticket as well as Scott Aaronson's Democritus and Tom Cormen's Algorithms Unlocked.

An odd year for the blog in 2013 without a single obituary post. Nevertheless let us remember 4-Colorer Kenneth Appel, Georgia Tech Software Engineering Professor Mary Jean Harrold and Wenqi Huang who led the Institute of Theoretical Computer Sciences at the Huazhong University of Science and Technology. In 2013 we also said goodbye to Alta Vista, IntradeGoogle Reader and the CS GRE.

In 2014 we'll have the next installment of My Favorite Ten Complexity Theorems of the Past Decade and the centenaries of George Dantzig and Martin Gardner. Enjoy New Years and keep reading.

Thursday, December 26, 2013

To Write, or Not to Write, That Is the Question

Guest post by Vijay Vazirani

Our field has been blessed with some great books: classics such as Knuth's volumes and Garey & Johnson literally got the field going, and the books of Lovász brought much clarity in difficult, important areas. The new millennium brought a plethora of new books by famous TCS researchers including Arora, Barack, Goldreich, Kleinberg, Nisan and Tardos. With an astonishing 33 short titles in theory, Now Publishers has virtually created an assembly line for producing books.

Even so, when faced with the prospect of writing a new book, I am filled with trepidation and self-doubt: Am I up to the effort needed? Do I have a genuinely new point of view to expound? If so, have I figured out the "right'' format and style for expounding it in the simplest and clearest manner?

The issue arose when I decided to co-teach, with my able postdoc Ruta Mehta, the course Equilibrium Computation in Spring. With her characteristic enthusiasm, Ruta has practically forced me into a long-term commitment of writing a book on this topic. However, knowing what it takes, so far I have managed to resist an unequivocal "yes".

Most people would say it is a losing proposition -- the gains are typically minuscule compared to the toils. So why do so many people write books? I can imagine several reasons, but best to leave this discussion to my esteemed colleagues who are probably yearning for something to ponder on in this holiday season ...

That brings me to another question, "Are some of the books written in this millennium eventually going to be regarded as classics in the style of Knuth or Lovász's books?'' One could look at citation counts or average citations per year to get an "objective'' reading of the situation. However, when one is talking about true class, such crude estimators are clearly not the way to go.

Your thoughts are eagerly sought ...

Monday, December 23, 2013

Our Journal is 99 44/100 percent pure!!

Sometimes we are asked to evaluate how good a journal or conference  formally(Excellent, Very Good, Good, Fair, Better-than-being-poked-by-a-stick,pass the stick). Sometimes we talk about these things informally (ICALP is the European STOC! SODA is as hard to get into as FOCS!, CCC is a topTier topics-conference!)

But is there a way to judge these things formally? And should there be?Be aware of Goodhart's Law:

When a measure becomes a target, it ceases to be a good measure.
 One way to measure how good a journal or conference is  impact factor which is based on number-of-citations. Does this work well? Some issues:
  1. Even with an honest effort these things are hard to do well. People may cite the conference version, the journal version, the arXiv version, or just give a website.
  2. Even with an honest effort just a few articles can skew the results.
  3. Its hard to compare cross-fields.  I suspect that pure math journals have lower citations rates than biology journals.
  4. If an author cites himself, should that count?
  5. The above points were all about HONEST efforts. Could a journal do things to boost its impact factor and then brag about how high their impact factor is? Would they? Alas yes, see this article and this article.
So what are we left with? We could just go on our gut, but that favors journals that used to be good and aren't any longer. But the problem is deeper than all that--- can't we just say Thats a good article  and not have to proof it by saying where it was published? Alas no- all fields have gotten specialized that we must use these proxies to inform us.

 I COULD rant that we should all be well rounded enough to read outside of our field, but I know how hard that is.

I COULD rant about the dishonesty pointed out in the above links, but that's only part of the problem.

I COULD say What do you think? so I will.

Thursday, December 19, 2013

Security Changes

A couple of policy changes recently, one that supposedly enhances privacy and another that could reduce it.

Google has been implementing perfect forward secrecy since 2011 and other major Internet players, such as Facebook and Twitter, have started using perfect forward secrecy in the wake of the Snowden revelations that the NSA has been collecting Internet traffic to these companies.

So what is perfect forward secrecy? Not an easy question to find the answer to on the Internet. The wikipedia article says little. So I asked a couple of our security folks in the department.

The rough idea: We want to communicate several rounds of messages but if the current keys are compromised they can't be used to decrypt earlier messages. A couple of immediate thoughts: This isn't "perfect", you can still discover the earlier messages by breaking the encryption (say if P = NP). Also this isn't that exciting a problem from a theoretical perspective, you can just use a standard public-key protocol and start with fresh private and public keys each round and deleting the old ones. But that isn't very efficient.

One approach to PFS: Have a standard public/private key scheme to set up a session key (used in an AES or similar private key protocol) then run separate Diffie-Hellman schemes for each message. In RSA if you have the factors for N you can decrypt, where in Diffie-Hellman you can keep the same group without compromising security.

Chris Peikert calls this a poor-man's perfect forward security and there are better schemes though a bit more complicated.

On a different front, Google recently announced that images by default would be displayed in gmail messages. The images would not come directly from the sender, which could contain malware that avoids Google's filters, but rather from Google's servers after being downloaded and cleansed by Google.

Downloading an image often tells the sender that the image was read, typically with some id encoded in the filename. So once again we give up privacy for convenience. At least Google gives us the option to turn off the automated displaying.

Monday, December 16, 2013

Analogs between Quantum Computing and Parallelism

(Jon Katz wanted me to mention this:  A wise man once noted that there are fewer quantum algorithms than thereare quantum-algorithms textbooks! But there is still a lot of interest inquantum computation from cademia,government, industry, and the broader public. Univ of MD and NIST have recently formed a
center devoted to quantum computing and involving faculty and researchers from both physics and computer science communities. As part of this they are advertising a posdoctoral fellowship.)

A long time ago people in theory did  a lot of work on parallel computing before
there were many parallel computers build. Today people are doing a lot of work on quantum computing before quantum computers are build. What is similar and different here?

  1.  When parallel computers were actually built they were not like PRAM's. Many of the models assumed shared memory which wasn't true. Even so, did the work on PRAMS and other models help the practioners? Directly? Indirectly?
  2. Are the models of quantum computing studied now helping the practioners? Is the development of quantum computers at too early a stage to even ask this question?
  3. Even if no quantum computers are ever built the study has been a sucess since some classical problems have been solved using quantum techniques (and I think this will happen more and more). And some interesting math has come out of it. And physicists and others have learned more about quantum mechanics from quantum computing. Could the same thing have been said for parallelism- even if parallel computers had not been built would the study of them have still be useful?
  4. One big difference- many problems can be parallelized in some form and solved that way (and some cannot be solved any other way). A wise man named Jon Katz referred above to a wise man named Ronald de Wolf  who wrote, in a review of 3 books on quanum computing:

 A quick survey on shows that the number of books on quantum computing (at least 20) is more than 10 times as high as the number of quantum algorithms (2: Shor's and Grover's). (Footnote: Note that this review was written in 2003 so this statement is no longer true.)

While  I think he meant there are more quantum algorithms (quantum random walks, quantum simulations, quantum selection-type problems?, quantum number-theory-type-problems?) now than in 2003, I will note that there are also more books on quantum computing now than then- on a cursory look at amazon I think I counted 30, but with the uncertainly principle, its hard to tell.  The point is, aside from factoring and of course quantum simulation I wonder if when  quantum computers, if they are build, will be able to do much more
BUT SEE NEXT PARAGRAPH. (Every year my students are surprised to find out that quantum computers probably CANNOT solve SAT in poly time.)

ADDED LATER: Comment 6 has a pointer to a survey of MANY quantum algorithms for MANY algebraic problems, and also a pointer to a more recent article on quantum algorithms. I will be delighted if the number of quantum algorithms now exceeds the number of books on quantum computing.

Thursday, December 12, 2013

Approximate Computing

Hadi Esmaeilzadeh is the newest professor in the School of Computer Science at Georgia Tech. Hadi works in computer architecture and did some great work on dark silicon. (Just noticed this is starting to look like a Lipton-style blog post.)

I had Hadi give a lecture in my theory class (the dreaded day-before-Thanksgiving lecture). Hadi talked about his new research directions in approximate computing. Approximate computing is a new paradigm in the architecture community, doesn't even have its own wikipedia entry yet.

When you do arithmetic operations, say addition and multiplication on real numbers, you typically want full precision up to the limits of the bits we use to store those numbers. Suppose you allow some error, say 5%. For logical operations, this would be a disaster giving a very wrong answer. Running various optimization algorithms, like simplex, these error might compound leading to very suboptimal results.

But there are several scenarios where approximate computing might not hurt that much. Processing media, like pictures, sound and video, are not exact anyway and a small error might not degrade the quality successfully. Statistical methods that sample a large space, such as when we analyze big data, still could yield reasonable results using approximate computing.

Why do approximate computing? Because of the architecture--approximate computing can be done often faster and with less power consumption. The tradeoffs may allow approximate computing to handle tasks beyond what we can do with traditional computing.

Approximate computing needs good theoretical models and that's where our community can come it. What's the right way to model the power-speed-accuracy tradeoffs and how can we determine the right computational problems that can take advantage of these tradeoffs. Might have nice connections to learning theory and property testing.

Monday, December 09, 2013

Inventions are Non-Commutative: Amazon vs Amazon

(Tal Rabin, Shubhangi Saraf and Lisa Zhang asked me to remind you to publicize this: the bi-annual Women in theory (WIT workshop), NYC, May 28-30, 2014. Apps due Jan 20, 2014. Go here for all relevant information)

If FAX machines had come out 20 years earlier they would have had far MORE impact.
If FAX machines had come out 20 years later they would have had NO impact since by then
we all had email and scanners and what not. So when an invention comes out matters.

Ask your grandparents what white-out was for corrections on a typewriter (while you are at it ask your grandparents what a typewriter is). If it came out 20 years later then Bette Nesmith  would have not made 50 million dollars, her son Michael Nesmith would not have had the spare time to become a musician and there might not be a group  called THE MONKEES. Gee, we would have no LAST TRAIN TO CLARKSVILLE. But I digress (from what?).

If Lasik  eye surgery came first and glasses later, glasses may have been seen as a great way to avoid surgery!

Amazon is working on a technology that may become obsolete before it really takes off. I am referring to the Amazon Drone Project. The idea is that when your order an item from Amazon a drone will get it to your house (or perhaps wherever you are) VERY FAST - maybe within 30 minutes. This works well for objects under 5 pounds. Say like what Amazon is best know for BOOKS.

But there is an a technology out there that may kill this idea before it gets off the ground. There is a competing company called Amazon which has already developed ways for people to get books on what they call a Kindle, which is like emailing them a book. So there is NO Physical object to be delivered.
Hence the Amazon Drone project may, like the 8-track tape (ask your great grandparents) become obsolete due to technology being developed by Amazon.

I wonder- had Amazon Drone come out 20 years ago would it have had more impact? Would it have made less of a demand for Amazon Kindle?

Amazon Drone may still have an impact by delivering other things. But I wonder how long it will take before 3-d printers make MOST things that Amazon delivers not need to be physical objects.

Thursday, December 05, 2013

Bitcoins Revisited

Two years ago I gave a lecture and posted about bitcoin. Of course what I didn't do was buy a bitcoin whose value back then was about $3 and today runs in the $1000 range.

Bitcoins have received quite a bit of press, particularly with the FBI shutting down Silk Road, the drug trafficking site which used bitcoins for their transactions. Then people realized that bitcoins are starting to become a real currency, with a market cap of about US$11 billion not far now from the money supply of the Costa Rica Colones (US$13 billion). Now governments are deciding on how to deal with bitcoins as a currency, one which they really can't regulate or control.

The Economist has one of the better articles on Bitcoins, talking about some of the technical issues involved.
The Bitcoin system is designed to cope with the fact that improvements in computer hardware make it cheaper and faster to perform the mathematical operations, known as hashes, involved in mining. Every 2,016 blocks, or roughly every two weeks, the system calculates how long it would take for blocks to be created at precisely 10-minute intervals, and resets a difficulty factor in the calculation accordingly. As equipment gets faster, in short, mining gets harder. But faster equipment is constantly coming online, reducing the potential rewards for other miners unless they, too, buy more kit. Miners have formed groups that pool processing power and parcel out the ensuing rewards. Once done with ordinary computers, mining shifted to graphics-processing units, which can perform some calculations more efficiently. Miners then moved on to flexible chips that can be configured for particular tasks, called field-programmable gate arrays. In the past year, bespoke chips called ASICs (application-specific integrated circuits) have appeared on the scene.
Then there was the paper Dorit Ron and Adi Shamir wrote that explored the bitcoin transaction graph and suggested a (now supposedly debunked) connection between the mysterious creator of bitcoins, "Santoshi Nakamoto", and Ross William Ulbricht aka Dread Pirate Roberts, the founder of Silk Road.

Bitcoins even make it to my Thanksgiving table. My brother thought they were a scam even though I pointed out the systems has no scammers. He remains unconvinced though he invests heavily in gold, which has the same property of the value mostly being there because people believe it has value.

I taught a lecture on bitcoins again in my intro theory course. We all generally agreed that a few years from now we'll all remember the days when bitcoins were worth $1000. Not sure we'll remember those days because bitcoins will be worth millions or because they'll be worth pennies.

Monday, December 02, 2013

Global Warming and the Axiom of Choice

Who was the first scientist to warn of Global Warning? These questions are complicated, but I would say it was Bing Crosby in a paper called White Christmas. Here are the first two lines:
I'm dreaming of a white christmas
Just like the ones I used to know
Why are there no more white christmas's? Because global warming made it stop snowing!

Why do otherwise intelligent people refuse to believe that Global Warming is real and is caused by humans and we we need to do something about it? I have a conjecture and an analog. Here is what I think the reasoning is

  1. Republicans have the following AXIOM (until they are in office): government IS the problem, not the solution. More than this, they think that there is NO problem that requires government action.
  2. Consequence: Government should do NOTHING about Global Warming.
  3. Since Government shouldn't do anything about Global Warming, it is not a problem.
Rather than rethink their AXIOM they accept the conclusion that Global Warming is either not a problem or not caused by humans. The shame of it is that there ARE economically viable ways, perhaps moderate republican ways, to fight global warming- some version of Cap-and-trade, or pay-to-pollute. And getting off of Fossil Fuels would be good for other reasons. I can picture history going a different way so that Republicans want more fuel-eff cars to get us off of Mideast Oil. I can picture a history where the insurance companies are more powerful than the oil companies for lobbying and hence Government takes LOTS of action against global warming.

Are their things in math where people accept an axiom despite its absurd consequences?Yes:
  1. Most math people believe the Axiom of Choice.
  2. Consequence: the Banach-Tarski Paradox

Rather than rethink their AXIOM they accept the absurd conclusion that you can break a ball into 5 pieces, reassemble, and get twice the volume. Fortunately, believing this does not endanger the planet.

Monday, November 25, 2013

The Institute for proving Graph Isomorphism is in P

(This post was inspired by Adam Winklers awesome book
Gunfight: The Battle over the Right to Bear Arms in America.
Disclaimers one: Adam Winkler is my cousin and I got a free copy.
Question: Should I give him a free copy of my VDW book when it comes out?
Disclaimer two: Scott did a post on a related matter here.)

If someone started an Institute to prove Graph Isomorphism is in P that would
be very odd since it could be that GI is not in P.
If someone started an Institute to study Graph Isomorphism that would be
much less odd (though still somewhat odd).

Does it make sense to have an openly biased think tank?

  1. If a pro-gun-control person writes a book that proves that there weren't that many
    guns in America in the early 1800's would you believe it?
  2. If an anti-gun-control person writes a book claming that the more guns there are
    the less crime there is, would you believe it?
  3. The CATO Institute: A Libertarian Think Tank.
    If they did an honest study of gun control and concluded that it does reduce
    crime then would they publish it? I honestly do not know.
    If they did an honest study of gun control and concluded that it increases
    crime then would anyone believe it? Being openly biased might undermine their credibility.
  4. The Tobacco Institute (they no longer exist). They produced reports
    claiming that smoking was not unhealthy (or perhaps that the evidence is incomplete).
    They were employed by the Tobacco industry. Did they ever have any credibility?
    Did they do any unbiased science, perhaps on non-smoking issues?
    I honestly don't know.
It is tempting to say Scientists should not have an opinion before they do a study. But this is clearly not correct in theory or practice. Scientists do indeed have an opinion, even an interest, in what a study will tell. Why is that different from the Tobacco institute?
  1. An honest scientist's preconceived notions are hopefully also based on science and not on who is paying him and not on other non-science factors.
  2. An honest scientist, when faced with evidence that they are wrong, will hopefully pursue that evidence and perhaps change their mind. This might be easier in math than in science since Proof is our accepted criteria. For example, I doubt there are diehards who still think that NL ≠ coNL.

Tuesday, November 19, 2013

The New Patrons

A few centuries ago if you wanted to do science and not independently wealthy you needed help.
Most of the important astronomers and natural philosophers (as well as artists) in the 16th and 17th centuries depended on the patronage of powerful religious or political figures to fund their work. Patronage networks extended all the way from Emperors and Popes to regional nobles to artisans to peasants; even university positions were based to some extent on patronage. Scholarly careers in this period were driven by patronage, often starting in undistinguished universities or local schools or courts, and traveling closer or farther from centers of power as their fortunes rose and fell.
Today most scientists have salaried positions at universities and get funded by the government but with sequestration and budget cuts, scientists have to seek out other sources, such as industrial funds. We've long had various scholarships endowed by private donors: Sloan, Packard, MacArthur. Recently though we've seen some new patrons, the upper 1%, who want to help out where other funds are limited. Some of these work through endowed positions at universities, but we also see some who create foundations dedicated to funding directed at research.

In the past few months I came face-to-face, or at least in the same room, as two of them: Landon Clay in Oxford for the opening of the new Maths Institute partially funded by his foundation and Jim Simons, when I visited Stony Brook and had lunch in the Simons Center for Geometry and Physics. The Clay Mathematics Institute funds several mathematicians and offers the million dollar bounty on P v NP and other open questions. The Simons Foundation supports a few theoretical computer scientists, not to mention the Simons Institute in Berkeley.

Of course the more money coming into our field, the more research we can do. But patronage does have its other side.
Patronage, and the desire for more, also shaped the work and publications of scientists. Effusive dedications to current or potential patrons can be found in almost every scholarly publication, while the interests of a patron in a specific topic was a strong incentive to pursue said topic—or reframe one's work in terms of it. Galileo, for example, first presented the telescope as a naval instrument to military- and commerce-focused Republic of Venice; when he sought the more prestigious patronage of the Medici court in Florence, he instead promoted the astronomical potential of the device (by naming the moons of Jupiter after the Medicis).
 How much do the lessons of the 16-17th centuries still apply today?

Thursday, November 14, 2013

Local Reductions

With the STOC deadline passing on Monday, now is a good time to look at the arXiv to see what has been posted since then. Hamid Jahanjou, Eric Miles and Emanuele Viola have a new paper, Local Reductions, that gives a new reduction from NTIME(t) to 3-SAT formulas of size t polylog(t). The twist to their new reduction: there is an NC0 circuit C that maps the number i to the ith clause. NC0 means every output bit depends on only a constant number of input bits. The proof uses old-fashioned parallel routing.

Should have some interesting applications. It does save a step in Williams' proof that ACC0 ≠ NEXP but the combined proofs are longer.

In other news, I've been getting several email from other CS chairs looking for students to hire as faculty in their departments. The latest CRA News is 59 pages, 50 of them are faculty job ads. It's a good year to be on the job market.

Tuesday, November 12, 2013

Four answers to the Recip problem

In my last post I asked you to solve the following question which
was from the Maryland Math Competition:

The inequalities 1/2 + 1/3 + 1/6 = 1 and 1/2 + 1/3 + 1/7 + 1/42 = 1
express 1 as a sum of three (resp. four) reciprocals.

Find five positive integers a,b,c,d,e such that
1/a + 1/b + 1/c + 1/d + 1/e = 1.

Prove that for any positive integer k GE 3 there exists positive intgers numbers d1,d2,...,dk
such that 1/d1 + ... + 1/dk.

The HS students had the following solutions.
I list the answers to part b first.  I sketch the proofs. They are all by induction.

1) Use 1/n = 1/(n+1) + 1/n(n+1).  This was the most common solution.  This leads to (2,3,7,43,1806) for part a.

2) Since the question itself gives the solution for m=2 and 3 we only need P(k) --> P(k+2)
Use 1/n =  1/2n + 1/3n + 1/6n.  This leads to (2,3,12,18,36).
One of the students later told me that knew the solution (i) but did it this way to
avoid having to multiply 42 by 43 which is needed to get part a using that solution.

3) Inductively that the largest denom n is even Use 1/n = 3/3n = 1/3n + 2/3n = 1/3n + 1/(3n/2)
Less than five students did 2b this way.  This leads to (2,3,7,63,126) for 2a.

4) If (d1,...,dn) is a solution then so is (2,2xd1,...,2xdn).
Only two student did it this way.  It leads to (2,4,6,14,84), which they both used.

NOBODY did in the non-inductive way mentioned in the last post.

There were THIRTY TWO solutions to 2b.  Several people had their part 2a and 2b not
related to each other at all.  This was far more solutions than I anticipated.
While grading I got good at adding reciprocals.
I list them in lex order along with how many people did that answer.
(This is likely approx- I may have miscounted a bit, but its basically right)

(2,3,7,43,1806) - 91 (linked to solution 1 above)

(2,3,7,48,336)  - 3

(2,3,7,56,168)  - 1

(2,3,7,63,126)  - 6 (linked to solution 3 above)

(2,3,7,70,105)  - 1

(2,3,8,25,600)  - 1

(2,3,8,30,120)  - 1

(2,3,8,32,96)   - 6

(2,3,8,36,72)   - 5

(2,3,8,42,56)   - 11

(2,3,9,21,126)  - 2

(2,3,9,24,72)   - 4

(2,3,9,27,54)   - 3

(2,3,10,20,60)  - 5

(2,3,11,22,33)  - 1

(2,3,12,15,60)  - 1

(2,3,12,16,48)  - 1

(2,3,12,14,84)  - 2 (linked to solution 4 above)

(2,3,12,18,36)  - 12 (linked to solution 2 above)

(2,4,5,25,100)  - 3

(2,4,5,30,60)   - 1

(2,4,6,14,84)   - 3

(2,4,6,16,48)   - 1

(2,4,6,18,36)   - 2

(2,4,6,20,30)   - 1

(2,4,7,12,42)   - 4

(2,4,7,14,28)   - 2

(2,4,8,12,24)   - 6

(2,4,8,10,40)   - 2

(2,5,6,10,30)   - 1

(2,5,6,12,20)   - 2

(3,4,5,6,20)    - 3

Monday, November 11, 2013

A problem on Reciprocals

(I thought I had posted this a while back but I can't find it in past blogs
so I think I did not. I DID post a diff problem on reciprocals.)

Here is the question I graded a while back on a  Maryland Math Olympiad.
I request that you do it and post your answer as a comment- I'll be curious
how your answers compare to the students who took it.
I will post the solutions the students used in my next post and comments
on how they were similar or different than yours.
The students had two hours to do five problems.
This was problem 2.

The equalities 1/2 + 1/3 + 1/6 = 1 and 1/2 + 1/3 + 1/7 + 1/42 = 1
express 1 as a sum of three (resp. four) reciprocals.

PART A: Find five distinct positive integers a,b,c,d,e  such that

       1/a + 1/b + 1/c + 1/d + 1/e = 1.

PART B: Prove that for any positive integer k  GE 3 there exists k distinct positive intgers numbers d1,...,dk such that

1/d1 + 1/d2 + ... + 1/dk = 1.

Thursday, November 07, 2013

A Theorist Goes to SOSP

Monday I attended the 24th Symposium on Operating Systems Principles, the lead conference for computer systems research. Why would a nice theorist go to SOSP? Trying to recruit a few good systems faculty for Georgia Tech.

I really enjoyed the day in ways I didn't expect. I found several of the talks interesting, even from a theory perspective. Austin Clements, in the first and one of the best paper talks, said he had a theorem and proof (roughly if operations scale there is an implementation that scales well on multicores), though purposely left the formalization and proof out of the talk and focused on implementations. Kay Ousterhout built on some theoretical tools for job scheduling. In a talk after I left, a group from Texas takes a step towards practical proof-based verifiable computing. I never expected to be cited in a SOSP paper.

When I go to a theory conference I see so many people I know that I don't spend enough time meeting new people. At SOSP, I knew a handful of people and just had a great time talking to people I haven't met before, particularly students.

Only thirty papers get presented in single track in this conference held every two years. STOC/FOCS accepts over 300 papers in the same time period. Having an SOSP paper is a really big deal. Despite having only thirty talks and traditionally held in hard-to-reach places (this year an hour and a half drive from Pittsburgh), there were 628 attendees split 42% students, 42% non-student academics, 15% industry and one member of the press.

The 2013 SOSP is the first ACM conference will fully open proceedings and the authors retained full rights to their paper, the gold standard espoused by many in our community. It didn't come cheap, the conference put up $1100/paper to the ACM to pay for the privilege.

Tuesday, November 05, 2013

My Pope Number is 2: The Smaller World Hypothesis

I proofread Piergiogrio Odilfreddi's book (which is on Lance's List of Favorite Complexity Books) for which I got a generous acknowledgment. I have also
visited him in Italy, though not for a while. 

Benedict.Pope Emeritus (I think that's what he is still called) broke his silence with a letter to Odilfreddi, see here.

Hence I am two handshakes away from Pope Benedict.  It used to be said that there were Six degrees of separation-- for all people a,b there is a path of length at most 6 that links them. The graph varies with you you ask, but it tries to pin down that a and b know each other.

Is six now too big? One measure is how many Google hits
`X degrees of separation' gets
  • Six degrees gets 1,760,000 hits
  • Five degrees gets 97,300 hits
  • Four degrees gets 159,000 hits
  • Three degrees gets 605,000 hits
  • Two degrees gets 843,000 hits
The last one may not be quite fair- there was an episode of Pokemon
with the title `Two degrees of Separation' and also a company with that name.

How well two people know each other has to be defined carefully.

  1. Erdos Numbers- Put an edge between a and b if they have a paper together.
  2. Bacon Numbers- Put an edge between a and b if they appear in the same movie.
  3. Handshake Numbers (I am not sure its every been called that)- Put an edge between a and b if they have shaken hands.
  4. knows-number (likely not defined). Put a DIRECTED edge from a to b if a will return b's phone calls and/or email.
  5. Twitter Numbers (Not sure if its ever been defined). But a directed edge between a and b if a follows b on twitter.
Odilfreddi may be an articulation point in the handshake graph or the knows-graph since he is in math AND known to the public (at least in Italy) as an outspoken atheist, so he connects two worlds. Another articulation point might be David Seetapun who has a PhD in computability theory (he worked on Recursive Ramsey Theory which is how I know of him), Finance (Goldman Sacks), Gambling in Las Vegas, and swordfish fishing (he won the Golden Fly Tarpon Tournament). He may be the key to connecting mathematicians to fisherman.

The following is probably known but I couldn't find it- what is the longest distance between two websites (number-of-links to go from one to the other)?
The average? Are these numbers getting larger or smaller?

ADDED LATER: Christian Sommer emailed me the following two

Diameter of the web and

Tools to study the web graph

The first link claims the avg diameter of the web is 19.

Friday, November 01, 2013

Andrzej Mostowski (1913-1975)

Andrzej Mostowski was born 100 years ago today. While Mostowski worked in many areas of logic, including early fundamental work on model theory, for our readers he's best known for co-discovering the arithmetic hierarchy, sometimes called the Kleene-Mostowski hierarchy.

The arithmetic hierarchy has a few different equivalent definitions but let's use one based on computability. We define inductively there hierarchies, Σi0, Πi0 and Δi0. Σ000000 are the computable sets and
  1. Δi+10 are the sets computable with a Σi0 oracle.
  2. Σi+10 are the sets computably enumerable with a Σi0 oracle.
  3. Πi0 = co-Σi0.
In particular, Δ10 are the computable sets and Σ10 are the computably enumerable sets. The halting problem is Σ10-complete under computable reductions, the set of Turing machines that accepting infinite sets are Π20-complete.

We completely know the structure of the arithmetic hierarchy, for i > 0, Σi0 ≠ Πi0 and for i ≥ 0, Δi0 = Σi+10 ∩ Πi+10.

The arithmetic hierarchy inspired the polynomial-time hierarchy in complexity theory. Unlike the arithmetic hierarchy, separations in the polynomial-time hierarchy remain open and any separation implies P ≠ NP. While we have relativized worlds which do quite a few different separations and collapses in the polynomial-time hierarchy the following remains open: Does there exist a relativized world where the polynomial-time hierarchy looks like the arithmetic hierarchy, i.e., for i > 0, Σip ≠ Πip and for i ≥ 0, Δip = Σi+1p ∩ Πi+1p?

Monday, October 28, 2013

University of Maryland Job Posting Mentions Quantum Computing explicitly!

The University of Maryland at College Park has its job posting up (its been up for a while). You can look at it here. I It lists THREE areas but says that they will take applicants from any area. This is believable since they only listed three. Had they listed (say) seven then I would not believe they are looking at other areas. What is the X such that if they list X then you believe they will take from other areas but if you list X+1 then you don't?

The three areas listed are:

  1. Cybersecurity
  2. Quantum Computing
  3. Natural Lang. Proc.
All three of these seem more particular than I usually see in job postings. That is, I've seen things like  Systems, Theory, AI. SO- is this unusual? I don't quite know--- I haven't been on the market for a long time.

Thursday, October 24, 2013

Science and Humanities

David Hollinger, a historian, wrote a recent Chronicle Review article The Wedge Driving Academe's Two Families Apart: Can STEM and the human sciences get along?, one of a number of articles I see talking about the connections between science and humanities and the future of humanities at universities.

Most scientists do find great value in the humanities and I would hope vice-versa. But when funds get tight, different fields talk about their relative importance--it happens between science and humanities broadly, it happens between theory and systems in CS departments with limited slots to hire.

I feel badly for humanities these days. In a tight job market, students and parents think hard about doing a humanities major while universities are trying to find ways to cut costs. I don't have a solution--right now the job market calls for more computer scientists than English majors, but I would hate to see an intellectual core of our academic world shrink away.

Humanities are cheap. A provost once said to me it costs the same to hire five philosophers as one physicist once start-up costs and salary are considered. We should find a way to keep funding the humanities while maintaining the strengths across all fields.

Pushing the bounds of human wisdom is important, whether it be in chemistry or classics. Only when we push in all directions does the ball of knowledge truly expand.

Monday, October 21, 2013

Teaching without a net

As a grad student I was teaching the linear-time Median finding algorithm and I FORGOT
that I needed to solve the more general problem of selection. After less than a minute
of trying to see what was wrong I told them
I am sure that Median IS in linear time. I will consult sources  and redo this tomorrow.
I then did the rest of the lecture (which didn't require knowing the Algorithm for Median) and the next day I did the linear Median Finding Algorithm correctly.

Note that I was teaching well known material. So I KNEW that what I was saying was true even if I couldn't  prove it. I also KNOW that I could look it up. I was TEACHING WITH A NET.

When I taught Grad Algs a few years ago I sometimes didn't quite know how the PROOF went  BUT I knew that the STATEMENTS I made were correct, and the algorithms and proofs were out there. In one case I emailed the original author with a subtle point I was stuck on. (It really was subtle- the author himself had to think about it). TEACHING WITH A NET

Last semester some of my Ramsey Theory course was taught WITHOUT A NET. Not in termsof the statements of theorems, but in my attempt to find easier proofs of theorems--- sometimes my alleged proof DID NOT WORK. And there was no book I could consult, nor person I could ask, to help me out on these new ``proofs''. One of my attempted simplifications (of the Canonical Ramsey Theory) DID NOT pan out in the end.

This semester  I am teaching an honors interdisplinary course on Fair Division (nicknamed 'Cake cutting'). I've pulled material from a  variety of different subfields (math, CS, AI. Yes AI!). So I have put some things together that are ``new''(not worth-publishing-new but new in some sense). Some of them have been wrong, or to be more fair, not quite right. But WHO CAN I ASK? Nobody! This is truely teaching WITHOUT A NET. I have made about 2 incorrect statements (both of which were prefaced with `this might not be quite right') but the bigger effect is that every day I wonder if what I am saying is correct.
The effect on the actual course is mininal-- but my mentality going in ``will I make a mistake today that I cannot recover from'' is... interesting.

What to do if you are wrong? Own up to it ASAP. Every minute you fumble around you lose the classes interest.

Is the course working? I think so-- they are learning and having fun. It helps that they are honors students who chose to take this course.

Wednesday, October 16, 2013

2013 Fall Jobs Post

Time again for the annual fall jobs post. As always the best places to look for academic CS positions are the job sites at the CRA and the ACM. Also check out the postdoc and other opportunities on the Theory Announcements site and the Intractability Center. It never hurts to check out the webpages of departments or to contact people to see if positions are available.

I encourage everyone who has a job to offer in theoretical computer science at any level to post links in the comments.

Faculty hiring has rebounded nicely and with computer science enrollments expanding, it should continue to be quite robust. Postdocs will still be down from a few years ago.

Good luck to everyone in the market. I look forward to seeing your names in the 2014 spring jobs post.

Monday, October 14, 2013

Who controls what is taught- the dept or the students?

There is a debate about the questions:

To what extent do we give them what they NEED?  what they WANT?

These questions permeate many other discussions of education.

Rather than discuss this profound issue I will discuss a fictional example.

  1. A dept offers one section of Operating Systems (henceforth OS) in the fall and one section in the spring.
  2. The same dept also offers one section of AI (henceforth AI) in the fall and one section in the spring.
  3. They notice after a few years that the OS tends to underfill and the AI course tends to overfill.
  4. Hence they switch to offering OS in the Spring only, and  AI is offered two in the fall and one in the spring.
  5. Over time more students take AI and less take OS. Some of this is interest but some is that AI is easier to fit into a schedule since its always offered and has two sections in the spring.
  6.  All of the teachers are excellent (remember this is fictional) so the quality of teaching is not the issue. The courses are of equal difficulty so this is not the issue. The courses have the same prerequisites so this is no the issue.
  7. The next hiring season they decide to hire someone in AI since they need the teaching help.
The department DID NOT  mean to send the message:
AI is more important than OS.
NOR  did they mean to send the message
 We will let the students decide what is important.
  But the department ended up sending both messages. What should the dept have done? For one they should DECIDE if this is okay with them--- is AI more important than OS? Or more directly, is it okay that students graduate without having a course in OS as  long as they've had a course in AI? They may decide YES- and that would be fine. If they decide NO they could restructure the requirements OR have the advisers give that advice OR just offer less sections of AI.

Does your department fall into this trap--- ending up giving  student's opinions more sway then you intend? 

Wednesday, October 09, 2013

Shut Down

The NSF core proposal in theoretical computer science, or Algorithmic Foundations as the NSF calls it, has three deadlines this academic year:
  • Medium proposals ($500k-$1.2m): October 15
  • Large proposals ($1.2m-$3m): November 19
  • Small proposals ($0-$500k): January 17
For the most part nearly every core program in computer science has the same deadlines, making it quite an interesting time in CS departments when most of the faculty are all submitting their proposals last minute in January.

Let's talk not about January but about October 15, next Tuesday. Good luck trying to download the proposal call for algorithmic foundations, or the NSF grant proposal guide, or submitting your proposal on Fastlane. All NSF links take you here, where you can read all about what is not happening at the NSF during the government shutdown. So what about October 15?
Once normal operations resume, NSF will issue guidance regarding any funding opportunities that have a deadline or target date that occurs during the government shutdown. Such information will be disseminated via a FastLane Advisory and other electronic methods.
In principle, the government could reopen for business Tuesday morning and the proposals would still be due Tuesday 5 PM. I would guess the deadline would be extended but there is nobody to ask, no one at the NSF to answer the phones and NSF employees are forbidden from responding to or even reading email. At least those that already have grants can keep spending their money, most importantly continuing to fund their students.

These are short term problems, the government will re-open at some point and the NSF will get back to business. But all discussions seem to lead to budget cutting and even just erasing the sequester of last year seems unlikely. The budget crises hasn't stopped Eric Cantor and Lamar Smith from continuing to trash some NSF grants.

Science is too important to be a pawn in politics. Investments in science have given incredible value back to America in terms of jobs and economic growth. Yet somehow science never gets mentioned as a tragedy in the Washington money battles.

Monday, October 07, 2013

P vs NP is Elementary? No-- P vs NP is ON Elementary

As I am sure you all know, the TV show Elementary  (Premise- Sherlock Homes in Modern Day NY. He emails and Texts!  Watson is a female! and...) had an episode that involved P vs NP in a big way. I think they would have been better off with a fictional problem (Bourbaki's conjecture in Recursive Algebraic Topology?) rather than a real problem that they could say rather odd things about.
  1. Sherlock Holmes doesn't know what the Mill. Prizes are. I thought most educated people did. Everyone I know knows about them. Could be the company I keep.
  2. The show indicates that `Solving P vs NP' means `showing P=NP' It never seems to dawn on them that maybe P is NOT NP.  
  3. The show  assumes that once P=NP is proven it will take a very short time to write a program to  use it.  If P=NP is true then I suspect taking the proof and making it work on real world problems would take several years.
  4. The show focuses on P=NP's implications for crypto. As Lance has pointed out in his book if P=NP then the benefits for society are GINORMOUS, and would dwarf the relatively minor problem of having to switch to private key  (I agree with Lance for the long term, but I think the short term would be chaotic for security).
  5. The show refers to seven Mill problems. While this is technically correct they really should mention that one of them (Poincare's conj.) was already solved. 
  6. They seem to think that algebraic geom would be used on P vs NP. If they were claiming it was being use to prove P NE NP then I would think of the Geometric Complexity Theory Program and be impressed. Since they were using it to work on P=NP I'm less impressed. If Alg Geom really is used to prove P=NP then I'll be impressed.
  7. How was the episode- I am a fan of the show in general, and this was a solid but not outstanding episode. I wonder if I knew less about P vs NP would I enjoy it more.
  8. They are talking about P vs NP on National TV! That's kind-of nice. Only danger is the overhype. If  P NE NP is shown and this has no real world applications then the public may be confused.  I suspect we won't have to worry about that for at least 300 years.

Thursday, October 03, 2013

Celebrating Maths in Oxford

This week I'm in Oxford for the opening of the new Andrew Wiles Mathematical Institute building and the Clay Research Conference including on workshop on New Insights on Computational Intractability.

The building is beautiful with two small towers, one each for pure and applied math, and a common room joining them. The downstairs, where the talks are being held, is separated from the tower by its own glass dome, so, I was told, that the noise of the students don't disturb the great thinkers above.

The Clay Math Institute, best known in our circles for the millenium prize problems, has moved to Oxford from Cambridge (Massachusetts) and will take up residence in this new building. Unlike Jim Simons, Landon Clay didn't have a particular math background but was looking for a purpose for a charitable foundation. He met Andrew Wiles and loved the story of his proof of Fermat's last theorem and Clay realized the need for basic math research. Besides the millenium problems, the institute sponsors a number of research fellows and the move to Oxford reflects a transition to funding American mathematicians to a more international base.

What about the new insights into intractability? Lots of great talks on connections to physics and economics, on proof complexity, information theory, algebraic and circuit complexity. On the other hand I watched some talks on other millenium prizes and while difficult to follow, it looks like they have measured progress towards resolving their problems. In complexity, we're still searching for that true path towards P ≠ NP.

Monday, September 30, 2013

Long Tails and Fat Heads

Sometimes words or phrases are used in MATH and then spread to the REAL WORLD. I have blogged about how the terms Prisoner's Dilemma has become a real-world-phrase here and speculated about the terms Venn Diagram, Zeno's Paradox, and n+1 here.

I recently came across a pair of words that are related--- one of them seems to be (like Prisoner's Dilemma) going from MATH to THE REAL WORLD. The other one is very odd in that I've seen it in the REAL WORLD but it SHOULD be in MATH.

Long Tail: A Probability distribution has a long tail if there are MANY items that have a SMALL but NON-ZERO prob of happening. This is a term in probability. However, I have seen it used in the REAL WORLD as in Amazon has a long-tail strategy meaning that they will sell LOTS of DIFFERENT things even if the number of people buying some of them is small (like this which is ranked 9,578,520- though I doubt they can be that precise). This article from the Atlantic Monthly points out that ESPN used to have a long tail strategy (e.g., showing Billiards and others sports that are not that popular, but ALOT of them) but then abandoned it for... see next definition. Note that the term Long Tail is used for both a type of Prob Dist and a marketing strategy related to it. How common a word is Long Tail? It gets 66,500,000 hits on Google. The first page has the definition above only. The 10th page had about half of the hits with the def above.

Fat Head: A strategy where you concentrate on just a few items. ESPN is doing that by covering just a few sports, but the most-watched ones (too bad, I was hoping they would cover my favorite sport, chess boxing). This SHOULD be a math term for a Prob Dist with just a few points of high prob. I asked my friends in the ML community and he assures me that NO its not a math term--- but it SHOULD be! How common a word is this? It gets 2,300,000 hits on Google. The first page seems to have NOT have ANY reference to the definition above.

SO- this COULD be a case where a term used in the REAL WORLD migrates to MATH with essentiallythe same meaning. This isn't that uncommon (the term Continuity comes to mind) but this timeI will have predicted it! Maybe I should do Machine Learning.

Saturday, September 28, 2013

Complexity and FOCS

The Conference on Computational Complexity Call for Papers is out, deadline November 27.

The deadline for early registration and hotel for the FOCS conference in Berkeley is October 4. Student travel support is available.

Thursday, September 26, 2013

Dealing with Death

Mary Jean Harrold, a professor of software engineering at Georgia Tech, passed away last week. Mary Jean was 67 and still quite active before the cancer struck.

Computer science is still a relatively young field and most of even the early computer scientists remain quite alive. So a death in the field, particularly a colleague, makes a mark because it typically is happening at a young age. I've lost five co-authors (Avner Magen, Steve Mahaney, Andrej Muchnik, Nick Reingold, Carl Smith) all well before their time. Each death is a stark reminder of what's important in life, what does the next theorem mean when life seems so short?

We're nearing a time in computer science that many of our ranks will die after living to a ripe old age. Those remembrances will be of a life well lived. But there will always be lives cut short. The best we can do is remember them and move on and continue to build on the research tradition they left behind.

Tuesday, September 24, 2013

Crystal Math- What NUMB3RS and BREAKING BAD both get wrong

The TV show Numb3rs  had as a premise that a GENIUS mathematician
could help solve crimes. Is this true? I rather doubt you need a GENIUS-
though of course some prob, state, data mining,  the math behind forensics, and a few other things help. And it may help to know some number theory if a mathematician who is working on the Riemann hypothesis has his daughter kidnapped.  But I don't think you need someone on the level of Charles Eppes.

The TV show Breaking Bad  (see Honest Trailor for Breaking Bad and/or
Idiots Guide to Breaking Bad if you've seen the first 4.5 seaons at least)
has as a premise that a GENIUS chemist can make really good crystal meth. And as a by product it's blue. I know nothing about the crystal meth business; however a chemist friend of mine (who has never made the stuff) tells me that YES, being a careful chemist is good, and certainly better than a meth-head who is more likely to blow up his lab than to produce any, a GENIUS chemist would not be any better than a good chemist.

The TV show Elementary  (Sherlock Holmes in modern day New York) and many other shows (Monk, Perception, Psyche, The Mentalist, Columbo, and others I am sure) has as a premise that a GENIUS observer could help solve crimes. This may be more true then the above, but there are other tools available today (e.g., DNA).

All of these shows, and others, make the  FALLACY OF EXTRAPOLATION. Taking a good idea and extrapolating it to absurdity.

Here is a non-TV example: If blogging is part of my job, and I can deduct job expenses for Tax purposes, then I should be able to deduct the cost of the DVD's for Numb3rs that I bought because of this post.

Sunday, September 22, 2013

STOC CFP still delayed- but I was asked to pass this along

Since there were comments on the blog about the STOC and CCC CFP not being out yet I mailed various people who are in-the-know. I got email from David Shmoys  (STOC PC chair 2014) telling me
(1)  The  STOC the call is still delayed,
(2)  There is a website about it, here, that is INCORRECT - the REAL deadline for submission will be Nov 11, 2013 (4PM east coast time.)
(3) Please post this correction on complexity blog.
(So I just did.)

Note that Lance and I are NOT involved with the organization of STOC or CCC. The blog entry is just passing along information.

Wednesday, September 18, 2013


Every now and then we need new words and phrases come into our lexicon, like the unfortunate "twerking", but here's another "tl;dr", short for "too long; didn't read". I'm guessing it started as an insult/excuse not to read a long document, blog post or email. Respond "tl;dr" and you've put the blame on the writer for being too loquacious to the tweet-friendly generation.

Now I see tl;dr used as a short synopsis sometimes by the author themselves, the new executive summary if the executive has six seconds to read. Maybe I should tl;dr my class lectures: "Finite Automata: Easy to analyze but too weak to do much interesting".

Are we really moving to this brave new world of micro-attention spans? Is this just another reason that newspapers are dying and blogs are passé? When I write email should I keep it short and be misunderstood, make it long and have it not be read or add a tl;dr summary and get the worst of both worlds?

Monday, September 16, 2013

Did YOU think the NSA could factor fast?

Before the recent revelations about the NSA (see Lances Post and Scott's post )I would tell my class, when teaching P and NP,

We have very good reasons to think that Factoring is NOT NP-complete. As for P--- much murkier. People have tried to get it into P because of crypto and have not succeeded, hence many people think that Factoring is NOT in P. But there is so much math there that perhaps could be exploited to show it IS in P. Another very odd possibility is that it is KNOWN to be in P but only by the NSA. Crytpo has had this before- where some concepts were known to governments before they were known to the public,  even the academic public.

I will need to revise that now. BEFORE the recent revelations there were
the following points of view on factoring:
  1. The NSA cannot factor any better than what is known in the literature. Maybe a bit better because they use more parallelism.
  2. The NSA has taken the known algorithms and found the right parameters and has special purpose hardware so can do them better than anyone else, but nothing of interest mathematically. Perhaps some very interesting subtle points of math and hardware. What they have would not get into STOC/FOCS/CRYPTO (though maybe it should- that's another debate). This is the one I believed.
  3. The NSA has an algorithm that is better than the literature (e.g., exponential in (log n)^{1/5}).  But not in P. This would surely get into STOC/FOCS/CRYPTO and win a prize.
  4. The NSA has factoring in P through some very interesting and new mathematics. If this was public then perhaps a Turing Award.  Some serious number theorists do think that Factoring IS in P, so this one is not quite so implausible.
  5. The NSA has a quantum computer that factors quickly. I do not now of anyone serious who believed this. Of course, this could be a case of the No True Scotsman Paradox--- if someone really believed this I would (perhaps unfairly) declare them non-serious.
  6. The NSA does not have a better algorithm, but has managed to put trapdoors in stuff so that they and only they could break certain codes.(A covert version of Clipper Chip.) So they can break codes but not in a way that is interesting mathematically.
There may be more but I don't know them off hand. Item 6 I never heard people say, though that might be a function of the company I keep. I do not know what the most common view was, but I would guess item 2.This reminds me of  Karmarkar's Algorithm which I've heard runs fast because of the implementation- the algorithm is not a secret but exactly how they implement it is. (Note- just because I've heard this does not mean it's true.)

The truth seems to be that the truth is between 1 and 2, closer to 1, and also item 6.
In particular, the NSA does not seem that much ahead of academics.

In the past governments were way  ahead of academics in crypto. This no longer seems to be the case (at least in America). I speculate that this is because there is now a large community  of people doing research in crypto openly, publishing openly, so the government is no longer the only (or almost the only) game in town. Also, many non-government industries use crypto and some do research in it. This also helps the NSA- they can use results in the open literature, but they can't get that much ahead of it.

Are there other fields where the government is ahead of academics and industry? On a guess stuff with weapons and weapon detection, since not many academics work on that. Maybe sociology since the government has  census data and other data that is not available to the public.

Thursday, September 12, 2013

Cryptography and the NSA

Back at Northwestern I occasionally taught an undergraduate cryptography class since I was the local expert in the field (a statement not even remotely true at Georgia Tech). I would cover the Advanced Encryption Standard, an implementation of a one-way key-based permutation. AES had many components included an S-Box that seems like a random shuffle but is actually based on the inverse of a polynomial. One sentence in the textbook of Trappe and Washington made the following point (page 161).
The S-box was constructed in an explicit and simple algebraic way so as to avoid any suspicions of trapdoors built into the algorithm. 
Really? Can't we trust the government not to put back doors into our standardized cryptographic algorithms?

After reading last week's New York Times article on the NSA, I realize my naivety. The NYT article doesn't go into how and which protocols the NSA has their hand in but I now understand the concern.

It doesn't look like the NSA has actually broken cryptographic protocols, have a secret quantum-like computer in their basement or polynomial-time algorithms for SAT. I could go on for pages but Scott has done an excellent job talking about the complexity issues involved. They've more likely found ways to access your information before it has been encrypted or after its been decrypted.

Matthew Green wrote a nice post speculating on what the NSA might be able to do, so nice that it caused some controversy at Johns Hopkins.

The whole Snowden affair gives us a glimpse into the NSA but they hide their capabilities well and we'll never know the full extent of their knowledge.

Monday, September 09, 2013

T/F - No Explanation needed VS T/F-Explanation needed.

One of the comments on my blog on Types of question for exams
A True/False math question where they have to prove their answer. A student who picks the wrong answer can figure that out during the proof and then correct their answer. A student who picks the wrong answer and proves it has proven they really don't have a clue
Actually I once did an experiment about this! It's only one so I don't know what to read into it, but I will describe it and speculate.

CMSC 250 is the Sophomore Discrete Math course, required for all majors. CS 3 is a co-req. It's a course on how to prove simple things. We DO go over how a FOR ALL statement can be true vacuously (E.g.,all of the students over  10 feet tall will get an A+). Around 150 students take the course. In the spring there is an honors section of about 20.  I PLANNED the following:

  •  In Spring of 2008 one of the questions on the final was a set of FIVE statements where the students had to, for each statement, say if its TRUE or FALSE and NO JUSTIFICATION NECC. One of the statements was  If A is a set of natural numbers such that the powerset of A has 5 elements then A is infinite.
  •  In Spring of 2010 one of the questions on the final was a set of FIVE statement where the students had to, for each statement, say if it's TRUE or FALSE and IF TRUE THEN GIVE A SHORT JUSTIFICATION, IF FALSE THEN GIVE A COUNTEREXAMPLE.

Note that the statement is TRUE since there are NO such sets A.

So,  how did they do?

  1. When NOT needing to justify or give a counterexample, of the 150 students in the class, 14 got it right. There was no correlation (or perhaps a very weak one) between those who got it right and those who did well in the course or those that were in the honors section.
  2. When the DID need to justify or give counterexample, of the 152 students in the class, 19 got it right. Slightly stronger correlation to those who got it right and those who did well in the course and to those in the honors section.
I would say the 5 extra students and the slightly better correlation is too small to care about. I was surprised--- I thought being forced to find a countexample would help them along. But this is a rather
tricky question which some non-theory faculty members had trouble with when I explained this story to them. Exam Pressure was likely NOT a factor as my exams have very little time pressure- by the end of the exam
there were only about 30 students left taking it.

Here are the answers I got: 
  1. FALSE- clearly A is finite.
  2. FALSE- too obvious to say why.
  3. FALSE- there is no such A
  4. Variants of the above.
  5. Incoherent things that may be similar to the above.
 UPSHOTS: This is a failed experiment in that I didn't prove or disprove the hypothesis that asking students to justify makes more students get it. Of course, even if I had shown that it would only be for this one problem. I DID show that this problem is trickier than I thought.  I may try this again with a less tricky problem.

Friday, September 06, 2013

Myhill Nerode versus Pumping Lemma

I have seen some recent backlash against the pumping lemma for showing that languages are not regular and as I am now teaching regular languages I had to choose should I teach the pumping lemma or Myhill-Nerode to show languages are not regular. Let's review both definitions (taken from Wikipedia)

Pumping Lemma: If a language L is regular, then there exists a number p ≥ 1 (the pumping length) such that every string uwv in L with |w| ≥ p can be written in the form uwv = uxyzv with strings x, y and z such that |xy| ≤ p, |y| ≥ 1 and uxyizv is in L for every integer i ≥ 0.

Myhill-Nerode: Given a language L, and a pair of strings x and y, define a distinguishing extension to be a string z such that exactly one of the two strings xz and yz belongs to L. Define a relation RL on strings by the rule that x RL y if there is no distinguishing extension for x and y. It is easy to show that RL is an equivalence relation on strings, and thus it divides the set of all finite strings into equivalence classes.

The Myhill–Nerode theorem states that L is regular if and only if RL has a finite number of equivalence classes, and moreover that the number of states in the smallest deterministic finite automaton (DFA) recognizing L is equal to the number of equivalence classes in RL. In particular, this implies that there is a unique minimal DFA with minimum number of states.

The two basic complaints about the pumping lemma: Five quantifiers and it is not complete--there are nonregular languages that can be pumped. To the first point if you think of the pumping lemma as a game with the adversary choosing p, x, y and z, the quantification is not as confusing as some would think. Myhill-Nerode also has five quantifiers when you spell it out: For all regular L, there exist x1,...,xk such that for all y there is an i such that for all z, xiz is in L iff yz is in L.

As to the second part, the counterexamples are contrived and usually go away with simple closure properties. Consider the one from wikipedia:

Take L ∩ (01(2∪3))* eliminates the strings in the first part of L and now it is easy to pump.

So I don't buy the arguments for Myhill-Nerode over pumping. Nevertheless I'll teach the pumping lemma and Myhill-Nerode because they are both so cool.

Tuesday, September 03, 2013

Types of questions for exams

QUESTION: Give as many types of exam questions you can, give examples, and comment on if this is a good type of question.

My answer below.

  1. A problem that some students can get right even if they never had the course because they have seen it in some other course. EXAMPLE: In a course on Ramsey Theory have a question that uses the Prob. Method. PRO: The question is still in scope for the courses. CON: A bit awkward that someone may have learned the material elsewhere. UPSHOT: This is FINE.
  2. A problem that some students can get right even if they never had the course because they are quite clever. EXAMPLE: Easy Combinatorics or Probability in a sophomore Discrete Math Course. PRO: The question is still in scope for the courses. CON: A bit awkward that someone may have missed class but still got it right. UPSHOT: This is FINE.
  3. A rigged question--- students saw two examples in class, two examples on the HW and now have to do one themselves. EXAMPLE: proving numbers irrational. PRO: Clearly in scope and fair. PRO: They will surely understand what you are asking for. CON: They may get it right via memory rather than understanding (they may not even know the difference.) UPSHOT: This is FINE though it requires some planning ahead of time.
  4. A rigged question with a twist--- students saw two examples in class, two examples on the HW and now have to do one themselves but its DIFFERENT in an important way. EXAMPLE: In class and HW do many problems like Here is the distribution, here is a random var, what is its expected value but on the exam give Here is a random var, here is what we want for the expected value, give a distribution that gives us that. PRO: Harder to memorize template. CON: May be hard to grade as they say odd things. CON: May be confusing to know what you are asking for, even for good students. UPSHOT: This is FINE though it requires some planning ahead of time.
  5. A problem that requires utter mastery of the material but no creative thought. EXAMPLE: Give the algorithm (that we did in class) for proving that a CFG's are in P. Write it up so that someone who had never seen it can understand it. PRO: Straightforward yet hard to get via memorization. CON: Might be too time consuming for an exam. CON: (From experience) no matter how much you say in bold letters things like Write it up so that someone who had never seen it can understand it. They will skip steps and write it up badly and its hard to tell if THEY really know it. UPSHOT: I do this but only in certain cases.
  6. A problem that requires them to be creative (this is ill defined but its the opposite of the one above). PRO: If they truly understand the material they can do this. CON: My PRO may be incorrect. UPSHOT: Absolutely fine for HW which are not worth much for the grade anyway and I can enlighten them. I tend to avoid these on exams. Though the line between creativity and standard is a thin one. (Problem for an exam: How thin in millimeters?)
  7. A giveaway question. When I teach Formal Lang Theory I have (going back to when I was Harry Lewis's TA in 1981) have on the exam Give an example of a string of length 4 over the alphabet {a,b}. An unintended consequence- if they CAN"T do this its a really bad sign. I have asked this question many times and I have literally NEVER seen someone get it wrong and pass the course. I have gotten the following answers: ab*, ababa, and a DFA recognizing aaaa (that I was tempted to give credit to but did not). Incidentally, the most common right answer has always been abab. Second is abba. PRO: I have this one early in the exam to calm them down.
I try to ask some of each type on an exam. However, sometimes a question can be easier or harder than you intended, or be harder to grade then you thought, or not be in category you thought it would be in. The hardest line to draw is which questions are a matter of mastery and which are a matter of creativity? Another issue- some students can abstract better than others.

When teaching a large course such as Sophomore discrete math (150-200 students) I tend to get a uniform distribution skewed a bit on the high side. More precise: I tend to get at roughly 10 students in EVERY 10-point interval: 0-10, 10-20, 20-30,..., 90-100, with less on the low side and more on the high side. The benefit of this is that the students who get (say) less than 40 CANNOT say Well--- everyone did badly. They really are send a signal to either work harder or drop (I tell them this directly as well). I don't understand profs who give exams where nobody cracks 50/100 (I have heard this is common in Physics). They are wasting half of the grade spectrum.

Wednesday, August 28, 2013

The Dream

I have this theory that everybody's notion of "recent history" starts not from their memories but from their birth date. Case in point: Billy Joel's We Didn't Start the Fire. The first major event of my then very young life came from an oppressed people making their voices heard. The newspapers in the early days of my life were full of fear of violence that might come from the upcoming march on Washington. But 200,000 souls came out fifty years ago today in a peaceful demonstration asking for the basic freedoms the rest of America had.

Having moved to the birthplace of Martin Luther King, Jr from the hometown of the first black president, I know much has improved in the last fifty years. But we know King's dream is far from fulfilled, obvious to us from the paucity of African-Americans in our conferences and classes.

Take a moment of your day, watch the greatest speech of the 20th century, and remember how far America has come, and how far America has yet to go.

Monday, August 26, 2013

What are Galois Games?

How are math concepts named?

  1. After the people who was involved with it. Examples: The Cook-Levin Theorem, Goldbach Conjecture, Ehrenfeucht-Fraisse games,
    Banach-Tarski Paradox.
  2. A descriptive name:
    Examples: Chromatic Number; Girth of a graph (length of shortest cycle). This resembles the definition of Girth in English though I have only heard the word used in mathematics;
    Duplicator-Spoiler games.
  3. A name that conjures up a nice image. Examples: Dining Philosophers problem;
    The Monty Hall Paradox (though future historians will think he was a great Probabilist).
  4. Name may have very little connection to the concept. Example: The Pell equation.
I saw an article whose title was Greedy Galois Games. I wondered what this game could be.
  1. Do the players alternate picking polynomials and if the composition is solvable by radicals then (say) Player I wins.
  2. Did Galois invent some game?
The first game I thought of might be interesting; however, the paper was not about that. Nor was it about some game Galois invented. So---what is a Galois game? Aside from being a mathematician what else is known about Galois:
He died in a duel!
In the article Greedy Galois Games they study a DUEL between two BAD DUELISTS. The idea is that if both have prob of hitting p (and p is small) and they want to make it fair, first Alice shoots, then Bob shoots the min number of times so that the prob of Bob winning exceeds Alice's, then Alice shoots a number of times so that her prob of winning exceeds Bob's, etc. The paper ends up involving the Thue-Morse sequence. They are NOT using the name Galois the way we use Banach in Banach-Tarski Paradox, nor the way we use Monty Hall in The Monty-Hall Paradox. The fact that Galois was a mathematician has nothing to do with the naming,  The authors are using  Galois because he is a  famous duel-loser. They could have used Alexander Hamilton (who lost a Duel to Aaron Burr) and then called them Greedy Hamiltonian Games, in which case I would assume that the game involved
Hamiltonian cycles or Quaternions.

Thursday, August 22, 2013

P = NP and the Weather

In the Beautiful World, my science fiction chapter of The Golden Ticket where P = NP in a strong way, I predicted that we could predict weather accurately enough to know whether it will rain about a year into the future. Besides putting Novosibirsk on the wrong side of Moscow, my weather prediction prediction has drawn the most ire from my readers.

Here was my thinking: Weather forecasting comes down to modeling. Find a good model, use the current initial conditions and simulate the model. P = NP can help dramatically here by making what should be the hardest part, finding the right model, easy. P = NP would help create much better models and should lead to far more accurate and deep forecasts than before. A year ahead prediction of weather didn't seem out of the realm of possibility.

As my readers point out, one cannot put in all of the initial conditions which would involve too much data even if we could get it, and small random events, the so-called butterfly effect, could dramatically change the weather in even a short period of time. Dean Foster, a Penn statistician, wrote me a short piece giving an analogy to a game of pool over time changed by the gravity generated by a single proton.

So how far can you predict the weather if P = NP? A month? Of course we'll probably never find out since I doubt P and NP are the same. In retrospect I shouldn't have put in such an aggressive weather forecasting because it detracts from other great things that happen if P = NP such as curing cancer.

Monday, August 19, 2013

When Lance was 10 years old..

In honor of Lance's 50th birthday I ask the following: When Lance was 10 years old which of the following were true?
(Disclosure- some of the below are from a birthday card.)

  1. A REMOTE meant a secluded spot off the beaten path.
  2. CABLE was something that supported a bridge.
  3. A VIDEO GAME was trying to make out what fuzzy images were on a snowy black and white 10 inch TV screen.
  4. A CELL PHONE was what you used to make one phone call from jail.
  5. A CALCULATOR was the accountant who did your parents taxes.
  6. AN AIRBAG was someone who talked too much.
  7. DIGITAL COMPUTING was counting on your fingers.
  8. HIGH SPEED ACCESS was an on-ramp to the freeway.
  9. SURFING was something done on a board in the ocean.
  10. A BIRTHDAY was something Lance looked forward to.
  11. A MOUSE was something you didn't want in your house.
  12. A SPAM ASSASSIN was someone who killed people by giving them poisoned spam.
  13. A WEB was what spiders wove.
  14. A BUG was what spiders ate.
  15. AMAZON meant where some big rain forest is (smaller now).
  16. GOOGLE was an obscure term used by some math folks for the number 10100.
  17. BING had no meaning.
  18. APPLE was either a fruit or the record company founded by the Beatles. (There really WAS a legal name-issue when Apple-the-computer-company got into music see here .)
  19. It was impossible to have 10,000 friends.
  20. There were only three Network channels and a few local ones.
  21. Music was on Vinyl records.
  22. You went to the bathroom during commercials.
  23. Johnny Carson joked that couples had sex during commercials on his show. (Ask your grandparents who Johnny Carson was, what commercials were, and what sex was.)
  24. People read books written on paper.
  25. Computer Science was not available as a major at most schools.
  26. When people said you sound like a broken record they actually knew what a broken record sounded like.
  27. People really would DIAL a phone number.
  28. People would have to actually stop at toll booths instead of using easy-pass.
  29. Long running TV shows would have one (or at most two) Christmas episodes since there were no arcs, hence an episode could be inserted into any season at any time. Contrast: M*A*S*H in its 11 seasons and 256 episodes had TWO Christmas episodes, where as 30 ROCK its 7 seasons and 131 episodes had FOUR Christmas episodes. (This may be THE least important consequence of the new technology.)
  30. There were bar room fights over trivia since you couldn't just look it up on Google. The Guinness Book of World Records was supposed to cut down on bar fights, but it didn't quite work.
  31. People knew how to read maps and get a sense of where things were instead of relying on technology. That's why today the number of hikers who get lost has skyrocketed.
  32. If MTV existed they would still be playing music videos. The question Why doesn't MTV show Music Video's anymore has been asked so often it is now Cliche. But the above video provides an answer.
  33. Lance did not recognize the importance of NP-completeness. Then again, neither had the math community, the non-theory computer science community, and Probably parts of the theory community.
  34. To find out what time it was you couldn't look at your cell phone, TV set, or Microwave. You had to go outside and look at your sundial.
  35. TV shows may have pilot episodes, or may not, but they didn't bother with explaining everything. Thought experiment: If Mr. Ed was on today
    they would explain how he could talk (A government experiment gone wrong? gone right?) rather then the ONE line by Mr. Ed in the first episode: Don't try (to understand why I can talk)--- its bigger than both of us.
  36. We all watched a TV show the same night. Contrast- last month I watched Firefly.
    (If you are a fan of firely check this out.)
    Bizarre result of this--- since people can't find people to talk about shows as much as the used do, there is now a show called TALKING BAD where people on the show TALK ABOUT Breaking Bad
  37. The final Jeapordy theme music didn't have lyrics. Now it does: here.
  38. When you heard a mnemoic device like Kids Prefer Cheese Over Fried Green Spinach it was hard to find out what it meant- now its easy (just use Google!)
True story: On March 9, 1967 John Smith (not his real name) wanted to watch Star Trek (Episode: Devil in the Dark) but his parents wanted to take the family out for dinner. So he pretended to be sick so he could watch it- because, as he puts it, if I don't see it now I will NEVER GET TO SEE THE EPISODE, EVER!!!!!. Imagine a world without DVR, DVD, TIVO, On-Demand, Hulu. He doesn't have to imagine it. Our younger readers do.

I think SURFING, MOUSE, and SPAM really have changed primary meanings. FRIENDS may have also.

Thursday, August 15, 2013

Flash Gordon

We watched the movie Ted last week but this post isn't about that movie. The movie has several references to the 1980 movie Flash Gordon including an extended cameo by Sam Jones who played Flash.

Flash Gordon and its soundtrack from Queen saved me senior year of high school--whenever I felt down I would listen to the album and run the movie through my head escaping reality for a little bit. These were the days before videos and CDs, now I've rewatched the movie several times on DVD.

Flash Gordon was not a great movie by any means but it resonated with me with its action sequences, great music and corny lines like "Flash, I love you, but we only have fourteen hours to save the Earth!". The stars of the movie Sam Jones and Melody Anderson were and still are relatively unknown but it had a great supporting cast.

Topol, best known as Tevye in Fiddler on the Roof, played a scientist who many mocked for his crazy (but true) ideas of what was happening in outer space. Basically the same character as when he played Galileo.

Timothy Dalton played Prince Barin and would go on to be James Bond and the Max von Sydow, who played chess against Death in The Seventh Seal, was the Ming the Merciless.

What does this all have to do with computational complexity? Absolutely nothing. But today I turn 50, it's my party and I'll post what I want to.

Monday, August 12, 2013

How much Trig does your governor know?

How much math should our public officials know? Basic probability and statistics so they can follow the arguments that their science advisers give them. And they should hire good objective science advisers and listen to them.

How much Trigonometry should a Governor know? Should a Governor know the angles of a 3-4-5 triangle? The following true story is paraphrased from Somewhat more than Governors need to know about Trigonometry by Skip Garibaldi.

In June 2004 Governor Jeb Bush of Florida was giving a talk to promote state-wide annual testing of students in public schools. A high school student asked him What are the angles in a 3-4-5 triangle? He responded I don't know. 125, 90, and whatever is left to add up to 180. Note that (1) he knew that 3-4-5 triangle has a 90 degree angle, (2) he knew that the angles of a triangle add up to 180, but (3) he didn't realize that 125+90 > 180. Still, I suspect most governors would do worse. The real answer is 90, 53.1 (approx), 36.9 (approx). A retired math professor was later quoted as saying I would not expect many mathematicians to know that.

The paper then proves the following:

The Governors Theorem: If a right triangle has integer
side lengths then the acute angles are irrational when measured
in degrees.

When politicians say things that contradict current science (e.g., on evolution or global warming) I wonder if they know the truth and are lying to please their voters, or if they honestly don't know the truth.I also wonder which one is worse. In the case above I think Jeb honestly didn't know, and that's fine.