Sunday, April 27, 2014

The Big Bang Theory hits close to home

On a recent episode of The Big Bang Theory, Sheldon gives up string theory because, after working on it for 20 years, he has nothing to show for his efforts. I have been asked ``is anyone working on P vs NP or have they given up because after 20 years of effort they have nothing to show for their efforts?'' Aside from GCT I don't know if anyone is ``working on it,'' though it depends on what ``working on it'' means.

During the episode some random things about science were said. If it were math I might have a sense of whether it was nonsense or not. Here I am not quite sure, so I list some and see what you think, or what you know, or what you think you know.

  1. A physicist named Barry says that he's not a String Theorist- he's a String Pragmatist. Is there such a thing as a String Pragmatist? 
  2. Sheldon referred to his attempt to compactify extra dimensions. Do string theorists do that?
  3. Penny said that giving up a field is like breaking off a relationship. What do you think of that? When I stopped doing computability and started doing combinatorics it was not a clean break. I gradually stopped looking at logic journals and started looking at combinatorics journals. But there are awkward moments- like when I study  Computable Combinatorics.
  4. Does the magazine Cosmopolitan really have an article on how to deal with a breakup in, to quote Sheldon, literally every single issue?
  5. Is Geology:Science like the Kardashians:real celebrities? Actually, since the notion of a celebrity is just to be well known, I suppose the Kard's ARE real celebrities. But the real question is- do Physicists look down on Geology?
  6. Is Standard Model Physics less advanced than string theory?
  7. Is calculation of nuclear matrix elements a fad?
  8. Are string theory books so large that a bully would use one to hit someone over the head with? (There is a really heavy book on gravity which is better and more appropriate.)
  9. Has string theory made any progress towards proving that it's correct in the last 20 years?
  10. Will the TV-RESET rule make it such that Sheldon is back doing string theory soon? This is also called the Dirty-Harry rule: no matter how many times Dirty Harry is kicked off the force, he's back by the next movie. Sometimes by the next scene.

Thursday, April 24, 2014

Libraries Without Books

Georgia Tech in its library renovation will move the vast majority of its books into a combined Emory/Tech Library Service Center off-campus. Very few people use the stacks anymore, particularly at this institute dominated by engineering and computer science.

As a young assistant professor at the University of Chicago in the 90's, I averaged about one trip a day to the joint Math/CS library to track down research papers or old books to find various mathematical definitions and theorems. These days I can access all these papers online and get access to mathematical definitions through Wikipedia and other sites.

I miss seeing the papers near the papers I'm looking for. I remember running across a paper grappling with the union of two complexity classes, which inspired this blog post. In order to prove a theorem about context-free languages, we relied on a characterization buried in a 1966 textbook on the topic. The Chicago library had three copies of that book, so I suppose someone once taught from it, a whole course on Context-Free Languages in the 60's.

The last time I tracked down books in the library (at Northwestern) was to research the history of P v NP for my book, which will itself likely soon follow the others into these "service centers". I can't remember the last time I went to the library to track down information for research purposes.

There's much to miss about looking up research materials in a library. But I also miss my electric typewriter, my almanac and my World Book encyclopedia collection. Can't stop progress.

Monday, April 21, 2014

Changing the ORDER you teach things in might help A LOT

I am teaching the undergrad jr/sr course Elementary Theory of Computation,
which covers Reg Langs, Context Free Langs, Computability theory, and P and NP.

Over the last few years I've changed some of the content (I dumped CFLs-- don't worry, they cover them some in the Prog Langs course) but, more to my point today, I changed the ORDER I do things in.

Change 1: Do P and NP first and Computability later. While this is not the historical order (which may be why the order was what it was),
this is better pedagogically. Why? Consider the following reductions:

1) Given a FORMULA phi produce a graph G and a number k such that
phi is in SAT iff (G,k) is in CLIQ

2) Given a MACHINE M_e produce a MACHINE M_j such that

M_e(e) halts iff M_j halts on at least 10 numbers.

Students have always found the second more confusing since the input and the output are both machines (and the reduction itself is done by a machine),
whereas the first one seems like a REAL transformation- you input a FORMULA
and get out a GRAPH and a NUMBER.
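
For concreteness, here is a minimal sketch of the first reduction in Python (this is the standard textbook SAT-to-CLIQ construction; the DIMACS-style encoding of literals is my own choice, not something from my course notes):

```python
from itertools import combinations

def sat_to_clique(clauses):
    # clauses: a CNF formula as a list of clauses, each clause a list of
    # nonzero ints; 3 means x3 and -3 means NOT x3 (DIMACS-style).
    # Returns (vertices, edges, k) so that the formula is satisfiable
    # iff the graph has a clique of size k = number of clauses.
    vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for (i, a), (j, b) in combinations(vertices, 2):
        # join literals from DIFFERENT clauses that are not contradictory
        if i != j and a != -b:
            edges.add(((i, a), (j, b)))
    return vertices, edges, len(clauses)

# (x1 OR x2) AND (NOT x1 OR x3) AND (NOT x2 OR NOT x3)
V, E, k = sat_to_clique([[1, 2], [-1, 3], [-2, -3]])
print(k, len(V), len(E))
```

The point for the students: the input is a FORMULA, the output is a GRAPH and a NUMBER, and the transformation itself is a few lines long.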

There are two other reasons to do P NP first:
(1) Sometimes the last topic in a course gets shortchanged, and P NP is
more important than Computability, so better if Computability gets shortchanged.
(2) Let's say I could do P NP in either April or May. If I do it in April and someone resolves it in May, the students can get excited about it since they know about it. If I do it in May and it's resolved in April then the students
cannot share in the joy of the discovery. This reason may become more relevant in the year 2214.

Change 2: I used to do the Cook-Levin Theorem (SAT is NP-complete) and
then do a reduction like SAT \le CLIQ. This semester I did SAT \le CLIQ first
and Cook-Levin later. Why? Because SAT\le CLIQ is better pedagogically (note- pedagogically is today's word on my word-of-the-day calendar) since the Cook-Levin reduction is harder and deals with Turing Machines as opposed to more familiar objects like Formulas and Graphs.

More generally- the order you teach things in may matter, and changing it is relatively easy.

Friday, April 18, 2014

Announcements

Time for a short rundown of announcements.

  • STOC will be held May 31-June 3 in New York City. Early registration and hotel deadline is April 30. Student travel support requests due by this Monday.
  • The newly renamed ACM Conference on Economics and Computation (EC '14) will be held in Palo Alto June 8-12. Early registration deadline is May 15. Hotel deadline is May 19th but the organizers suggest booking early because Stanford graduation is June 13.
  • The Conference on Computational Complexity will be held June 11-13 in Vancouver. Local arrangements information will be posted when available.
  • The ACM Transactions on Algorithms is searching for a new Editor-in-Chief. Nominations due May 16.
  • Several of the ACM Awards have been announced. Robert Blumofe and Charles Leiserson will receive the Paris Kanellakis Theory and Practice Award for their "contributions to robust parallel and distributed computing."
  • Belated congratulations to new Sloan Fellows Nina Balcan, Nayantara Bhatnagar, Sharon Goldberg, Sasha Sherstov, David Steurer and Paul Valiant.

Thursday, April 17, 2014

Should you reveal a P = NP algorithm?

A reader asks
What would you do if you could prove that P=NP with a time-complexity of n^2 or better... moreover, would you publish it?
There are all these statements of the good that could come of it. But how would the government react in its present state? Would it ever see the light of day? How would a person be treated if they just gave it away on the internet? Could a person be labeled a threat to national security for giving it away?
 I consider this a completely hypothetical and unlikely scenario. If you think this applies to you, make sure you truly have a working algorithm. Code it up and mint yourself some bitcoins, but not enough to notice. If you can't use your algorithm to mint a bitcoin, you don't have a working algorithm.

The next step is up to you. I believe that the positives of P v NP, such as its potential use in curing diseases, greatly outweigh the negatives. I would first warn the Internet companies (as was done for Heartbleed) so they can modify their systems. Then I would just publish the algorithm. Once the genie is out of the bottle everyone can use it and the government wouldn't be able to hide it.

If you can find an algorithm, so can others, so you should just take the credit before someone else discovers it. I don't see how one can get into trouble for revealing an algorithm you created. But you shouldn't take legal advice from this blog.

Once again, though, no one will take you seriously unless you really have a working algorithm. If you just tell Google you have an algorithm for an NP-complete problem they will just ignore you. If you hand them their private keys then they will listen.

Sunday, April 13, 2014

Factorization in coNP- in other domains?

On an exam in my grad complexity course I asked the students to show that the following set is in coNP:

FACT = { (n,m) : there is a factor y of n with 2 \le y \le m }

The answer I was looking for was to write FACTbar (the complement) as

FACTbar = { (n,m) : (\exists p_1,...,p_L) with L \le log n such that
for all i \le L we have m < p_i \le n and p_i is prime (the p_i are not necc distinct), and
n = p_1 p_2 ... p_L
}
INTUITION: Find the unique factorization and note that none of the primes are \le m.
To prove this works you seem to need to use the Unique Factorization theorem, and you need
that PRIMES is in NP (the fact that it's in P does not help).
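
To see what the certificate is doing, here is a minimal sketch of the verifier in Python (my own illustration, not from the exam; trial-division is_prime stands in for whatever primality certificate you prefer):

```python
from math import isqrt

def is_prime(p):
    # trial division; in the coNP argument you would instead hand the
    # verifier an NP certificate (e.g. a Pratt certificate) that p is prime
    if p < 2:
        return False
    return all(p % d for d in range(2, isqrt(p) + 1))

def verify_factbar(n, m, primes):
    # certificate for (n,m) in FACTbar: a list of at most log n primes,
    # each > m, whose product is n.  By Unique Factorization, if n had a
    # factor y with 2 <= y <= m then no such list could exist.
    if len(primes) > n.bit_length():           # L <= log n
        return False
    if any(not (m < p <= n) or not is_prime(p) for p in primes):
        return False
    prod = 1
    for p in primes:
        prod *= p
    return prod == n

# (15, 2) is in FACTbar (15 has no factor y with 2 <= y <= 2); certificate [3, 5]
print(verify_factbar(15, 2, [3, 5]))   # True
```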

A student who I will call Jesse (since that is his name) didn't think to complement the set, so instead he wrote the following CORRECT answer:

FACT = { (n,m) | n is NOT PRIME, and for all p_1,p_2,...,p_L with 2 \le L \le log n,
where for all i \le L, m < p_i \le n-1 and p_i is prime (but not necc distinct),
n \ne p_1 p_2 ... p_L
}
(I doubt this proof that FACT is in coNP is new.)
INTUITION: show that all possible ways to multiply together numbers larger than m do not yield n,
hence n must have a factor \le m.

Here is what strikes me- Jesse's proof does not seem to use Unique Factorization. Hence it can be used in other domains(?), even those that do not have Unique Factorization (e.g. Z[\sqrt{-5}]). Let D = Z[\alpha_1,...,\alpha_k] where the alpha_i are algebraic. If n \in D then let N(n) be the absolute value of the sum of the coefficients (we might want to use the product of n with all of its conjugates instead, but let's not for now).

FACT = { (n,m) : n\in D, m\in NATURALS, there is a factor y in D of n with 2 \le N(y) \le m}

Is this in NP? Not obvious (to me)--- how many such y's are there?

Is this the set we care about? That is, if we knew this set is in P would factoring be in P? Not obv (to me).

I suspect FACT is in NP, though perhaps with a diff definition of N( ). What about FACTbar?
I think Jesse's approach works there, though it might need a diff bound than L \le log n.

I am (CLEARLY) not an expert here and I suspect a lot of this is known, so my real point is
that a student's diff answer than the one you had in mind can be inspiring. And in fact I am inspired to
read Factorization: Unique and Otherwise by Weintraub, which is one of many books I've been
meaning to read for a while.

Wednesday, April 09, 2014

Favorite Theorems: Extended Formulations

The first couple of favorite theorems took place at the beginning of the last decade, but recent years have brought exciting results as well, such as the limitations of using linear programs to solve hard problems.
It all started with a paper by E.R. "Ted" Swart (the Deolalikar of the 80's) that claimed to show P = NP by giving a polynomial-time algorithm for Traveling Salesman based on linear programming. Yannakakis put the nail in that technique by showing that no symmetric linear program formulation can solve the traveling salesman problem. The TSP problem expressed as a linear program has many facets (multi-dimensional faces) but one could possibly have a higher dimensional polytope with a smaller number of facets that could project down to the TSP polytope. Yannakakis showed that these higher dimensional symmetric polytopes must still have many facets by showing an equivalence between the smallest number of facets and the monotone dimension of a slack matrix of the polytope describing the TSP.
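
In symbols (my paraphrase of the factorization theorem, stated without the symmetry condition): for a polytope P = \{x : Ax \le b\} with vertices v_1,\dots,v_n,

\[ \mathrm{xc}(P) \;=\; \mathrm{rank}_{+}(S_P), \qquad (S_P)_{ij} \;=\; b_i - A_i v_j, \]

where xc(P) is the minimum number of facets of any polytope that projects onto P, and rank_+ is the nonnegative rank of the slack matrix.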

Not much happened after the 80's until Fiorini et al. generalized the Yannakakis result to give exponential lower bounds on the number of facets of general non-symmetric extensions of the polytope for traveling salesman. They combine Yannakakis' techniques with some communication lower bounds due to Alexander Razborov. Their approach was inspired by quantum computing ideas but in the end they had a more conventional proof.

Last fall Rothvoß showed even stronger exponential lower bounds for the matching polytope through a "substantial modification" of Razborov's work. Since matching is in P, Rothvoß showed that there are polynomial-time computable problems that can be formulated, but not succinctly formulated, as linear programs.

There has also been recent work showing lower bounds for approximating clique via linear programs.

Ankur Moitra gave an excellent talk at the recent Dagstuhl giving an overview of these results and the techniques involved. The next step: Show lower bounds for semidefinite programs.

Monday, April 07, 2014

How free is Randomness?

Matthew Green had a great post on the topic of how do you know a random number generator is working. Gee, I just look at the sequence and see if it LOOKS random. Probably not the best idea. Actually, people often find patterns where there aren't any.


Reminds me of the following apocryphal story, which would have to have taken place before everyone had a computer: A teacher assigns the class a HW assignment to flip a coin 600 times and record the H's and T's. He warns them that if they just try to come up with a random sequence without flipping, he will know. Sure enough, a few of them had sequences with NO runs of 6 or more H's or T's. He knew those were faked. One of the students fessed up that yes, he just wrote down H's and T's in a way that looked random. But another student claimed that he DID flip the coins, but when he saw a sequence of 6 H's in a row he changed it since he THOUGHT the teacher would spot it and THINK it wasn't random.
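
As a quick check of why the teacher's trick works, here is a minimal Python sketch (my own, not from the story) estimating how likely a run of 6 or more of the same face is in 600 honest flips:

```python
import random

def longest_run(seq):
    # length of the longest run of identical consecutive symbols
    best = cur = 1
    for a, b in zip(seq, seq[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

# estimate Pr[some run of length >= 6 appears in 600 fair coin flips]
trials = 10_000
hits = sum(longest_run([random.randrange(2) for _ in range(600)]) >= 6
           for _ in range(trials))
print(hits / trials)   # should come out very close to 1
```

So a 600-flip sequence with no run of 6 is itself strong evidence of faking.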

Is randomness a free resource? Papers in Complexity Theory strive to find good RNG's while papers in Crypto claim they already exist. Who is right?

Faulty RNG's are surely a problem for crypto protocols. But are they a problem for randomized algorithms? People really do use randomized algorithms to test primes. What other algorithms do people in the real world routinely use randomness for (not counting crypto protocols)? Is it ever a problem?

I believe there are stories where a faulty RNG led to a crypto system being broken.

Are there stories where a faulty RNG led to a randomized algorithm being wrong?

Friday, April 04, 2014

How I learn a result

There is a theorem I want to learn. How do I go about it? (How do YOU go about it?) I give an example here which also leads to  pointers to some mathematics of interest.

A division of a cake between n people is PROPORTIONAL (henceforth prop.)
if everyone thinks they got (IN THEIR OWN MEASURE- AND PEOPLE CAN DIFFER WILDLY) at least 1/n. A division of a cake is ENVY FREE if everyone thinks they got the biggest (or tied) piece.
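
To make the two definitions concrete, here is a minimal Python sketch (a toy illustration of mine, not part of any protocol): each player reports their measure of each piece, normalized so each row sums to 1, and we just check the two conditions.

```python
def is_proportional(values, assignment):
    # values[i][j] = player i's measure of piece j (each row sums to 1)
    # assignment[i] = index of the piece that player i receives
    n = len(values)
    return all(values[i][assignment[i]] >= 1.0 / n for i in range(n))

def is_envy_free(values, assignment):
    # every player thinks their own piece is at least as big as anyone else's
    n = len(values)
    return all(values[i][assignment[i]] >= values[i][assignment[j]]
               for i in range(n) for j in range(n))

# Three players, three pieces; player i gets piece i.
values = [[0.34, 0.33, 0.33],   # player 0
          [0.20, 0.35, 0.45],   # player 1 thinks piece 2 is the biggest
          [0.30, 0.30, 0.40]]   # player 2
assignment = [0, 1, 2]
print(is_proportional(values, assignment))  # True: everyone gets >= 1/3
print(is_envy_free(values, assignment))     # False: player 1 envies player 2
```

(Envy free implies proportional, but not conversely, as the example shows.)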

There are two kinds of protocols- Discrete and Moving Knife. I discuss Discrete here.

I had read several books on fair division (see here for a review) and had some interest in the topic. In particular (1) I knew the (fairly easy) discrete 3-person Envy Free protocol, but did not know (2) the rather complicated discrete 4-person Envy Free Protocol. And I wanted to learn it! How do I do that? (This isn't quite advice on how YOU should do it, but I am curious what you think of what I do AND what you do.)

  1. Find a good source for it. I used the original article from the American Math Monthly. It is NOT always the case that the original source is the best (I discussed this in another blog post.)
  2. Create a REASON to learn it with a deadline. I both agreed to teach it in seminar AND I volunteered to do an HONORS COURSE (interdisciplinary) on Fair Division (nickname: Cake Cutting). The first time I taught the course I didn't get to the theorem (getting my feet wet in both HONORS courses and the material was a limiting factor) but I have in subsequent years.
  3. While reading make HANDWRITTEN notes for the lecture I am going to give. I try  to understand everything that I  write down before  going on, but sometimes, I have to go forward and then backward. I find myself redoing parts of the notes many times. The notes  have many examples, pictures (which is why I don't type them initially), and counterexamples to check that the premises are needed. I check EVERY detail. What I present will  be less detailed.
  4. Type in the notes. (Mine for 4-envyfree are here.) For me, somehow, being forced to type it in makes me find corrections or better ways of saying or doing things. Also this produces notes for students or others. This MIGHT over time lead to survey articles or even textbooks but I DO NOT think in those terms. The notes that are produced are VERY GOOD FOR THE STUDENTS IN THE CLASS, or THE PEOPLE IN SEMINAR but NOT all that good for anyone else. (This may be why some textbooks are crap--- it is hard to turn class-specific notes into something anyone can use.)
  5. TOSS OUT the handwritten notes. Knowing you will do this makes the typewritten version better. Also it's hard to keep track of where one puts handwritten notes. I used to keep a math notebook but it's hard to find anything in it.
  6. In subsequent years when I teach the material I take my typewritten notes and make handwritten notes out of them. This forces me to refresh on the material and I teach better if I can glance at notes that use large letters and symbols. If I do the course often enough I no longer need to. When I taught the Fair Division course in Fall 2013 I could do this proof off the top of my head.
There are some variants, Pros, and Cons of the above system:

  1.  If the material is easy I may go straight to type and skip handwritten. This was the case for protocols for proportional  cake cutting.
  2. I sometimes translate my notes, NOT to typed notes but to slides. There are some theorems whose ONLY writeup are my slides. I may or may not ever get them into notes (that's a tautology). I'll mention two but note that, even more than notes, slides are not really a good read unless you are taking the class. They are (1) An adaptation of Mileti's proof of the infinite Canonical Ramsey Theorem to the finite case- it gives pretty good bounds, though not the best known. Very good for teaching, so I may submit to Journal of Pedagogical Ramsey Theory. I am sure these slides contain a correct proof since I ran them by Mileti himself, who was surprised and delighted that someone got around to finitizing his proofs. The slides are here and here. (2) Using linear programming on some cake cutting problems. I doubt this is new, though I could not find it in the literature. I just did it for my class. Slides are here.
  3. If my GOAL is stuff for my class I may end up giving up. For example, I sort-of want to read the LOWER BOUND of Omega(n log n) on the number of cuts for proportional cake cutting among n people, due to Jeff Edmonds and Kirk Pruhs (this is Jeff Edmonds' paper page). But the proof looks just a bit too hard for my students (recall- they are NOT math people, they are BRIGHT students from across majors). Knowing that I won't get to teach it to them I didn't quite have the motivation to look at it. I probably will someday--- for seminar.
  4. If it's not just one paper but an entire area of knowledge, I try to find all the papers in the area and make a website out of them. This is easier in a limited area. My Erdos Distance Problem website has only 10 papers on it (it's prob missing a few), whereas my Applications of Ramsey Theory to TCS website has 63 papers on it (I am sure it's missing many).
  5. One can get carried away with GATHERING papers as opposed to READING them.
What if you want to read a result, you begin, and it's rather hard and/or you need a lot more background? There are several competing schools of thought:

  1. If at first you don't succeed, Try Try Again (credited to  William  Edward   Hickson).
  2. If at first you don't succeed, Try Try again. Then quit. There's no point in being a damn fool about it (credited to W.C. Fields).
  3. If at first you don't succeed then destroy all evidence that you tried.
  4. If at first you don't succeed, skydiving is not for you.
The question of when to hold 'em, when to fold 'em, when to walk away, and when to run permeates life and mathematics as well as gambling.


Tuesday, April 01, 2014

I am Bill Gasarch

I have a confession to make. Remember back in 2007 when I retired from the blog for about a year? I didn't actually retire at all but instead took the blog in a new direction by writing under the pseudonym William Ian “Bill” Gasarch, to be more playful and irreverent. All I had to do was be a little off the wall, add a FEW CAPITAL LETTERS and some mispellings and voila, same blog, new blogger. I’m especially proud of the fake conversations “Bill” had with his “relatives”.

Now that I have come out, I’d like to thank Samir Khuller for helping me set up a website off Maryland’s CS page. The pictures of Bill are of a friend of mine, a web programming consultant. Most of the rest of the links were plagiarized from the far corners of the Internet except for the God Play--I wrote that one myself.

In 2008, I decided to return to the blog as myself and also continue as Bill. It was great fun switching context between the two personalities. I particularly enjoyed doing the typecasts with the two personalities talking to each other.

I have to admit Dick has done me one better, with his teenage-chess-genius-turned-complexity-theorist alter ego Ken Regan.

I’m surprised how few people had caught on; after all, how many of you have actually met Bill in person? But now that I have a respectable job, I realized it just wasn't right for me to keep fooling the community.