Wednesday, April 29, 2015

Is Logarithmic Space Closed Under Kleene Star?

A Georgia Tech student asked the title question in an introductory theory course. The instructor asked his TA, the TA asked me and I asked the oracle of all things log space, Eric Allender. Eric didn't disappoint and pointed me to Burkhard Monien’s 1975 theorem
L is closed under Kleene star if and only if L = NL.
L here is the set of problems solved in deterministic O(log n) space and NL is the nondeterministic counterpart. For a set of strings A, Kleene star, denoted A* is the set of all finite concatenations of strings of A. For example if A = {00,1} then A* = {ε, 1, 00, 11, 001, 100, 111, 0000, 0011, 1100, …} where ε is the zero-length string.

Kleene star comes up in regular expressions but also makes for many a good homework problem.
  1. Show that if A is in NL then A* is also in NL.
  2. Show that if A is in P then A* is also in P.
  3. Show that if A is c.e. (recognizable) then A* is also c.e.
Problem 1 above is equivalent to saying NL is closed under Kleene star and implies the “if” part of Monien’s result. Here is a simple proof of the other direction that L closed under Kleene star implies L = NL.

Consider the following NL-complete problem: The set of triples (G,s,t) such that G is a directed graph with a restriction that all edges (i,j) have i<j and there is a path from node s to node t.

Define the following language
B = {G#i+1#G#i+2#...#G#j# | there is an edge (i,j) in G}
B is computable in log space and the string G#s+1#G#s+2#…#G#t# is in B* if and only if there is a path from node s to node t. QED

Allender, Arvind and Mahajan give some generalizations to log-space counting classes and also notes that there are languages computable in AC0 (constant-depth circuits) whose Kleene star is NL-complete. B above is one such set.

Monday, April 27, 2015

Advice on running a theory day

Last semester I ran a THEORY DAY AT UMCP. Below I have ADVICE for people running theory days. Some I did, some I didn't do but wish I did, and some are just questions you need to ask yourself.

1) Picking the day- I had two external speakers (Avrim Blum and Sanjeev Arrora) so I was able to ask THEM what day was good for THEM. Another way is to pick the DAY first and then asking for speakers.

2) Number of external speakers- We had two, and the rest were internal. The external speakers had hour-long talks, the internal had 20-minute talks. This worked well; however, one can have more or even all external speakers.

3) Whoever is paying for it should be told of it towards the  beginning of the process.

4) Lunch- catered or out? I recommend catered if you can afford it  since good time for people to all talk. See next point.

5) If its catered you need a head count so you need people to register. The number you get may be off- you may want to ask when they register if they want lunch. Then add ten percent.

6) Tell the guest speakers what time is good for them to arrive before they make plans so you can coordinate their dinner the previous night.

7) If have the money and energy do name tags ahead of time. If not then just have some sticky tags and a magic marker.

8) Guest speakers- getting them FROM Amtrak or Airport to dinner/hotel --- give them a personal lift (they may be new to the city and a bit lost). Getting them from the event back TO the airport or amtrak, can call airline limo and taxi. (though if can give a ride, that's of course good.)

9) Pick a day early and stick with it. NO day is perfect, so if someone can't make it, or there is a faculty meeting that day, then don't worry about it.

10) Have website, speakers, all set at least a month ahead of time. Advertise on theory-local email lists, blogs you have access to, and analogs of theory-local for other places (I did NY, NJ, PA). Also email people  to spread the word.

11) Advertise to ugrads. Students are the future!

12) If you are the organizer you might not want to give a talk since you'll be too busy doing other things.

13) Well established regular theory days (e.g., NY theory day) can ignore some of the above as they already have things running pretty well.

Friday, April 24, 2015

Fifty Years of Moore's Law

Gordon Moore formulated his famous law in a paper dated fifty years and five days ago. We all have seen how Moore's law has changed real-world computing, but how does it relate to computational complexity?

In complexity we typically focus on running times but we really care about how large a problem we can solve in current technology. In one of my early posts I showed how this view can change how we judge running time improvements from faster algorithms. Improved technology also allows us to solve bigger problems. This is one justification for asymptotic analysis. For polynomial-time algorithms a doubling of processor speed gives a constant multiplicative factor increase in the size of the problem we can solve. We only get an additive factor for an exponential-time algorithm.

Although Moore's law continues, computers have stopped getting faster ten years ago. Instead we've seen the rise of new technologies: GPUs and other specialized processors, multicore, cloud computing and more on the horizon.

The complexity and algorithmic communities are slow to catch up. With some exceptions, we still focus on single-core single-thread algorithms. Rather we need to find good models for these new technologies and develop algorithms and complexity bounds that map nicely into our current computing reality.

Wednesday, April 22, 2015

The New Oracle Result! The new circuit result! which do you care about?

You have likely heard of the new result by Ben Roco, and Li-Yang on random oracles (see here for preprint) from either Lance or Scott or some other source:

Lance's headline was PH infinite under random oracle

Scott's headline was Two papers but when he stated the result he also stated it as a random oracle result.

The paper itself has the title

An average case depth hierarchy theorem for Boolean circuits

and the abstract is:

We prove an average-case depth hierarchy theorem for Boolean circuits over the standard basis of AND, OR, and NOT gates. Our hierarchy theorem says that for every d2, there is an explicit n-variable Boolean function f, computed by a linear-size depth-d formula, which is such that any depth-(d1) circuit that agrees with f on (1/2+on(1)) fraction of all inputs must have size exp(nΩ(1/d)). This answers an open question posed by Hastad in his Ph.D. thesis.
Our average-case depth hierarchy theorem implies that the polynomial hierarchy is infinite relative to a random oracle with probability 1, confirming a conjecture of Hastad, Cai, and Babai. We also use our result to show that there is no "approximate converse" to the results of Linial, Mansour, Nisan and Boppana on the total influence of small-depth circuits, thus answering a question posed by O'Donnell, Kalai, and Hatami.
A key ingredient in our proof is a notion of \emph{random projections} which generalize random restrictions.

Note that they emphasize the circuit aspect.

In Yao's paper where he showed PARITY in constant dept requires exp size the title was

Separating the polynomial hiearchy by oracles

Hastad's paper and book had titles about circuits, not oracles.

When Scott showed that for a random oracle P^NP is properly in Sigma_2^p the title was

Counterexample to the general Linial-Nissan Conjecture

However the abstarct begins with a statement of the oracle result.

SO, here is the real question: What is more interesting, the circuit lower bounds or the oracle results that follow? The authors titles and abstracts might tell you what they are thinking, then again they might not. For  example, I can't really claim to know that Yao cared about oracles more than circuits.

Roughly speaking the Circuit results are interesting since they are actual lower bounds, often on reasonable models for natural problems (both of these statements can be counter-argued), oracle results are interesting since they give us a sense that certain proof teachniques are not going to work. Random oracle results are interesting since for classes like these (that is not well defined) things true for random oracles tend to be things we think are true.

But I want to hear from you, the reader: For example which of PARITY NOT IN AC_0 and THERE IS AN ORACLE SEP PH FROM PSPACE do you find more interesting? Is easier to motivate to other theorists? To non-theorists (for non-theorists I think PARITY).

Thursday, April 16, 2015

PH Infinite Under a Random Oracle

Benjamin Rossman, Rocco Servedio and Li-Yang Tan show new circuit lower bounds that imply, among other things, that the polynomial-time hierarchy is infinite relative to a random oracle. What does that mean, and why is it important?

The polynomial-time hierarchy can be defined inductively as follows: ΣP= P, the set of problems solvable in polynomial-time. ΣPi+1 = NPΣPi, the set of problems computable in nondeterministic polynomial-time that can ask arbitrary questions to the previous level. We say the polynomial-time hierarchy is infinite if ΣPi+1 ≠ ΣPi for all i and it collapses otherwise.

Whether the polynomial-time hierarchy is infinite is one of the major assumptions in computational complexity and would imply a large number of statements we believe to be true including that NP-complete problems do not have small circuits and that Graph Isomorphism is not co-NP-complete.

We don't have the techniques to settle whether or not the polynomial-time hierarchy is infinite so we can look at relativized worlds, where all machines have access to the same oracle. The Baker-Gill-Solovay oracle that makes P = NP also collapses the hierarchy. Finding an oracle that makes the hierarchy infinite was a larger challenge and required new results in circuit complexity.

In 1985, Yao in his paper Separating the polynomial-time hierarchy by oracles showed that there were functions that had small depth d+1 circuits but large depth d circuits which was needed for the oracle. Håstad gave a simplified proof. Cai proved that PSPACE ≠ ΣPi for all i even if we choose the oracle at random (with probability one). Babai later and independently gave a simpler proof.

Whether a randomly chosen oracle would make the hierarchy infinite required showing the depth separation of circuits in the average case which remained open for three decades. Rossman, Servedio and Tan solved that circuit problem and get the random oracle result as a consequence. They build on Håstad's proof technique of randomly restricting variables to true and false. Rossman et al. generalize to a random projection method that projects onto a new set of variables. Read their paper to see all the details.

In 1994, Ron Book showed that if the polynomial-time hierarchy was infinite that it remained infinite relativized to a random oracle. Rossman et al. thus gives even more evidence to believe that the hierarchy is indeed infinite, in the sense that if they had proven the opposite result than the hierarchy would have collapsed.

I used Book's paper to show that a number of complexity hypothesis held simultaneously with the hierarchy being infinite, now a trivial consequence of the Rossman et al. result. I can live with that.

Tuesday, April 14, 2015

Baseball is More Than Data

As baseball starts its second week, lets reflect a bit on how data analytics has changed the game. Not just the Moneyball phenomenon of ranking players but also the extensive use of defensive shifts (repositioning the infielders and outfielders for each batter) and other maneuvers. We're not quite to the point that technology can replace managers and umpires but give it another decade or two.

We've seen a huge increase in data analysis in sports. ESPN ranked teams based on their use of analytics and it correlates well with how those teams are faring. Eventually everyone will use the same learning algorithms and games will just be a random coin toss with coins weighted by how much each team can spend.

Steve Kettmann wrote an NYT op-ed piece Don't Let StatisticsRuin Baseball. At first I thought this was just another luddite who will be left behind but he makes a salient point. We don’t go to baseball to watch the stats. We go to see people play. We enjoy the suspense of every pitch, the one-on-one battle between pitcher and batter and the great defensive moves. Maybe statistics can tell which players a team should acquire and where the fielders should stand but it still is people that play the game.

Kettmann worries about the obsession of baseball writers with statistics. Those who write based on stats can be replaced by machines. Baseball is a great game to listen on the radio for the best broadcasters don't talk about the numbers, they talk about the people. Otherwise you might as well listen to competitive tic-tac-toe.

Thursday, April 09, 2015

FCRC 2015

Every four years the Association for Computing Machinery organizes a Federated Computing Research Conference consisting of several co-located conferences and some joint events. This year's event will be held June 13-20 in Portland, Oregon and includes Michael Stonebraker's Turing award lecture. There is a single registration site for all conferences (early deadline May 18th) and I recommend booking hotels early and definitely before the May 16th cutoff.

Theoretical computer science is well represented.
The CRA-W is organizing mentoring workshops for early career and mid-career faculty and faculty supervising undergraduate research.

A number of other major conferences will also be part of FCRC including HPDC, ISCA, PLDI and SIGMETRICS. There are many algorithmic challenges in all these areas and FCRC really gives you an opportunity to sit in talks outside your comfort zone. You might be surprised in what you see.

See you in Portland!

Tuesday, April 07, 2015

Two cutoffs about Warings problem for cubes

  1. All numbers except 23 can be written as the sum of 8 cubes
  2. All but a finite number of numbers can be written as the sum of 7 cubes 
  3. There are an infinite number of numbers that cannot be written as the sum of 3 cubes(this you can prove yourself, the other two are hard, deep theorems).
Open: Find x such that:
  1. All but a finite number of numbers can be written as the sum of x cubes.
  2. There exists an infinite number of numbers that cannot be written as the sum of x-1 cubes.
It is known that 4 ≤ x ≤ 7

Lets say you didn't know any of this and were looking at empirical data.

  1. If you find that every number ≤ 10 can be written as the sum of 7 cubes this is NOT interesting because 10 is too small.
  2.  If you find that every number ≤ 1,000,000 except 23 can be written as the sum of 8 cubes this IS interesting since 1,000,000 is big enough that one thinks this is telling us something (though we could be wrong). What if you find all but 10 numbers (I do not know if that is true) ≤ 1,000,000 are the sum of seven cubes?
Open but too informal to be a real question: Find x such that
  1. Information about sums-of-cubes for all numbers ≤ x-1 is NOT interesting
  2. Information about sums-of-cubes for all numbers ≤ x IS interesting.
By the intermediate value theorem such an x exists. But of course this is silly. The fallacy probably relies on the informal notion `interesting'. But a serious question: How big does x have to be before data about this would be considered interesting? (NO- I won't come back with `what about x-1').

More advanced form: Find a function f(x,y) and constants c1 and c2 such that
  1. If f(x,y) ≥ c1  then the statement all but y numbers ≤ x are the sum of 7 cubes is interesting.
  2. If f(x,y) ≤  c2 then the statement all but y numbers ≤ x are the sum of 7 cubes is not interesting. 
To end with a more concrete question: Show that there are an infinite number of numbers that cannot be written as the sum of 14 4th powers.

Wednesday, April 01, 2015

Which of these stories is false

I would rather challenge you than fool you on April fools day. Below I have some news items. All but one are true. I challenge you to determine which one is false.

  1. Amazon open a brick and mortar store: Full story here. If true this is really really odd since I thought they saved time and money by not having stores.
  2. You may have heard of some music groups releasing Vinyl albums in recent times. They come with an MP3 chip so I doubt the buyers ever use Vinyl,but the size allows for more interesting art. What did people record on before Vinyl. Wax Cylinders! Some music groups have released songs on wax cylinders! See here for a release a while back by Tiny Tim (the singer, not the fictional character) and here for a release by a group whose name is  The Men Will Not Be Blamed For Anything.
  3. An error in Google Maps lead to Nicaragua accidentally invading Costa Rica. Even more amazing--- This excuse was correct and Google admitted the error. See here for details.
  4. There was a conference called Galileo was wrong, The Church Was Right for people who think the Earth really is the centre of the universe (my spell checker says that `center' is wrong and `centre' is right. Maybe its from England). I assume they mean that the sun and other stuff goes around the earth in concentric circles, and not that one can take any ref point and call it the center. The conference is run by Robert Sungenesis  who also wrote a book on the topic (its on Amazon  here and the comments section actually has a debate on the merits of his point of view.) There is also a website on the topic here.  The Catholic Church does not support him or his point of view, and in fact asked him to take ``Catholic'' out of the name of his organization, which he has done. (ADDED LATER- A commenter named Shane Chubbs, who has read over the relevent material on this case more carefully than I have, commented that Robert Sungenesis DOES claim that we can take the center of the universe to be anywhere, so it mine as well be here.   If thats Roberts S's only point, its hard to believe he got a whole book out of it.) OH- this is one of the TRUE points.