Monday, April 30, 2012

Is Moore's Law Good for CS?

Roy Friedman writes a blog post on The Expected End of Moore’s Law is Good News for Computer Science.
For a long time, people got used to being lazy. If computers become twice as fast every 1.5 to 2 years, there is no point in investing much efforts in writing efficient code. If something does not run fast enough, simply wait for the next generation of Intel x86 and everything will be resolved. In particular, CPUs became fast enough that traditional programming languages and efficient data structures and algorithms were being abandoned in favor of high level scripting languages whose most sophisticated data structure is an associative array. Suddenly, every Joe-hoe could become a programmer developing sophisticated Web applications with no effort  – no need for hard earned computer science degrees anymore.
Let's ignore the issue as to whether Moore's law is really coming to an end (Moore's law has had 5-10 years left in it since Gordon Moore developed his law). Friedman misses the bigger point, an end of Moore's law will do great harm to Computer Science.

Friendman's mistake is to assume that people always have the same problems to solve. Suppose computers stopped getting more powerful fifteen years ago. Machine Learning as we know now would not have been possible. Nor would we have the advances in graphics, robotics and virtually every other area of computer science. A good deal of CS systems research goes into making these new machines even more efficient, secure and reliable.

Even in theoretical computer science, the advances in computer technology make our work stand out. As we try to tackled larger problems asymptotic algorithmic techniques start dominating the running time. While we can now brute-force solve NP-complete problems on moderate input sizes, as computer performance improves, so does the thirst for solving even larger challenges and the P versus NP problem becomes a harder monster to slay.

Sure Joe-hoe can wait a year or two for his application to run fast without new coding ideas. While he waits Sam-ham takes advantage of new technologies making Joe-hoe's applications out of date.

Without the increased power of computers, we all become hackers to make our programs run better and that is not computer science.

Friday, April 27, 2012

Hanan Samet wins Kanellakis Prize!

(
  1. Symposium on Turing in Princeton in about a month
  2. UMCP is doing a job search for a lecturer
  3. New York Area Theory Day
) The Paris Kanellakis Theory and Practice Award for 2011 went to Hanan Samet from University of Maryland at College Park. See here for the details. He works on data structures THAT PEOPLE ACTUALLY USE! The Kanellakis price honors Theoretical accomplishments that have had significant and demonstrable effect on the practice of computing. This raises the question: What theory accomplishments are
good candidates for the prize (that have not already won it)?

Thursday, April 26, 2012

I left Push Down Automata out of my class and learned some things!


This semester in the Ugrad Course titled Elementary Theory of Computation
(Syllabus: Reg Languages, CFLs, Computability Theory, P and NP) I decided to NOT
teach PDA's. I mentioned them and told the students they were equivalent
to CFG's but nothing more.

This lead to a NEW (to me at least) piece of mathematics!

Proving
X={a^nb^nc^n | n ∈ N}
is NOT a CFL is easy using the pumping theorem.
Consider
Y={w | number of a's, b's and c's in w is the same}
How do you prove Y is not regular?
Why don't you just intersect Y with a*b*c* to get X which
you know is not a CFL?
I hear you say.
AH- you need to know that CFL intersect REG is CFL.
If we had the PDA/CFG equivalence then this would be an
easy cross product construction. But now?
SO, we need a proof that CFL intersect REG is CFL
that ONLY uses CFL's. That is, DOES NOT use PDA's.
Here is a proof.
We do not believe it is new but have not been able to find a reference.
If you know a reference please comment.
(NOTE ADDED LATER- The commenters politely provided a reference
and I have put it into the paper.)

Another point of interest: How do you show that REG is a subset of CFL.

  1. Normally you would note that DFA's are just PDA's without a stack,
    hence every lang recognized by a DFA can be recognized by a PDA,
    and then use PDA/CFG equivalence. I could not use this.

  2. You could use the proof that any DFA language is a left-linear grammar
    (left linear only has productions of the form X-->aY)
    and this is nice since, in fact, they are equivalent.
    Here is a proof.
    I ended up not doing this but I will next year
    when I do the course.

  3. On the midterm I had the following question:

    Recall that if α is a regular expression then $(α) is the
    set of strings that α generates.
    Recall that if G is a CFG then L(G) is the set of strings generated by G.

    Prove the following by induction on the formation of a regular expression:
    For all regular expressions α there exists a Context Free Grammar G
    such that L(α)=L(G).
    You cannot use PDA's (Push Down Automata). (If you do not know what this is,
    do not worry.)

    I could only ask this BECAUSE they had not seen PDA's or Left Liner Grammars.
    Of the 40 students about 20 got it right.


  4. One of my students, Justin Kruskal, wondered how to go from a DFA directly to a CFG
    (recall that he had not seen left-linear grammars).
    It is interesting to see what untainted students come up with on their own.
    He came up with a proof
    which is
    here.
    It is a weaker result than the left-linear grammar equivalence, but its his!
Questions for YOU, the readers.

  1. What do you think of leaving out PDA's? (I may heed your advice
    next spring when I teach it again.)

  2. Is the proof I point to that CFG intersect REG is CFG, that only uses CFG's, new?
    If not please give a reference. A comment like Bill you idiot, this is a well known proof
    that I have neither a reference nor a website to point to.
    ,
    is not helpful. If you DO have a reference or website to point to you can call me whatever you like.

  3. Have you ever learned some new math be leaving something OUT of a course?

Monday, April 23, 2012

CS in the Sunshine State

As many of you've heard the Dean of Engineering at the University of Florida is planning deep cuts to the Computer and Information Science and Engineering Department (CISE) and focusing its mission solely on teaching. Here is the relevant part of her plan.
Roughly half of the CISE faculty would be offered the opportunity to move to Electrical and Computer Engineering, Biomedical Engineering or Industrial and Systems Engineering.  These faculty would continue to support the graduate and research mission in the Computer Engineering degree track.  The choice of which faculty and which departments will be made based on fit with the research program and with the receiving departments. Staff positions in CISE which are currently supporting research and graduate programs would be eliminated.  The activities currently covered by TAs would be reassigned to faculty and the TA budget for CISE would be eliminated. The faculty remaining in CISE would then focus their efforts on teaching and advising students in the existing Computer Science BS and MS degree programs, offered through both the College of Engineering and the College of Liberal Arts and Sciences.  Their assignments would change to reflect this new educational mission with sole focus on delivering quality education for students in these degree programs.  Any faculty member who wishes to stay in CISE may do so, but with a revised assignment focused on teaching and advising.
In other words Florida is eliminating core Computer Science research, something that makes no sense for a state flagship research university in this day and age. There is a website and petition protesting the move which has caused the Dean to respond. Perhaps because of geographical closeness, this was a big topic of discussion when I was down at Georgia Tech last week. The current and founding Deans of the College of Computing at GT wrote strong letters to the Florida president. The CRA has also expressed their concern.

Certainly the president and dean deserve much of the blame to allow the targeting of computer science at Florida. But as Steven Salzberg of Forbes points out, the real villains lie in Tallahassee with a governor and legislature that has cut funding for the school by 30% over the past six years.

Thursday, April 19, 2012

How important is the PROOF of Cooks theorem for my ugrad class?

I am teaching Automata theory this semester. The topics are basically (1) Regular Languages, (2) Context Free Languages, (3) Decidable, undecidable, and c.e. sets, (4) P, NP, NP-completeness.

Other things can be added like Primitive Recursive Functions, the arithmetic hierarchy, the polynomial hierarchy.

I am considering NOT proving Cook's Theorem. State it YES, say how important it is YES, but prove it NO. I am not sure if I want to do this. This post is an attempt to get some intelligent comments on this. My mind is not made up yet so this is I raise the question TRULY non-rhetorically. Some thoughts:
  1. The proof is coding a Turing Machine computation into a formula. It is interesting that you can do this. The details of it are not so interesting.
  2. The very good students (I would guess 5 out of my 40) will get it. The bad students never get it. I am oversimplifying--- and if this was a reason to not teach a topic I may not have much of a course left.
  3. The time spend IN CLASS on this is not so much- one lecture. The time spend in OFFICE HOURS on this re-explaining it is a lot. And its not clear that it helps.
  4. Showing them NPC reductions is more important and more interesting. This is what I would put in instead. I usually only get to do a few.
  5. Cook's Theorem is SO fundamental that the students should SEE a proof of it even if they don't understand it. but I am not a fan of they should see it even if they don't understand it.
  6. Cook's Theorem, like many things, you don't get the first time you see it, but you do the second, helped by having seen it once.
  7. Even if we now they mostly won't get it, we want them to know that we WANT them to get it.
  8. Students should know that the proof that SAT is NP-complete goes all the way back to the machine model (very few other NP-completeness proofs do). Is it good enough to TELL them this. I would DEFINE Turing Machines, SAY that you CAN code a computation into a formula, but not actually do it?
  9. I like to teach things that I can ask problems about. Cook's theorem IS okay for this- I usually show how to make a formula out of some TM instructions and leave as HW making a formula out of other TM instructions. Even so, not that much can be asked about it. The students have a hard time with this (Much more so than other parts of the course) so I wonder if its too much sugar for a cent.
  10. I teach this course a lot so I want to mix things up a bit. I realize this is not really a good reason for the students.
  11. An argument for: See how it goes! The decision I make this semester need not be what I always do.

Wednesday, April 18, 2012

Some Announcements

STOC early registration deadline is Thursday. This year STOC will have both tutorials and workshops and its second poster session. Hope to see you all there.

The Simons Foundation is offering Ph.D. Fellowships in Theoretical Computer Science. Deadline May 1.

Finally a big congratulations to Josh Grochow who defended his thesis yesterday at the University of Chicago. Josh was co-advised by Ketan Mulmuley and myself and he'll be joining Toronto as a postdoc next year. Josh is my seventh and last Ph.D. at Chicago.

Monday, April 16, 2012

ACM/IEEE Curriculum 2013

[STOC 2012 Early Registration Deadline Thursday]

On a roughly ten-year cycle, the ACM and IEEE Computer Society get together to create a list of core topics that every getting an undergraduate CS degree should know. The joint task force has a strawman draft proposal and would like your comments for a final report hopefully next year.

This version breaks the requirements into Core-Tier 1 and Tier 2 with the more critical material in Tier 1 and also suggests some elective topics.

In Algorithms and Complexity there is a combined 21 lecture hours of material for Tier 1 and Tier 2 that should comfortably fit into a single quarter long course though that seems aggressive given the list of material. That is built on 41 hours of "Discrete Structures" that includes basic combinatorics and proof techniques. It's a testament to the fundamental importance of theory that it holds more core hours than most other areas.

While I'm always wary of outside committees setting courses and topics, these reports do give guidance in what the "community" believes are important topics for all well-trained computer scientists to know. They can heavily influence curriculum in theoretical computer science especially at the too many CS departments that don't a significant theory group. So take a look at the draft and give your comments to the task force.

Thursday, April 12, 2012

Unique Golf Winners

In the Masters Golf Tournament held last weekend there were 55 players who had a final score between 10 under par and 9 above. By the pigeonhole principle one would expect many players to have the same score and in particular multiple players to have the best score. There was a tie for first that required a playoff.

But this is the exception, there were only four playoffs in the last twenty years of the Masters. In most years the Masters tournament has a unique player with the lowest score. Why?

I used Golf Tournaments to motivate the Mulmuley-Vazirani-Vazirani isolation lemma. If you have a collection of sets over {1,...,n} and choose random weights w1,...,wn from {1,...,r} let the weight of a set be the sum of the weights of its elements. The MVV lemma states there will be a unique maximum weighted set with probability at least 1-n/r.

It's a cool lemma with a short proof: you argue that with high probability each element i is either in all or none of the max weighted sets.

So how do I tie the lemma to golf tournaments? Actually I don't see a direct connection. But the spirit is the same.

Tuesday, April 10, 2012

I'll be on that list until the day I die. Or later.

How hard is it to get OFF of lists?
  1. Univ of MD at College Park (UMCP) Professor Carl Smith was an editor for JCSS before his death in 2004. I put together a memorial issue of JCSS in his honor that appeared in 2008. SO HOW COME HE IS STILL GETTING ISSUES OF JCSS AT UMCP IN 2012? Normally I would email JCSS about this, but whenever I do that they stop it for a while and then it starts again. So I am not going to bother.
  2. I recently got the following email:
    
    William Gasarch
    SIGACT News
    
    Hi William,
    
    The winners of Russia's prestigious Debut Prize are
    presently in the U.S.  The newest of the winning books
    is Irina Bogatryreva's Off The Beaten Track which
    contains her story and two others about hitchhiking in
    Russia.  Irina is also available for interviews as she
    speaks English fluently!
    
    Please let me know if you would like to receive a copy
    of this new and interesting novel.
    
    The Debut Prize winners will be visiting select cities
    (Washington, Boston and New York) during their stay and
    will also return in June for BEA.
    
    I look forward to your thoughts.
    
    Best,
    Shirley
    
Why did I get that? As SIGACT NEWS book review editor I am on lists of book review editor. Technology is sophisticated enough generate lists of book review editors. I suspect it is not cost effective to figure out which editor is appropriate for which types of books since email is free. I have sometimes gotten actual books in the mail that are not math or CS--- that seems like more of a waste of the companies money (though the Biography of Ted Kennedy that I got was a good read.)

Is this a problem? Overall yes since it costs society something (money? time?) to have people (even dead people) on lists where they shouldn't be. But there does not seem to be an incentive on anyone who could fix it to fix it.

Thursday, April 05, 2012

Guest post on Tom Cover by Yoav Freund

(Details on registration and student travel awards for the 2012 IEEE Conference on Computational Complexity in Porto available at here.

New York Area Theory Day information here.)



Tom Cover passed away recently. Today we have a guest post about him from Yoav Freund which was written with help from Gabor Lugosi, Nicolo Cesa-Bianchi, Avrim Blum, Rob Schapire, Sanjoy Dasgupta and Manfred Warmuth.

Tom Cover was a leading light, a mentor and an inspiration. I was shocked to hear of his passing away. It was only a few weeks ago that I saw him in the recent ITA conference in San Diego. As usual, engrossed in conversation with a young colleague, gesticulating with his long arms to punctuate his words.

I was first introduced to Cover's work when, as a graduate student in UCSC, I read the classical book of Cover and Thomas on Information theory. Lured by the psychedelic cover I was enchanted by the lucid examples (betting in the horse races) and the elegant math. Like me, many computer scientists were introduced to the deep connections between probability, information, coding and prediction. It is a standard reference for information theory in computer science.

Beyond the book, Cover wrote wrote many of the foundational papers in information theory and its applications. A very partial list of his contribution include his work with Peter Hart on the convergence of the nearest neighbor classifier, His work on the capacity of linear classifiers preceded Vapnik and Chevonenkis. His work on finite memory algorithm for deciding whether the bias of a coin is rational or irrational is a beautiful mind-bender. His papers on gambling and portfolio management started much of the work on online learning.

Speaking of online learning, Cover took part in the first workshop on online learning that was held in UC Santa Cruz at 1992. Cover, with his typical sense of humor, suggested we call the workshop something for nothing. Later we had a bon-fire by the beach, I will always remember Cover as he was that day, a brilliant scientist, a wonderful communicator, a pursuer of beauty and of fun.

Annotated Bibliography

There's also his work on "Nearest neighbor pattern classification" (mid-60s, with Hart).

Yes, that's a very important one. He had two other early influential papers on nearest neighbor classification:

Thomas M. Cover. Rates of Convergence for Nearest Neighbor Procedures. Proceedings of The Hawaii International Conference on System Sciences, Honolulu, Hawaii, January 1968. Thomas M. Cover. Estimation by the Nearest Neighbor Rule. IEEE Transactions on Information Theory, IT-14(1):50--55, January 1968.

Even before Vapnik-Chervonenkis, he calculated the VC shatter coefficient for half spaces:

Thomas M. Cover. Capacity Problems for Linear Machines. Chapter in the book Pattern Recognition, Thompson Book Co., 1968. ed. by L. Kanal.

Then, of course, he was among the pioneers of online learning:



Thomas M. Cover. Universal Gambling Schemes and the Complexity Measures of Kolmogorov and Chaitin. Stanford University Dept. of Statistics Technical Report No. 12, October 1974.

Robert M. Bell and Thomas M. Cover. Competitive Optimality of Logarithmic Investment. Mathematics of Operations Research, 5(2):161--166, May 1980.

Thomas M. Cover. Log Optimal Portfolios. Chapter in Gambling Research: Gambling and Risk Taking, Seventh International Conference, Vol 4: Quantitative Analysis and Gambling, ed. by W.E. Eadington, 1987, Reno, Nevada.

Paul H. Algoet and Thomas M. Cover. Asymptotic Optimality and Asymptotic Equipartition Properties of Log-Optimum Investment. The Annals of Probability,, 16(2): 876-898, 1988.

Robert Bell and Thomas M. Cover. Game-Theoretic Optimal Portfolios. Management Science, 34(6): 724-733, June 1988.

He was among the first to use Blackwell's approachability for online learning:

Thomas M. Cover and David H. Gluss. Empirical Bayes Stock Market Portfolios. Advances in Applied Mathematics, (7):170-181, 1986 This one was a big seminal paper for structural risk minimization: Andrew R. Barron and Thomas M. Cover. Minimum Complexity Density Estimation. IEEE Transactions on Information Theory, 37(4): 1034-1054, July 1991.

Avrim Blum says: I don't know if the following had a big influence on COLT but I remember Ron Rivest describing this result which totally blew my mind: COVER, THOMAS M (1973). On determining the irrationality of the mean of a random variable. Ann. Math. Statist. 1862-871. COVER & HIRSCHLER (1975). A finite memory test of the irrationality of the parameter of a coin. Annals of Statistics, 939-946

You are observing a sequence of flips of a coin of unknown bias p, and after each flip must predict if p is rational or irrational. Shows that with prob 1 you can do this making only a finite number of mistakes....

Tuesday, April 03, 2012

We Think Like Our Fields

Have lunch with economists and they'll talk about the decision making processes and equilibriums of everything from politics to sports. Computer scientists worry about the misuse of information. Systems people create policies that look like computer programs. Mathematicians internally create models of society and derive consequences from it. I find myself treating the world as one large computational process. Isn't it?

It's not just academics, I've seen the same kind of thinking from lawyers, doctors and business owners. In one sense this isn't a bad thing. It's hard to reason about the world and using our tools and talents to makes sense of it all helps us cope with society. We all have our own religion whether it be spiritual or scientific. 

The problem comes when we just hang with our own, in our social networks both in person and online. We start believing that everyone else thinks the same way we do. Then we have difficulty working with people outside our community and that just isolates us even more. 

Sunday, April 01, 2012

Trick question or Stupid question?

April fool days Blogs or Columns may

present an open problem as closed,

present a closed problems as open,

make us wonder if it's a joke or not, (Lance STILL isn't telling!), or

make us question some well held beliefs.

This year I will take a different route- similar to what Lipton did here, which is NOT try to fool you, but present a post ABOUT tricky or foolish or odd things.

When is a trick question a stupid question? I give some examples and my opinions. But I want your opinions- are these questions (with the answers I give) TRICK QUESTIONS or STUPID QUESTIONS?
  1. How many states are in the United States? Answers and Commentary
  2. What is the least common Birthday in America? Answers and Commentary
  3. What is the degree of (x-a)(x-b)(x-c)... (x-z)? Answers and Commentary
  4. What US state has the eastern most point in America? Answers and Commentary
  5. What is the least common first names for a U.S. President? (The answer is a tie.) Answers and Commentary
  6. Which two numbers come at the end of this sequence?
    2,4,6,30,32,34,36,40,42,44,46,50,52,54,56,60,62,64,x,y
    Answers and Commentary. See also this blog entry on the problem.
  7. An expert on tracking animals notices one day that there are bear tracks and rabbit tracks converging on the same spot. He can estimate that they converged at 6:00PM with a margin of error of 17 seconds. Hence they must have been there at the same time. He also notices that from the spot they converged only rabbit tracks can be found. HOW CAN A RABBIT EAT A BEAR FOR DINNER? Answers and Commentary
  8. There are 6 blue socks, 8 white socks, and 10 black socks in a drawer. How many do you need to take out of the drawer in order to get a pair. Answers and Commentary
  9. What is the square root of nine? Give the answer as an anagram of the question. Answers and Commentary