Thursday, July 31, 2014

How NOT to bet

(True Story, but names are changed for no good reason.)

Alice, Bob, and Carol are betting on who will be the Republican Nominee in 2016.

  1. Alice bets on Jeb Bush. Alice's reasoning is that in the past the Republicans always end up with a moderate who they think can win. NOTE: From Alice's prediction you can tell NOTHING about her politics.
  2. Bob bets on Paul Ryan. Bob's reasons  that in the past the Republicans always end up with someone who they are familiar with, quite often someone who has run before. Normally this might be someone who ran in the primaries four years earlier; however, none of Herman Cain, Michelle Bachman, Rick Santorum, Newt Gingrich, seem likely. Rick Perry is possible but seems unlikely. That leaves Paul Ryan (Curiosity- people who lose, even if its close, do not run again. Not sure why. So it won't be Romney.) NOTE: From Bob's prediction you can tell NOTHING about his politics.
  3. Carol bets on Rand Paul. Carol's reasoning is that the American Public is tired of Big Government and is ready for the Rand Paul Revolution! NOTE: From Carol's prediction you can tell EVERYTHING about her politics.
Carol is letting her opinion drive her betting. Can someone take advantage of this? YES- if someone is betting from sentiment or emotion or conviction rather than from facts, that gives you an advantage in a bet. This does not always work, but its a good bet. Now watch, I'll lose my bet to Carol (NOTE- I am Bob).
One full dollar!

I think I read that you are better off betting AGAINST a triple crown if a horse won the first two legs for the same reason- there will be emotion driving the bet FOR, and that increases the payoff for those betting AGAINST. But of course nothing is guaranteed.  And there are some cases where its just ridiculous to vote against (Secretariat  in 1973); however, to use my system you can't just skip one since you ``just know'' that there will be a triple crown. Its a long term strategy. And one could be wrong.  That's why its called gambling!

Are there other cases where long term betting with facts and against sentiment can give you an edge?

Sunday, July 27, 2014

The Change Problem and the Gap between Recreational and Serious Mathematics

In a prior post (here) I discussed the change problem: given a dollar, how many ways can you make change using pennies, nickels, dimes, and quarters (and generalize to n cents). Scouring the web I found either programs for it, the answer   (242), and answers like `is the coefficient of x^100 in ... . What I didn't find is a  formula for n cents. So I derived one using recurrences and wrote this up with some other things about the change problem, and I posted it on arXiv. (I have updated the paper many times after comments I got. It is still where it was, but updated, here.)

I then got some very helpful emails pointing me to the vast math literature on this problem. I was not surprised there was such a literature, though I was surprised I hadn't found any of it searching the web.
The literature didn't have my formula, though it had several ways to derive it (some faster than mine).

Why isn't the formula out there? Why couldn't I find the literature?

  1. The formula falls into a space right between recreational and serious math. I use a recurrence  but not a generating function. Recreational math just wants to know the answer for 100, so a program suffices (or some clever by-hand recurrences, which are also in my paper). Serious math people are happy to show that a formula CAN be derived and to refine that in terms of how quickly the coefficients can be found.
  2. There is a gap between recreational and serious math. In particular- the serious people aren't talking to (or posting on websites to) the rec people.
  3. Different terminologies. I was looking for ``change problems'' and ``coin problems'' and things like that when I should have been looking for ``Sylvester Denumerant'' or ''Frobenius problem''
For some areas Google (and other tools) are still not as good as finding someone who knows about your area. My posting on the blog got some people who know stuff's attention, and my posting on arXiv got one more person (who knew A LOT!).  I'm glad they emailed me comments and references so I improved the paper and cited the literature properly. But this seems so haphazard! Google didn't work,  mathstack exchange and other similar sites didn't work (that is where I saw the Rec people post but not get a formula). Is it the case that no matter how good Google gets we'll still need to find people who know stuff? Thats fine if you can, but sometimes you can't.

Thursday, July 24, 2014

Need some book reviewed- faster than usual

I have been the SIGACT NEWS book review editor since 1997. I have tried to get about 10 books reviewed per column. I have succeeded- perhaps too well! I have gotten so many reviews out that I only have six reviews left.

I DO have many books that could be reviewed, and that is where YOU come in!

List of books available for review: Here

Advice for reviewers: Here

LaTeX  Template for reviews: Here

ALL of my prior columns: Here

IF you want to review a book DO NOT leave a comment- just email me (ADDED LATER- EMAIL ME at gasarch@cs.umd.edu. Someone emailed Lance instead
which you shouldn't do.)
a set of books you want to review. I will pick out one for you--- it may be based
on what I've already got requests for.

Since it is summer  you are not as busy as the regular hear (hmmm- I am running an REU program AND teaching an intense High School Class, AND going to a security conference, and mentoring three high school students hoping for some more free lunches, so I am actually MORE busy) so summer is a GOOD time to review a book.

Deadline to ASK for a book: Request that you make your request BEFORE July 28 so that when I email in my column with the list-of-books-I-need-reviewed, it is accurate.

Deadline for Review: About two months after you get it, though this can be flexible.

Tuesday, July 22, 2014

The Burden of Large Enrollments

This week I'm at the CRA Snowbird conference, the biennial meeting of CS chairs and other leaders in the field. In 2012 many of the discussion focused on MOOCS. This year the challenge facing most CS chairs are booming enrollments in CS courses. A nice problem to have, but a problem nevertheless.

Last night we had a broad discussion about the burgeoning number of students. Ed Lazowska showed his NCWIT slides giving anecdotal evidence. It's too early to get a complete survey of CS departments but hardly anyone in the audience felt that enrollments were not going up by double (or triple) digit percentages. Not only an increase in majors, but a number of non-majors, minors or others, who take upper level courses.
At Georgia Tech it seems every engineering student wants to minor in CS.

We all have to deal with the increase in the short term. Depending on the institution, we have more classes, larger classes, enrollment caps, increases in instructors, PhD lecturers, teaching postdocs, automated grading, undergrad and MS TAs and having other departments take up some of the load. All of this could hurt the undergrad experience but with careful planning we can mitigate those effects.

What's driving the increase? Some suggested a change in the ubiquitous view of computing and the more positive opinion of geeks in society. More likely though, the driver is jobs, the great need for computer scientists and not just from computing companies, and the limited jobs for those graduating with non-technical majors.

Is the big shifts in enrollment a permanent change or just another part of the boom and bust cycle in CS? More than a theoretical questions, a permanent change makes for a better argument with deans and provosts to increase faculty sizes in computer science. There are some signs of "this time is different," such as the increase in non-majors in upper level courses, but it's an argument that will have a hard sell unless the increase is sustained for several years. One good argument: If you draw a linear line through enrollments since the 70's you get a consistent 10% yearly increase averaged over booms and busts.

Thursday, July 17, 2014

Elfdrive

New York Times, dateline June 11, 2019
With a near record-setting investment announced last week, the self-driving car service Elfdrive is the hottest, most valuable technology start-up on the planet. It is also one of the most controversial.
The company, which has been the target of protests across Europe this week, has been accused of a reckless attitude toward safety, of price-gouging its customers, of putting existing cabbies out of work and of evading regulation. And it has been called trivial. In The New Yorker last year, George Packer huffed that Elfdrive typified Silicon Valley’s newfound focus on “solving all the problems of being 20 years old, with cash on hand.”
It is impossible to say whether Elfdrive is worth the $117 billion its investors believe it to be; like any start-up, it could fail. But for all its flaws, Elfdrive is anything but trivial. It could well transform transportation the way Amazon has altered shopping — by using slick, user-friendly software and mountains of data to completely reshape an existing market, ultimately making many modes of urban transportation cheaper, more flexible and more widely accessible to people across the income spectrum.
Elfdrive could pull this off by accomplishing something that has long been seen as a pipe dream among transportation scholars: It has the potential to decrease private car ownership.
In its long-established markets, like San Francisco, using Elfdrive every day is already arguably cheaper than owning a private car. Elfdrive says that despite dust-ups about “elfpricing” at busy times, its cheapest service is usually 30 percent less expensive than taxis.
The competition between Elfdrive and its rivals is likely to result in more areas of the country in which self-driving becomes both cheaper and more convenient than owning a car, a shift that could profoundly alter how people navigate American cities.
Over the next few years, if Elfdrive do reduce the need for private vehicle ownership, they could help lower the cost of living in urban areas, reduce the environmental toll exacted by privately owned automobiles (like the emissions we spew while cruising for parking), and reallocate space now being wasted on parking lots to more valuable uses, like housing.
Paradoxically, some experts say, the increased use of self-driving cars could also spawn renewed interest in and funding for public transportation, because people generally use taxis in conjunction with many other forms of transportation.
In other words, if Elfdrive and its competitors succeed, it wouldn’t be a stretch to see many small and midsize cities become transportation nirvanas on the order of Manhattan — places where forgoing car ownership isn’t just an outré lifestyle choice, but the preferred way to live.
If you haven't guessed all I did was minor edits to a Farhod Manjoo piece on Uber. My point is that it all fits, there really isn't that much difference between a car driven by a magical AI elf or that driving by a real person, if it's all controlled by an app. You can start to see the effects of self-driving cars, the good and the bad, from what's going on now with ride-sharing services. The big difference with self-driving cars will be the complaints won't come just from the taxi drivers, but the truckers and delivery people as well.

Sunday, July 13, 2014

What to call the top and bottom part of (n choose k)

In my last post I asked for candidates for names for the top and bottom part of (n choose k) . Here are the candidates and my comments on them and ... the winner!

  1. Top part: Degree, Bottom part: Index. 
  2. Top part: Bino, Bottom part: Mial
  3. Top part: Numerator, Bottom part: Denominator
  4. Top part: Outcomes, Bottom part: Possibilities
  5. Top part: Binomerator, Bottom part: I've got nothing
  6. Top part: *, Bottom part: *
  7. Top part: Biponendo, Bottom part: Bividendo
  8. Top part: Choosand, Bottom part: choosee
  9. Top part: Set size, Bottom part: Subset size.
I leave out the explanations for these since one criteria is that they be self explanatory.
While choices 4,8,9 are tempting along those lines, the winner is

Numerator/Denominator

Why? One of the people who suggested it gave a pointer. The pointer actually went to calling the top and bottom part of the Legendre symbol Numerator and Denominator. And thats just it- there are several
other math things that have a top and bottom part. We could try to find a name for each one, OR
just use Numerator and Denominator for all of them. That seems to work. SO- next time you write a paper and need to refer to the top part of a bin coeff or a leg symbol or something else, use the terms
Numerator and Denominator.

CODA: `The Denominator' could be an Arnold Schwarzenegger character. A Math teacher by day, a crimefighter by night. Catchphrase: I'm going to put you under!

Wednesday, July 09, 2014

Is there a word for the top part of a binomial coefficient?

Consider the sequence:

x/y,  (x+1)/y, (x+2)/y, ..., (x+z)/y

one could say  in this sequence of fractions the numerators goes through z-1 consecutive numbers.

Consider the sequence

(x choose y), (x+1 choose y), (x+2 choose y),...,(x+z choose y)

one could say in this sequence of binomial coefficients the top-part goes through z-1 consecutive numbers.

Is there a better way to say this? That is, is there a term for the top-part of a binomial coefficient? Or for that matter the bottom part? I have not been able to find one on the web. Hence I propose a contest:

  1. Leave as a comment a proposed name for the top-part and for the bottom-part of a binomial coefficient.
  2. If you find a website that has some official name, leave that as a comment.
I am not sure if there will be a winner of what he or she will get. But if we can agree upon a term then we are all winners!

Monday, July 07, 2014

Favorite Theorems: Compressing the Local Lemma

Not only did Moser give a near optimal construction, but he did so in my favorite STOC talk of the decade.


The Lovász local lemma informally states that if you have a large set of events with limited dependence and they individually have a reasonable chance of happening, then there is a positive probability that they all happen. Moser focused on the special case of Boolean formula satisfiability: If we have a k-CNF formula φ where each clause shares a variable with at most r other clauses with r<2k/8 then φ is satisfiable. Note the result does not depend on the number of variables or clauses.

Moser gave this simple algorithm and a proof using entropy that I recognized as a Kolmogorov complexity argument and so excited me that I immediately wrote up the Kolmogorov proof as a blog post.

Moser's paper kickstarted a whole research agenda, with tight bounds using even simpler algorithms (though more complex proofs) and various generalizations. A whole tutorial day of STOC 2014 focused on the local lemma describing research that directly or indirectly came from Moser's most beautiful construction.

Wednesday, July 02, 2014

Four Centuries of Logarithms

I just returned from visiting my former student Rahul Santhanam in Edinburgh. The National Museum of Scotland has an exhibit on logarithms, first published in a book by John Napier in 1614.

Napier invented logarithms to make calculations like multiplication, division and exponentiation easier, using identities like log(ab)=log(a)+log(b). Logarithmic tables and slide rules came soon thereafter. Slide rules became the standard way to do mathematical computations for your typical scientist/engineer until reasonably priced calculators appeared in the late 70's. My chemistry teacher in 1980 taught us how to use a slide rule, even though we all had calculators, and I even (foolishly) tried using my father's slide rule on a chemistry test.

The exhibit struggled to find current uses for logarithms, mentioning only the Richter scale. In theoretical computer science we use logarithms all the time. Here's an incomplete list off the top of my head.
  • As part of a running time usually as a result of divide and conquer. Sorting, for example, takes Θ(n log n) comparisons.
  • The size of a number, pointer or counter. To count up to n requires log n bits of storage.
  • The representation of a small amount of space as in the complexity classes L and NL.
  • To capture entropy, coupon collector and other topics in probability and information.
  • Roughly captures the sum of 1/i or exactly capturing the integral of 1/x.
  • The inverse of exponential growth.
Thanks John Napier for the logarithm and making our lives just a little less linear.