Friday, September 30, 2011

Bibliographies

Lots of buzz about Princeton's new policy that prevents faculty from giving away the right to publish papers on their own web pages. Never seen faculty so happy to have their rights restricted.

Princeton is behind the game for Computer Science.  All the major publishers I use explicitly give the right to authors to publish their own versions of papers on their web pages including ACM, IEEE, Springer and Elsevier. Even before, publishers rarely went after authors' pages. I have posted FOCS and Complexity papers for years and IEEE just updated their policy last November. Also see this discussion about ACM giving authors links so their visitors can freely download the official ACM versions of their papers.

We have these rights so EXERCISE THEM. No computer scientist has any excuse not to maintain a page of their papers with downloadable versions.

It gets harder to maintain these pages. I have to keep an old bibtex-to-html system running and at some point I should redo the whole page. Sites like DBLP and Google Scholar let people find my papers easily without my intervention. But if I want anyone to see all of my papers I have little choice but to keep the page going.

Wednesday, September 28, 2011

A candidate for a new Millenium Problem

In a prior post I wrote that the Erdos-Turan Conjecture should be a Millennium problem. Today I am going to (1) suggest a generalization of the Erdos-Turan Conjecture should be a Millennium problem instead, and also suggest (2) a far harder problem be another candidate. For some history see here. (This is an excerpt from my still-being-written book on VDW stuff- so if you find a mistake or suggestions please comment or email me.)

Recall the poly VDW theorem:
Let p1,...,pk ∈ Z[x] such that pi(0)=0. Let c ∈ N. There exists W=W(p1,...,p_k;c) such that for any c-coloring of {1,...,W} there exists a,d such that a, a+p1(d), ..., a+pk(d) are all the same color.
Much like VDW's theorem was before Gowers result, (1) there is a proof that gives bounds that are not primitive recursive (Walters proof) (2) there is a density-type theorem that does not give good bounds, (3) there is a proof by Shelah that gives primitive recursive bounds, but they are still quite large.

We make the following conjecture which we hope will lead to better bounds on the poly VDW numbers.
CONJ: Let p1,...,pk ∈ Z[x] such that pi(0)=0. If Σx ∈ A 1/x diverges then there exists a,d such that

a, a+p1(d), ..., a+pk(d) are all in A.
  1. The case where all of the polynomials are linear is often called the Erdos-Turan Conjecture. It is still open.
  2. The following is a subcase that is orthogonal to the Erdos-Turan Conj: If Σx ∈ A 1/x diverges then there exists two elements of A that are a square apart.
  3. Green and Tao showed that the set of primes have arb. large AP's. Then Tao and Ziegler showed the following: Let p1,...,pk ∈ Z[x] such that pi(0)=0. There exists a, d such that

    a, a+p1(d), ..., a+pk(d) are all primes.

    SO the CONJ is true for a particular set of interest.
We also pose a question which is more interesting but likely much harder:
If A is a set then let dA(n) =|A ∩ {1,...,n}|/n. The lim supn → ∞ dA(n) is the upper positive density. We will be interested in the function dA(n).
QUESTION: Let p1,...,pk ∈ Z[x] such that pi(0)=0. Find functions e1(n) and e2(n) (not that far apart) such that e1(n) ≥ e2(n), and the following occurs:
  1. If for almost all n, dA(n) ≥ e1(n) then there exists a,d such that a, p1(d),...,pk(d) ∈ A.
  2. There exists A such that, for almost all n, dA(n) ≥ e2(n), and there is NO a,d such that a, p1(d),...,pk(d) ∈ A.
For the case of 3-AP's the following are known:
  1. If for almost all n dA(n) ≥ (log log n)5/log n then A has a 3-AP. (See this Tom Sanders paper.)
  2. There exists a set A such that, for almost all n dA(n) ≥ 1/n\sqrt(log n) and A has no 3-APs (See Michael Elkin's paper for a slightly better result that I didn't have the energy to typeset.)
These two functions are NOT close together. So even in the case of 3-AP's we do not know the answer. I want to know the e1,e2 for ALL finite sets of polynomials with 0 constant term. We've got our work cut out for us.

It is my hope that progress on the question will lead to better bounds on some poly VDW numbers. A similar question DID lead to better bounds on the VDW numbers.

Monday, September 26, 2011

Moneyball

I saw Moneyball over the weekend. This movie gives a fictionalized account of the how the general manager of the 2002 Oakland A's used the right kind of statistics to build a strong team with a low budget. This article on the GM Billy Beane gives a nice follow-up to the movie.

A bit surprising one can get a good movie about the math in baseball. Moneyball was based on the book by Michael Lewis with a screenplay co-written by Aaron Sorkin who also wrote the screenplay to the Social Network. Sorkin writes nerdy things well.

Moneyball is really a computer science movie. It's not about the computers themselves which play a small role, but its about taking a large amount of data and deriving the conclusions that help make the crucial decisions in developing the team.

You can also see the difference in computer science over the last decade. At the time of Moneyball, people would try many statistical models and test them out. These days via Machine Learning we give the computer the data and the results and the computer determines the right models.

Oddly enough Billy Beane's actions led to an even greater separation between the large and small market teams. The statistical ideas that Beane pushed have been adopted by the other teams. Now we've hit an equilibrium so the teams that spend more win more as well.

Oddly the New York Times ran an editorial piece Not-So-Smart Cities by Greg Lindsey yesterday arguing that cities shouldn't rely on these kinds of statistics for planning because of a failed project from the 60's. Sounds like Lindsey needs to see Moneyball.

Friday, September 23, 2011

Mahaney's Theorem

Bill has a lot of posts where he questions whether to teach Mahaney's theorem in a graduate complexity class. Since it is one of my favorite theorems and most of you young folk won't see a proof in your complexity classes, I'll give it here. This proof is not the original of Mahaney but based on a paper by Ogihara and Watanabe.

Mahaney's Theorem: Let c be a constant and A be set such that for all n, A has at most nc strings of length n. If A is NP-complete then P=NP.

Proof: We define the left-set of SAT as follows: Let B = { (φ,w) | φ has a satisfying assignment a with a ≤ w in lexicographic order}. We'll restrict ourselves to w that are (not necessary satisfying) assignments for φ.

Assume φ is satisfiable and let a' be the lexicographically smallest satisfying assignment. We have (φ,w) is in B iff w ≥ a'.

Since B is in NP by assumption B reduces to A via some function f computable in polynomial-time, (φ,w) is in B iff f(φ,w) is in A.

Fix φ and n and let m=nk bound the number of strings in A of any length that f(φ,w) could query. Pick w< w1 < w2 < … < wm+1 evenly spaced from each other.

Let zi = f(φ,wi) for each i. Note that if zi is in A then zj is also in A for all j ≥ i.

Case 1: zi = zfor some j > i. We know a' cannot be between wi and wj.

Case 2: All the zi are distinct. There are only m elements in A so z1 is not in A and a' cannot be between w0 and w1.

Either way we have eliminated a 1/(m+1) fraction of the possible assignments.

We repeat the process choosing the wi equally spaced among the remaining possibilities and eliminate another 1/(m+1) fraction of those assignments. We continue O(mn) times until we narrow down to a set S of m+1 possible assignments.

If φ is satisfiable then a' is in S so at least one assignment in S and will satisfy φ. If φ is not satisfiable then none of the assignments in S satisfy φ.

By trying all the assignments of S and seeing if they satisfy φ, we get a polynomial-time algorithm to determine if φ is satisfiable. Since Satisfiability is NP-complete we have P = NP.

Wednesday, September 21, 2011

Where do theorems go to die?

(Joint post by Bill Gasarch and Daniel Apon)

Recently I (Daniel) reviewed Dexter Kozen's Theory of Computation (it was AWESOME). You can find the review here. Many of the topics were standard but some we (Bill and Daniel) suspect are not covered in any complexity class (well... maybe in Dexter's). This DOES NOT detract from the book, however we now raise the question:
Where do Theorems go to die?
In some eras there are some theorems that are taught in EVERY basic grad complexity courses Then all of a sudden, they are NOT taught at all. (Both the EVERY and the NOT are exaggerations.)
  1. The Blum Speedup theorem: In the 1970's EVERY comp sci grad student knew it. In the 1980's EVERY theorist knew it. In the 1990's EVERY complexity theorist knew it. In the 2000's (that sounds like it means 2000-3000) EVERY... Actually NOBODY knew it, except maybe students who took Dexter's class. I doubt Blum teaches it anymore. Blum's speedup theorem was proven before Cook's theorem when people were still trying to get complexity right. Blum has since said Cook got it right! Even so, Blum's Speedup theorem should be in the preface to God's book of algorithms (See here) as a warning that there might not be an optimal algorithm.
  2. The Complexity of Decidable theories (e.g, Presburger, WS1S, S1S, S2S). We all know that by Godel's theorem the theory of (N,+,times) is undecidable. There are some subtheories that are decidable. However, the decision procedures are complete in various rather high up classes (WS1S is provably not primitive recursive). This material I (Bill) learned from Michael Rabin (who himself proved S2S undecidable) in 1981 but my impression is that it is not taught anymore. Was it ever widely? We think that students should know the STATEMENTS of these results, because these are NATURAL problems (Bill did the capitalization) that are complete in classes strictly larger than P. (See the second conversation here for comment on that.) One of us thinks they should also know the proofs.
  3. omega-Automata. This is used to prove S1S decidable and we've heard it is used in fair termination of concurrent programs. Sounds like it should be more well known. But it is rare to find it in a complexity cousre. It may well be taught in other classes. (I (Bill) touch on it in undergrad automata theory.)
  4. Jon Katz is pondering not teaching Ladner's theorem!!! The students should surely know the result (YES, and stop calling me Shirley) however, should they know the proof?
  5. Spare Sets: Bill Blogged about no longer doing Mahaney's Theorem Karp-Lipton is still used A LOT, but Mahaney's theorem is not really used at all. Daniel thinks Mahaney's Theorem is AWESOME, however Daniel is bright eyed and bushy tailed and thinks ALL of this stuff is AWESOME.
  6. Computability Theory: Friedberg-Muchnik, Arithmetic Hierarchy, Analytic hierarchy, Recursion Theorem. This is like Blum Speedup- this material used to be better known to complexity theorists but is hardly taught to them anymore. It IS of course taught in courses in computability theory. But such courses are taken by complexity theorists far less often then they were. When I (Bill) tell someone there are c.e. sets that are not decidable and not complete! they say Oh, just like Ladner's theorem. Its the other way around, Ladner's result came far later.
  7. Part of this is that Complexity theory has changed from Logic-Based to Combinatorics-Based (see post on CCC Call for papers) We don't teach Blum Speedup (and other things) any more because we have better things to teach. However, are there results that the students should know even if the fields they come from are dead? YES, and Ladner's theorem is one of them.


Who decides such things? Textbook writers for one. A Proof Theorist who worked on lower bounds for Resolution told me he hopes that Arora-Barak's book would include this material. He even said If it does not, the field will die. Is this true? If so then Arora and Barak have more power than they realize.

Monday, September 19, 2011

Conferences Again

Lots of conference news and views going around. Let's sort it out.

FOCS early registration deadline is September 29th, fast approaching. Deadline for applying for student support is this Thursday September 22. Apply even if you don't have a paper there and take the opportunity to attend one of theory's most important meetings.

Also the STOC 2012 submission deadline is still November 2. There was a mistaken deadline listed on a theory events site.

The SODA 2012 accepted papers (and links to Arxiv) are out. Program Committee Chair Yuval Rabani explained the PC process. Seems pretty much along the lines of PCs I've been apart of. I wished they did penalize people who had poorly written papers, otherwise there's little incentive to write well. I'm also not a big fan of the "pet paper" idea, people tend to choose papers of their friends and colleagues.

Michael Mitzenmacher posted about the number of good papers not accepted and whether SODA should accept more (an issue at most of our major conferences). In this blog, Sami Khuller guest posted about whether it makes sense to have SODA in Japan. There are a handful of SODA accepts with primarily Japanese and other Asian authors but the vast majority of the authors are US-based.

Some of the comments on Khuller's post talked about having a virtual conference. How about this idea: We don't bother meeting and just collect the accepted papers into a single volume. We can give this idea an innovative name like a "journal".

In this month's CACM article, Moshe Vardi complains about the quality of conference talks. (I like the "journal  that meets in a hotel" quote but it didn't originate with me). You see this at STOC and FOCS too, people giving a talk only to other specialists in their field instead of aiming for a general audience.

Most of you readers know my position on conferences, that we need to get conferences out of the ranking business in CS so conferences can instead play their role of bringing community together and escape from the explosion of conferences just to give people who were rejected from another conference a place to submit their papers.

Friday, September 16, 2011

Happy Constitution Day

September 17th officially is known in the United States as "Constitution Day and Citizenship Day" but is celebrated today because the 17th this year falls on a weekend. I  have nothing but love for the our Constitution. Not only does the Constitution and its amendments provide the great freedoms we have in America but it also shows how to balance states of different sizes with a strong central government. The EU could do worse than by following its example.

What bothers me is a law that Robert Byrd snuck into an omnibus spending bill in 2004.
Each educational institution that receives Federal funds for a fiscal year shall hold an educational program on the United States Constitution on September 17 of such year for the students served by the educational institution.
With a few exceptions like the military academies, US universities are either run by the states, municipalities or are private institutions. It defeats the whole point of the Constitution for the Federal government to be dictating to US universities what educational programs must be held when.

Most universities do the minimum. Northwestern just points students to a website with Constitution related information and videos.

To help all of you celebrate Constitution Day here is the great preamble as I learned it as a kid.



Wednesday, September 14, 2011

Conventions in Math- just to make the rules work or more?

  1. Why is
    a1/2 = sqrt(a)?
    true? To make the rule
    ax+y=ax a y
    work out.
  2. Why is
    (∀ x ∈ ∅)[P(x)]
    true? To make the rule
    (∀ x ∈ A ∪ B)[P(x)] iff (∀ x ∈ A)[P(x)] AND (∀ x ∈ B)[P(x)]
    work out.
  3. Why is the sum over an empty index set equal to 0 and the product over an empty index set equal to 1? Same reason- it makes various math laws work out.
For the second and third item on the above list I would say its not JUST to make some rule work out, it also makes sense. But the first item, that a1/2 = sqrt(a), seems to only make sense in terms of making a rule work out and not for any other reason. I have no objection to the rule; however, if you have a REASON other than it makes the rules work for this convention, please leave a comment.

Monday, September 12, 2011

Imagine

Looking back, it's pretty amazing how new technologies like Google, cell phones, Facebook and Twitter have changed society in completely unexpected ways. Let's play a game with an up and coming technology that will surely change society but it is not clear yet how.

We already have the technology for autonomous cars, cars that can drive themselves with no needed changes to roads or other cars. By 2020, this technology will be cheap enough to be built into most new cars.

  1. When will society and laws be accepting of autonomous driving? Will we ever be able to get rid of the drivers seat?
  2. What will cars look like when there is no driving seat?
  3. Will there be a major elimination of jobs of taxi and truck drivers? Is this a bad thing?
  4. Will we still need parking right near our destination? Will driveways disappear?
  5. Once most cars are autonomous will they be networked for better coordination? Will we see the elimination of now unneeded street lights and signs? Will there be privacy concerns?
  6. These cars will stop or serve around obstacles. How do we stop pedestrians from just walking out in front of cars knowing they will stop?
  7. Will people mostly own cars or just rent one that happens to be close by?
  8. Suburbia exists because of the automobile. Will an autonomous car change the nature of suburban life?
  9. Most importantly, what is the big societal change that will occur that we can't imagine right now.

Friday, September 09, 2011

Guest Post on Conference Locations

(Samir Khuller Guest Post.)

On Conference Locations:

I recently looked at Orbitz for fares to Japan for SODA 2012 in Jan. The round trip fare from DC to Tokyo is close to $1700. Together with the train to Kyoto, we are looking at a $2000 travel cost to attend SODA for 3-4 days. Together with the registration and hotel, I am sure the cost will exceed $3000. I wonder how many people will be able to attend this conference? In times of declining budgets, we should make these decisions after careful thought since our travel budgets are only shrinking. In 2012, all conferences are likely to be expensive to attend -- SODA in Kyoto, CATS in Melbourne, STACS in Paris, Complexity in Porto, ESA in Slovenia, ISMP in Berlin, ICALP in UK, SWAT in Finland, LATIN in Peru, SIGMETRICS in UK, FST&TCS in India, not to mention all those exciting Bertinoro and Dagstuhl meetings. At least STOC is in NYC!

I am sure that the same dilemma is faced by people in the far east when we hold conferences in the US. However, I will be curious to know what the numbers look like. Are we going to reduce costs for 25 students who otherwise might not be able to attend SODA because its not in Japan (I hope this NOT the case, and the numbers look much better)? However, at the same time we might be making the cost prohibitively high for 100 students who could have attended the conference, but are not going due to the high cost. Wait, we did this once. We had FOCS in Rome! According to this blogpost it looks like only 172 people registered for FOCS 2004. Given that most likely 100 of the attendees were authors, the drop in attendance of non-authors is by a factor of 50% since FOCS most likely gets close to 230 attendees.

I am for the argument that once in a while its not bad to move a conference around to help people attend who normally might not; but we could explore other ways of helping such people attend. Once we move a conference to a place where not much local population will attend, its a problem (FOCS 1991 in Puerto Rico). SODA in Israel or Germany makes more sense to me since a large part of the algorithms community is from those places. One way to help defray the costs is to use part of the conference funds to help provide travel support for people whose cost to attend would be too high. Co-locating meetings might benefit us more (FCRC, ICALP-STOC 2001 in Crete). If we want to maximize interaction among people we should aim to have one large meeting as opposed to lots of small ones - the ISMB and ISMP conferences do this very well. More edges in a clique of 1000 nodes than 10 cliques of 100 nodes each. ALGO in Saarbrucken makes sense to me. High density of researchers, several meetings are combined into one along with ESA. Frankfurt is easy to get to from a lot of places in the world (except for the folks in Australia, NZ and Hawaii), and there is great train service from the airport to Saarbrucken.

One of the cheapest conferences I attended was the SIAM Conf. on Discrete Math at the University of Toronto campus. Very low registration cost and we could stay on campus in the dorms for $20/day. Having looked into (and having organized) conferences recently, I know that the dinner can cost $100/person, and the coffee break $20/person. Do we really need to have academic conferences at such large hotels and hard to reach places? Why not have a conference that encourages participation, as opposed to one that discourages it? Even I would not mind a conference in space, lets see if NSF would approve "foreign travel" for that one.

I am going to have to start a new US "regional" conference, that I can afford to send my students to! It will be held on a university campus and the reg fee will be $100/student; and it will be cheap to get to. At least for the years when SODA is not in the US, such a meeting might be a success.

NOTE: I have nothing against Kyoto and Rome, they are among my favorite places in world.

Wednesday, September 07, 2011

Next in the sequence

When I posted on sequence problems here some people said that they did not have unique answers. Many problems FORMALLY do not have a unique answer, but common sense and simplicity lead to a unique answer. This recently happened on Jeopardy. The topic was Next In The Series. I'll give the questions here and pointers to the answers (actually the answers here and pointers to the questions since Jeopardy is an answer-and-question game.) Do you think some of these have non-unique solutions? If so, what are the other solutions? Are those problems unfair? As always I ask nonrhetorically. (my spell checker wanted me to put a space between non and rhetorically. I disagree.)
  1. FA, SOL, ... NEXT
  2. In the US Army: First Lt, Captain, ... NEXT
  3. XXXVIII, XXXIX,... NEXT
  4. FEDERAL HOLIDAYS: Memorial Day, Independence Day, ... NEXT
  5. Orange, Yellow, (Wavelength around 510 nanometers)... NEXT

Tuesday, September 06, 2011

Knowing some History is a good thing- Why?

(FOCS registration is open: here. Note that there are tutorials on Sat Oct 22 and the conference talks are Sun Oct 23-Tues Oct 25.)

(Guest post by By Jeffery D. Stein, Chairman, IT History Society (info@ithistory.org), but first a related post by me.)

POST BY ME:

  1. Often you find that the origin of your field is from a different field. Ramsey's paper where he proved (what is now called) Ramsey's theorem was actually a paper in logic, though he did say that his combinatorial lemma may be of independent interest. Knowing what he was working on expands your horizons.
  2. If you study some history and then look around at the present world you will see some things in a different light. For example, if you study the history of Group Theory you realize that they didn't just write down some axioms and see where they led- they had actual applications in mind (e.g., showing the quintic had no solution in radicals). The axiomatic approach is fairly recent.
  3. When teaching (say) cardinality it is good to know that this concept was once controversial and some mathematicians disagreed with it. Hence be patient with your students. I tell them that this concept was troubling and some of the controversy around it.
  4. You can pick up some factoids of interest (and if you learn more about them they can become facts). I read an interview with Sheila Greibach (early Formal Lang Theorist) and, in passing, she mentions that any r.e. set can be written as the intersection of two context free languages. I never knew that! The other direction- the intersection of two context free Languages is an r.e. set is easy, but good to know for a HW or Exam question. (NOTE ADDED LATER- a commenter pointed out, correctly, that this cannot be correct. I will check what she actually wrote later.)
  5. The items above are actually about the history of the IDEAS and not the people. Knowing something about the people can be interesting, but is likely less useful for research and teaching. If you disagree I would love to hear a respectful counterargument.
SO, how can we LEARN history? Today's GUEST POST is about some resources for this for the history of IT.

GUEST POST: Introducing an IT Teaching and Research Resource

Guest Post by Jeffery D. Stein, Chairman, IT History Society (info@ithistory.org)

In 2007, the IT History Society was formed. The Society is dedicated to informing IT companies about the value in preserving their history, helping archivists to be more effective in their work in preserving IT history, and most importantly being a reference point for the many international places of computing history information.

The Society wants to assist educators, students of information technology, and researchers in learning more about the history and background of the information technology industry, an industry that has had a significant effect on mankind in the past seven decades. It has nearly 700 international institutional and individual members (no charge to be a member). Institutional members include IBM, HP, Intel, the Smithsonian Institution, Computer History Museum, Charles Babbage Institute, MIT, Caltech, Hans Nixdorf Museum, British Library, Stanford Silicon Valley Museum, Deutsches Museum, IEEE History Center, UK National Archive, Hagley Museum, and more. Individual members include historians, computer scientists, and people who have worked in the industry from various countries. Currently the Society has many online databases; but, two in particular may be of great value for teaching information technology and research:
  1. IT Historical Resource Sites Database. over 400 and growing every day, sites that have historical information about the information industry.  This entire database is completely indexed and searchable, which can be a beneficial aid in targeted search and research.
  2. IT Honor Roll is database of over 800 names and growing, discussing individuals who have made a noteworthy contribution to the information technology industry.
Other information technology resources from the IT History Society are:
  1. Calender of upcoming IT Historical and Archival events
  2. Research links and tools to aid in the preservation of IT history
  3. Over 1,000 Technology Quotes
  4. An active Blog with discussions about historical IT events and the people behind them
  5. A Social Network of IT history professionals, archivists, and hobbyists.
The Society is also in the process of creating three more databases about: (1) All information technology companies both past and present (2) All information technology software created, both past and present (3) All information technology hardware created, both past and present

The Society feels that these valuable resources can be of great benefit to information technology professors, teachers, assistants, researchers, and students. All databases are works in progress and each database has links for the IT community to add and grow the entries of each database. The Society is a non-profit educational and research organization.  It does not charge for membership or the use of its information.  The IT community supports our operations through donations to our 501 (c) (3) non-profit foundation. Please visit this link for further information.

Friday, September 02, 2011

The Anti-Privacy Generation

A physicist I knew refused to fly on small commuter planes. He knew what could go wrong and he was sure they weren't safe. In fact flying even on small planes is statistically safer than driving a car.

I thought about this story while I was reading Blown to Bits by Hal Abelson, Ken Ledeen and Bill's advisor Harry Lewis. The book is all about how the information revolution has put all our personal stuff out there. Knowing how computers and the Internet work can make one paranoid about information and this is why privacy is always a big issue among computer scientists and tech workers. But then I finally gave up on the book when I realized they gave very few examples of people who actually came to any harm from losing their privacy.

I've seen many a crypto talk talk about a situation where someone's personal information comes back to haunt them when they run for public office. But we live in a society that values openness. Obama didn't hide his illegal drug use, he talked about it in his autobiography and it didn't hurt his campaign. Anthony Weiner resigned from Congress not because he tweeted an inappropriate picture, but because he lied about it. Bill Clinton was impeached and Nixon resigned not for their actions but for their coverups.

Being open has its positive effects, it allows search engines, recommender systems and even people to tailor their behavior to your needs. We can imagine easily, as computer scientists, scenarios where loss of privacy has disastrous effects.  But your chances of running into such problems are about as high as being in a plane crash.