Monday, January 25, 2010

When is a theorem really proven?

One of the comments on this blog pointed out correctly that for a theorem to be accepted by the community is not a Eureka Moment. It is a social process. The author of the comment was probably alluding to an excellent article by DeMillo and Lipton that I blogged about here. I highly recommend reading the article that blog points to.

We often say or write things like
  1. In 1978 Yao proved that if you store n elements of U in a table of size n then (for large U) membership requires log n probes (Ref FOCS 78).
  2. In 1981 Yao proved that if you store n elements of U in a table of size n then (for large U) membership requires log n probes (Ref JACM 81).
Personally I try to include both the conference and the journal version in a citation. That solves the citation problem. However, what does prove mean? It could be that Yao proved this in 1977. The exact time/day/year when something is proved is not that well defined. The original post was about the Fund Lemma where the paper was written in 2004 and accepted in 2009. What was its status in 2006? Proven or unproven? Is there a better way to say these things?

  1. In his FOCS 1978 paper (see also Journal version in JACM 1981) Yao proved the following:
  2. In his JACM 1981 paper (original in conference version in FOCS 1978) Yao proved the following:
These both sound awkward. I am willing to live with using proved even though its not quite right. Does someone have an alternative?

Was Time Magazine right to say that the Fund Lemma was one of the big Science Stories of 2009 even though the paper was written in 2004? I think so- acceptance in a journal seems like a good time to declare YES THIS IS TRUE. And I do not know an alternative.

20 comments:

  1. How about

    "In (Ref JACM 81) Yao showed that if you store n elements of U in a table of size n then (for large U) membership requires log n probes."

    It's true that he did show it in the cited reference, and the wording above makes no claim that 1981 is when he first conceived the proof.

    ReplyDelete
  2. Formally you are correct,
    but the sentence you have seems to indicate that there is no earlier work.
    If the JACM 81 Ref says within it

    Earlier Version in FOCS 78

    then that might work.

    ReplyDelete
  3. Fund Lemma was proved in 2008. The result in 2004 is for a special case.

    ReplyDelete
  4. Bill, you are confusing things. Date of publication has never been interpreted as date of proof. This is why journals typically include submission and revision dates in the final form.

    If for some reason the date of proof was in dispute, historians would look at the historical record.

    Now, quite arbitrarily (but not necessarily unjustly) we have decided that date of submission should be the sole criterion to resolve priority claims (e.g. Smale, Stallings and Wallace on the Poincare conjecture) except when we don't (e.g. Levin and Cook).

    ReplyDelete
  5. The question "When is a theorem really proven" is mildly transgressive ... and the coverse question "When is a theorem really forgotten" is so transgressive, that I know of only one article that has been written upon it (BibTeX appended) ... and *that* article was humorous.

    And yet, isn't it hard to imagine achieving a satisfactory understanding of either question, without achieving an understanding of both?

    And don't these issues become particularly evident on a peaceful, prosperous planet whose population is 10^10? ... assuming, say, one-in-10^4 citizens is an active professional mathematician (the US prevalence).

    The point is that *if* these two questions both have good answers, then (potentially) those two answers are similarly interesting.

    ---------

    @article{Author = {Colin Adams}, Journal = {Mathematical Intelligencer}, Number = {4}, Pages = {15--18}, Title = {The Adventures of Robin Caruso}, Volume = {31}, Year = {2009}}

    ReplyDelete
  6. FYI, Ngô Bao Châu has just accepted an appointment at Chicago.

    Prominent mathemetician accepts appointment at UChicago

    ReplyDelete
  7. Alternatively, one could write that 'In 1981 Yao logically/formally/cogently/convincingly/strongly argued/put forward the argument/reasoned that ...', with appropriately chosen qualifications reflecting how one assessed the reasoning.

    ReplyDelete
  8. was there any specific need for mentioning Yao ?

    ReplyDelete
  9. Bill, you started a post about what the community considers "has been proved or not", and then you managed to dedicate the rest of the post to an almost entirely uninteresting semantic question ("should we cite date of proof or date of publication") rather than focusing on wonderful aspect of the question you started with. This is very typical for this blog: more often than not it deals with gray, technocratic, superficial issues rather than focusing on interesting issues. I can't help but thinking this is related to the blog's authors taking blogging as a "Chore" which should be tended to every few days; this approach produced a constantly-active, but extremely superficial, blog. This is not necessarily a bad tradeoff, but it is often upsetting.

    I'd like to raise two issues related to the original question that you asked:

    1. how does the theory community deal with mistakes in already-published research (hint: very unsatisfactorily).

    2. the community very often has misconceptions about what is actually known; to what extent is this common? (I think it's way more common than we think) And what can we do to deal with this?

    About #1: What happens when published papers have mistakes? Does the fact of the mistake propagate through the community? Is the community doing enough to acknowledge mistakes and fix the resulting problems (papers citing false results, etc)? Can we deal with this issue better? I know of many papers that have mistakes, and the fact of the mistake has been published in other papers, but not enough is done to prevent people from reading the original papers and thinking they are correct. This is a horrible situation, and something should be done about it. I know it's hard to make sure than every published paper is 100% correct, but even when we already found a mistake, we do way too little to amend it.

    About #2: I haven't found a good example to make my case with, so I'll make it with two examples and if someone thinks of a better one, please comment.

    Example 1: Suppose I say in a paper that "sorting takes Omega(nlogn) time in the worst case". Is this true? To what extent? How much should I diclaim? What details should I include? Obviously, I should say that this is true in the comparison model. Should I say anything else or is that enough info? If the problematic nature of this issue is not apparent, look at example 2.

    Here is a more involved and interesting (but more niche)

    Example 2: Suppose I say that approximating the longest increasing subsequence in a stream takes Omega(sqrt(n)) space. Should I say that this has only been proved in the deterministic model? Should I give more details? There are many variants of the streaming model; should I say which one I mean? Will future generations understand my claim, considering that new variants of models are routinely introduced, so "streaming" might not mean in 10 years what it means now?

    In other words, I'm pointing out that the community has *many* misconceptions about what is known, often thinking that more is known than was actually proved, because of sloppy phrasing of results, often not including vital information. And I claim that we're all guilty of such sloppiness and that to a large extent, this "sloppiness" cannot be avoided, since it's an inherent part of scientific behavior, and especially of the constantly-changing nature of science (in particular because definitions and terminology are always in flux). And I'm asking if there's a way to handle this situation better than we do now, potentially leading to less mistakes and to more open questions and more progress.

    ReplyDelete
  10. Why be so specific?

    "In the late 1970s, Yao proved that if you store n elements of U in a table of size n then (for large U) membership requires log n probes [FOCS 78, JACM 81]."

    ReplyDelete
  11. 1. how does the theory community deal with mistakes in already-published research (hint: very unsatisfactorily).

    About ten years ago some friends went through this process and it proved (no pun intended) to be very difficult to get the paper published.

    First it was seen as bad form to point out that the original result was incorrect. Second, referees kept on rejecting the paper since "it had been done before" in spite of a clear discussion of the paper about the flaws in the original result.

    I'd like to believe things are now a bit better and that people are nowadays more receptive to that type of paper, but maybe that's just the optimist in me talking.

    ReplyDelete
  12. 1. how does the theory community deal with mistakes in already-published research (hint: very unsatisfactorily).

    I recently submitted a conference paper that pointed out a bug in a side result in one of my own papers (published at a different conference several years previously) and then proceeded to fix the bug using entirely new techniques. The paper was rejected on the grounds that it would be unethical to let me present the same result twice.

    ReplyDelete
  13. The paper was rejected on the grounds that it would be unethical to let me present the same result twice.

    It is nice to see that we care more about proper paper counting than about truth in science.

    Yipee!

    ReplyDelete
  14. Jeffe: aren't bugs usually pointed out in corrigenda as opposed to entirely new conference submissions? Why not take that route? Put the fixed proof on your website, in a separate pdf next to the old paper's link, and also post the fix to arXiv as a corrigendum.

    ReplyDelete
  15. Umeshism: If none of your papers have bugs, you're not working on hard enough problems.

    ReplyDelete
  16. I think I was on a/the PC that dealt with Jeffe's submission w/ the error and I recall some discussion about it.

    I'm interested to know what the community in general thinks a committee should do in this situation. I agree with the poster who thinks that errors to a conference version should be addressed in the journal version.

    In the same conference, there was also a submission whose main result was a theorem that had already been stated as a theorem in a stoc paper (by an overlapping subset of the authors on the current submission) but whose details had been "left to the full version". I guess the details were more complicated than expected, and it was turned into an entirely new submission. The thing about this submission, however, was that it did not actually say that the theorem had already been stated elsewhere--a reviewer had to point that out. That paper was rejected basically on the grounds that it was unethical, but appeared in another conference soon after. Is this ethical or not? It would be good to have consistent guidelines about such situations.

    ReplyDelete
  17. there was also a submission whose main result was a theorem that had already been stated as a theorem in a STOC paper (by an overlapping subset of the authors on the current submission) but whose details had been "left to the full version".

    Right there is your problem. Conferences shouldn't be publishing results that are not contain accompanied by full proofs.

    In Jeffe's case the reasons why a corrigenda might not be the most appropriate choice are that (1) it was a side result and (2) the corrected version introduces new techniques which are of independent interest.

    ReplyDelete
  18. To reply to the last few anonymice (anonymoi?):

    I did post a corrigendum on my web page, exactly as Anon 2:34 suggested. (It was too late to correct the paper in the journal version; it had already been in print for four years.)

    If our correction had been straightforward, or if all we had to say was "Oops, sorry", then we wouldn't have submitted the paper to a conference. (Been there, done that, bought the T-shirt, ate the crow.) But the correction required substantially new techniques that we believed were interesting and nontrivial enough in their own right to justify a conference publication.

    What I found most frustrating (and surprising) was that the new contributions of our paper were apparently not judged on their own merits because we admitted that we were fixing a bug. The reviews we got back were generally positive, which suggests that the paper might have been accepted if I had (unethically!) removed my name from the author list.

    On the other hand, I can't really be all that shocked when my sloppiness in the previous paper has unforeseen repercussions. (Whine, whine, whine. Ever since the dog ate the baby, it's been Get ridda the daawwg! Get ridda the daawwg!)

    All water under the bridge. Our new paper was accepted to the same journal as the original buggy paper. All's well that ends well.

    ReplyDelete
  19. This thread exactly proves my point (from comment #8). I ask how does the community deal with mistakes, and people start bitching about conferences not accepting their papers.

    Seriously, I don't care about your paper counts. I think the following is way more important: There are dozens of papers written every year (maybe even hundreds) that use wrong results that are known to be wrong. Can the community please find a way to deal with this satisfactorily?

    ReplyDelete
  20. Since papers are mostly read online now a days, it seems like there should be an append functionallity given to authors to make notes of mistakes.

    ReplyDelete