Thursday, November 11, 2010

A Problem with Wikipedia and the Next Best Thing to Winning a Godel Prize

(There are TWO theory day events in NY this semsester: Nov 11, 2010 and Nov 12, 2010. The first one has likely already started as you read this.)

I searched for myself on Wikipedia and I found out something scary. Not about me. About Wikipedia.
  1. One of the hits was to the page Godel Prize Winners. Since I don't recall winning a Godel prize this surprised me. The links to the Godel-prize winning papers Proof Verification and the Hardness of Approximation Problems by Arora, Lund, Motwani, Sudan, Szegedy and Probabilistic checking of proofs: a new characterization of NP by Arora and Safra both point to the copies of these papers on MY PCP website. This is truly frightening. I plan to keep the link up (I suppose now I have to); however, Wikipedia has such a problem with Link Rot that they even have an entry on it! For now. Why did they use my link for those papers? Was it the only copy of the paper that was free online? Doubtful- the authors surely have a copy on their websites. My PCP website is NOT well known: I whipped it up for a seminar. I never intended it to be stable or referenced by anyone else. (NOTE to Ryan Williams- I'll be runing a seminar on circuit lower bounds next semester. Please put your recent AWESOME paper someplace free and easy to find so that they point to your copy, not mine, when you win a Godel Prize.)
  2. One of the hits was on the page Recursively inseparable sets. Two sets A and B are rec. insep. if they are disjoint and there is no decidable set C that contains A and is disjoint from B. Here is a quote from the entry:
    Let φ be the standard indexing of the partial computable functions. Then the sets {e | φe(0) converges and equals 0} and {e | φe(0) converges and equals 1} are recursively inseparable (Gasarch 1998, p. 1047).
    This is NOT my result- I think it's folklore. They reference my Survey of Recursive Combinatorics which states the result without proof or reference (my bad). They don't even point to the (free online) paper-they point to the math review of it. (If you know who proved that result then let me know. I plan on changing this entry by correcting the mis-attribution, giving a correct attribution if I have one (else attribute folklore), and including a proof. If you don't know who proved it then you can always look it up on Wikipedia. Oh. Don't!)
  3. One of the hits was on the page The Coppersmith-Winograd algorithm for matrix multiplication in O(n2.376) time. The link to their paper points to my applications of Ramsey Theory website (the algorithm uses 3-free sets which is a concept from Ramsey Theory). This pointer makes some sense since my applications-of-Ramsey website is well known and it seems to be the only free online copy. Even so, seems fragile.
  4. One of the hits was on the page Graham's Number. The link to the paper Ramsey's Theorem for n-Parameter Sets by Graham and Rothchild points to my van der Warden's Theorem website (not to be confused with my applications of Ramsey Theory website). This is very odd since (1) there is a well known website of Ronald Graham's papers and (2) my vdw website is NOT well known.
I will at some point FIX some of these entries.

The above points to both the strength and weakness of Wikipedia. On the one hand, some of these links are fragile and there are likely better ones out there. On the other hand, for some of them they found a link that worked and went with it. If they had waited around for a better or less fragile link then the entry would not have been made.

One lesson: If you find something online that you want an ecopy of, download it! It may not be there next time. (I made the same point here.)

13 comments:

  1. There is no weakness in Wikipedia. It works because people like you spot mistakes in entries relevant to their interests and fix them; it converges very fast. The mistakes that stay there for a long time are in entries that nobody cares about.

    ReplyDelete
  2. I’m not certain I completely understand your message. Anyway, thank you for checking some of the off-site links on Wikipedia and finding them — as far as I can tell — in working order.

    As for the result you talk about under item 2: The reference to your writing does not attribute the result to you. It merely makes the result verifiable. Wikipedia is an encyclopaedia, not a scientific journal; the purpose of references is to make statements verifiable, not to mete out academic credit. So you should not remove the reference and call the result “folklore,” because nobody else can then check the truth of the statement, nor find an authority that has called the statement true (in this case, you).

    Of course, if you can find an even better reference than (Gasarch 1998, p. 1047), then by all means add it.

    ReplyDelete
  3. WRT (2), Wikipedia encourages not citing the original result: http://en.wikipedia.org/wiki/Wikipedia:No_original_research#Primary.2C_secondary_and_tertiary_sources
    At least that's how I understand it.

    ReplyDelete
  4. The phenomenon of evanescent references is not unique to Wikipedia, or even the modern era.

    An extreme example was described recently on the weblog Entertaining Research in a post titled The case of the curious reference.

    The Landau-Lifshitz-Gilbert (LLG) equations describe the interaction of quantum magnetic moments with a local dissipative environment. Using the arxiv server's full-text search, we find there presently are 668 on-line preprints relating to the LLG equation ... and the rate at which new LLG articles appear is accelerating year-by-year.

    The ubiquitous interest in the LLG equation arises because almost any modern laboratory experiment and/or mechatronic apparatus---whether classical or quantum, macroscale or nanoscale---contains dozens of devices whose internal magnetodynamics is described by the LLG equation.

    But when we investigate the origins of the LLG equation, we find that they are shrouded in mystery. The original citation is frequently given as "T.L. Gilbert, A Lagrangian formulation of the gyromagnetic equation of the magnetic field, Phys. Rev. 100 (1955) 1243" (227 citations).

    Yet that 1955 citation does not exist ... for the very good mathematical reason that the state-space manifold S_2 has the wrong topology to be a cotangent bundle manifold ... such that Gilbert's claimed Lagrangian formulation, when worked-out in detail, is blocked by a topological obstruction.

    As it turns out, the LLG reference is to a conference-proceeding abstract, that in turn is associated to Gilbert's never-published 1955 thesis, which in Gilbert's own recent assessment, contains "a number of errors and misconceptions" that (in essence) are associated to the previously-mentioned topological obstruction.

    In summary, the LLG equation is physically correct and mathematically well-posed, but it cannot be derived by the (oft-cited) Lagrangian methods. For details, read the (highly entertaining) account on Entertaining Research.

    Can similar comedies-of-error happen in mathematics/TCS? Absolutely. Does the web make such comedies easier? Hmmmm ... probably not ... because the web makes it easy to publicize these mistakes .. and to cite historical examples of how readily even egregious errors can become embedded in STEM culture.

    ReplyDelete
  5. A person of your skill level would do more good for the world by doing actual research/writing, rather than worrying about wikipedia articles. The last thing the site needs is even more voluntary unpaid slave labor. About the links, they aren't meant to be about original discoverers of theorems but rather about verifying that the thing in question wasn't made up out of thin air. Ironically according to wikipedia ideology, a newspaper article about a theorem, written by some unqualified reporter, is a "more desirable" link than the actual paper the theorem comes from.

    ReplyDelete
  6. Wikipedia has scripts that spider the links on the site. If you take down your website it'll get logged and somebody would find an alternative source. Is there any reason to assume the author page has better longevity than yours? There's not much prevention you can do besides the archival work going on in the web.

    Regarding Wikipedia's preference against primary sources, they're allowed "with care, because it is easy to misuse them". Mathematical proofs are an edge case where you can verify the veracity directly within the proof. Contrast with, say, a physics paper where the data may have been fabricated. Thus the appearance in a news article at least suggests that people have been made aware of this paper, and if it seems outlandish have had a chance to respond (contrasted with an article in a little read publication that essentially nobody has ever been made aware of).

    ReplyDelete
  7. Oh dang, I hate the "You should go back to your research (ie. hermit's cave), leave wikipedia to us idiots" thing. Wikipedia is a useful tool for mathematicians / computer scientists / etc. for the same reason that it is useful to other people. So please fix the mistakes; the rest of us sane people will thank you.

    ps. I know of at least one fields medalist who has contributed a lot to wikipedia.

    ReplyDelete
  8. Please do edit the entry with the misleading citation to indicate properly that is is not your result: http://en.wikipedia.org/wiki/Wikipedia:Be_bold

    However, please do not remove the citation to your survey unless you find a more appropriate citation to replace it with. Even then, it may be good to keep both the primary source and a link to your survey. Citations are good, citations to surveys especially so.

    Not everyone who edits Wikipedia articles (even on technical subjects such as this) is competent to write or review proofs, and indeed proofs should only be included when they help improve the reader's understanding of the subject, not merely because they verify that some statement is true. So instead everything that is not just a trivial calculation should be backed up by a pointer to the literature, regardless of whether that pointer goes to the paper that deserves original credit for the statement.

    ReplyDelete
  9. Wikipedia is full of abusive editors, administrators and vandals. It is a sad thing, despite of its use. If they followed their own policies, things would be OK. However a couple of biased editors can do a lot of harm, especially when they are given authority that is too often in proportion to their incompetence. A good article is often replaced by a mediocre (or worse) edits by arrogant nobodies with surplus of free time and lack of intelligence. It is like a public toilet, that can be useful in case of urgent diarrhea, but you have to endure a lot of suspicious mess - using it is like sitting on a warm toilet seat, you can't help but wonder who was there before.

    ReplyDelete
  10. @last anon, probably one of the best comments i have come across so far.
    i like the public toilet analogy.

    very nice.

    ReplyDelete
  11. Anonymous says: I know of at least one fields medalist who has contributed a lot to wikipedia.

    An amazingly and gratifyingly large fraction of the wikipedia entries relating to differential geometry, algebraic geometry, and dynamical flows are outstandingly well-written.

    I have often wondered whether there might be an anonymous, modern-day "wikiBourbaki" / "Scarlet Pimpernell" / "Wonder Woman" / "Lone Ranger" that is keeping an eye on these entries.

    ReplyDelete
  12. The recursive inseparability result probably goes back to Kleene, Rosser, Novikov and maybe Trakhtenbrot.

    BTW the sets as given don't seem to be disjoint. Simpler to use

    [e: phi_e(e) converges to 0]

    [e: phi_e(e) converges to 1]

    ReplyDelete
  13. Ambrosiac.

    THANKS- I have made the correction.
    It was MY copying Wikipedia incorrectly,
    not their fault.

    Do you have a REFERENCE for any of those folks having proven this?

    ReplyDelete