Monday, January 24, 2011

Why My Kids Trust Wikipedia

Guest post from Annie and Molly Fortnow

Our teachers used to tell us not to use Wikipedia because anybody can edit it and therefore it isn't trustworthy. A couple of years ago we decided to test it out. [Not with my knowledge - Lance]

We went to the Cow page but it was locked showing that some subjects can't be edited. We then proceeded to the Grapes page which was unlocked. So we added to the bottom: "Grapes are good. Nerds are cows." We immediately got a message popping up on the screen that said something like "That's an inappropriate remark. It's being deleted. You are getting a warning."

Now we know that if someone edits Wikipedia with something silly it will always be edited back. So, now we trust Wikipedia and use it often. Every time one of our friends asks us why we use Wikipedia we tell this story and everyone always believes us. Now all our friends trust Wikipedia too.

1. The reason that you should not trust Wikipedia is that NO ONE claims that what is there is TRUE, what is written there is stated in some "reliable" source but that does not make it true.

Your test is completely inadequate. The real situation is completely different from adding obviously false/incorrect statements. I can go to a page and add a false sentence which an arbitrary reader cannot recognize to be false with high probability. Say I add to the page about Zero-Knowledge Proofs a section about Witness Hiding and state that they are not closed under asynchronous composition. What is the expected time that someone will notice that it is false? (No, I haven't done it there, I have tested it on some other page.) The probability of correctness of a statement on Wikipedia depends on how many people know about its truth/falsity.
If the number of such readers is relatively small then the possibility of the statement being false is higher.

The situation is even worse for topics where there is no objective truth.

2. Wikipedia? Didn't these kids use to solve major open questions in TCS?
Man, kids these days.

3. I would also recommend this (wikipedia!) page: http://en.wikipedia.org/wiki/Reliability_of_Wikipedia

On numerous occasions I had to explain to people how wikipedia works. As a F/OSS enthusiast myself, Linus' law is all the proof I need for casual use of wikipedia. If it is important, I double-check with a textbook or even better a paper, perhaps to retrieve all the necessary details, as I would if I had retrieved the information from any other encyclopedia.

Personally, I use wikipedia in 3 major ways: First, as a certificate to my own knowledge. If I am not completely sure about a fact, e.g. if a subset in a theorem is proper or not, checking wikipedia and getting the same answer as the one I suspected is useful if I don't want to get into too much detail.

Second, I can use it to read summaries on various subjects. By skimming through the article I can tell if this is what I am looking for and use wikipedia as a basis for more advanced search. Most of the time it provides me with the proper keywords for a search engine query or with references to papers that are fundamental in the particular subject.

Finally, sometimes my professors use native (Greek) terminology instead of the English one. Wikipedia can help me find the appropriate English terms, especially if there is an entry in the country-specific wikipedia, so that I can then research the English literature on the subject, which is of course richer.

4. Wow, Anonymous, way to miss the joke. Go read the last two sentences again.

5. Aha! So perhaps this post is a parable ... a follow-up to the previous post "What is a breakthrough? Lets have an intelligent discussion!"

Pursuant to Lance and GASARCH's post, I've taken to looking up the MathSciNet reviews of seminal articles in mathematics ... and it's remarkable how commonly breakthroughs at first pass unrecognized even by the most august reviewers ... for example, the highest praise that Andre Weil could bring himself to offer regarding Samuel Eilenberg's and Saunders Mac Lane's Natural isomorphisms in group theory (1942, review 0007421 (4,134d)) was that their ideas were "likely to be helpful".

Perhaps two take-home lessons, from *both* posts, are "think for oneself" and "trust, but verify" ... regardless of whether one is reading Wikipedia or the academic literature! :)

6. Compared to the usual completely neutral and unevaluative tone of Math Review, "likely to be helpful" is strong praise.

7. David asserts: Compared to the usual completely neutral and unevaluative tone of Math Review, "likely to be helpful" is strong praise.

Lol ... David, your assertion inspired me to wonder: How many AMS Math Reviews contain the word "breakthrough" in the field "Review Text"?

As a hint, here's Google ngram chart for the phrase "mathematical breakthrough."

Perhaps folks would care to guess the "breakthrough number" (without looking)?

8. For the benefit of readers of Computational Complexity who don't have access to AMS' Math Review, appended is a link to a screenshot of the "Math Review breakthrough number."

For maximal fun, please consider posting your prediction of the "Math Review breakthrough number" before clicking to see the answer.

9. why should we believe your story

10. @John Sidles

To be fair, the Google ngram viewer link you posted has been set to "smoothing of 3", which tells a very different story than "smoothing of 1." Taking either very seriously would of course presume that Google ngram viewer is a good way to view trends in this sort of data... I would have to suppose it is not.

11. The grapes page appears to be the victim of multiple attempts by vandals. Vandalized pages can often be auto-detected.

12. There are two different problems (among many I would guess) with Wikipedia.

First, incorrect information can be slipped into a page (on purpose or not) and not be noticed for a while.

Second, once incorrect information is in a page it is sometimes difficult to remove because ignorant white-knights keep restoring it.

13. Lance, why the disclaimer? They didn't actually do it. Was your disclaimer an attempt to make the lie seem real?

14. I trust a wikipedia page on first reading about as much as I trust a random textbook from the library. Which, to be honest, depends on what kind of deadline I'm working on.

15. I would like to thank Annie and Molly Fortnow for raising this wonderful question. It seems to me that they are getting to be mature enough to read and enjoy Mark Twain's description of the scientific process, that is in his book Life on the Mississippi, in the famous passage that begins "Any calm person, who is not blind or idiotic" and concludes:

--------------
"There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact."
--------------

Let us embrace Mark Twain's reasoning, in reflecting upon the number of AMS Math Reviews whose text contains the word "breakthrough". Here is the number of such reviews, arrayed in five-year bins:

1951-1955: 000 AMS Math Review "breakthroughs"
1956-1960: 003 ...
1961-1965: 007 ...
1966-1970: 007 ...
1971-1975: 009 ...
1976-1980: 029 ...
1981-1985: 055 ...
1986-1990: 071 ...
1991-1995: 105 ...
1996-2000: 160 ...
2001-2005: 182 AMS Math Review "breakthroughs"

Now, to adopt Mark Twain's language, any calm person, who is not blind or idiotic, can see that the doubling cadence for mathematical "breakthroughs" is 8.06 years, and that consequently, in the year 2100, there will be 127,495 "mathematical breakthroughs" cited in AMS Math Reviews.

Which is to say that in year 2100, Math Reviews will be reporting a mathematical "breakthrough" about once every four minutes and seven seconds.

Is this an unreasonably large number of mathematical breakthroughs? Definitely not! Because we are to consider, that on a happy, prosperous, free, and secure planet of 10^10 people (the kind of planet we all hope Annie and Molly will be living upon), we can anticipate that about one person in 1000 will choose a career in mathematics. In which case, the mean interval between breakthroughs, per living mathematician, will be 78 years.
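For readers who want to check the arithmetic above, here is a minimal sketch. It takes the projected figure of 127,495 breakthroughs per year as given (it does not re-derive the doubling time), and it assumes a 365-day year; the variable names are only illustrative.

```python
# Inputs taken from the comment above; nothing here is re-fitted from the data.
BREAKTHROUGHS_PER_YEAR = 127_495       # projected for the year 2100
POPULATION = 10**10                    # people on the planet
MATHEMATICIANS = POPULATION // 1000    # one person in 1000

# Assumption: a 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Time between consecutive breakthroughs, split into minutes and seconds.
seconds_between = SECONDS_PER_YEAR / BREAKTHROUGHS_PER_YEAR
minutes, seconds = divmod(seconds_between, 60)

# Mean wait between breakthroughs for any one mathematician, in years.
years_per_mathematician = MATHEMATICIANS / BREAKTHROUGHS_PER_YEAR

print(f"one breakthrough every {int(minutes)} min {int(seconds)} s")
print(f"one breakthrough per mathematician every {years_per_mathematician:.0f} years")
```

Under these assumptions the two printed figures agree with the "four minutes and seven seconds" and "78 years" quoted above.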

Thus we are justified in feeling optimism, that in the year 2100, achieving a mathematical "breakthrough" will be a once-in-a-lifetime dream ... that is reasonably accessible to every working mathematician, including Annie and Molly, and including their children, and including all young people who are inspired to do mathematics.

Good! :)

16. 1) On a recent post I used Wikipedia to point to a result. I phrased it carefully saying
"According to Wikipedia..."
and gave the link. There were some comments that objected to the link to Wikipedia, one of which demanded I remove the link. It would have made more sense to demand I provide an additional link to the actual paper being referenced (which I did).
How did I find the link to the actual paper? You can guess.

2) In a recent paper I wrote, the ONLY source I could find for a proof was on Wikipedia. In the paper I referenced wikipedia but also provided the proof for completeness. The referee objected to even the reference to wikipedia.
But to NOT reference it would have been dishonest- that is where I found it. And I provided the proof so the paper is fine in that regard.

3) Annie and Molly- do your teachers allow you to use Wikipedia?

4) Parents- do you allow your kids to use it? Do your kids' teachers allow them to use it?

5) When asking how reliable Wikipedia is one non-rhetorical question comes to mind- compared to what?

17. GASARCH notes: When asking how reliable Wikipedia is one non-rhetorical question comes to mind- compared to what?

As a case study, just yesterday, for a review article, I tried to track down a source text for the Wikipedia assertion that Saunders Mac Lane once said:

---------------
"I didn't invent categories to study functors; I invented them to study natural transformations."
---------------

This quotation appears (verbatim) at least 46 times on the internet ... but it ain't easy to find a verifiable source for it in the mathematical literature.

In their article Categorification (1998) Baez and Dolan assert "Mac Lane has said that categories were originally invented, not to study functors, but to study natural transformations!" ... but like Wikipedia, they provide no citation.

Looking back further, with Baez and Dolan's version as a help, we find Mac Lane himself saying, in his article The development and prospects for category theory (1996): “But I emphasize that the notions category and functor were not formulated or put in print until the idea of a natural transformation was also at hand.” (italics as in Mac Lane's original text).

Is this the final word in the matter? Heck, don't ask me! We see that in this instance, Wikipedia supplied only a rough approximation to the facts of the case ... and the peer-reviewed literature wasn't much better.

So perhaps Mark Twain said it best: "Truth is precious; let us economize it." Hey ... wait a second ... was that *really* what Mark Twain actually said? :)

18. John Sidles, what is your profession ? are you a professional blogger? it seems that your comments are very long!

19. Whenever I am drafting the introduction and/or discussion sections of an article, I try out the ideas in the blogosphere ... so much of this prose gets recycled.

At present our Quantum Systems Engineering (QSE) group is focussed on (1) simulating large-scale quantum dynamical systems and (2) experimentally verifying that the simulations work. To accelerate this "virtuous circle", our design and coding efforts draw heavily on recent advances in quantum information theory and complexity theory.

You can expect our hardware-driven interest in QIT/CT to tail off in about eighteen more months ... `cuz by then we hope to understand quite a lot ... and to have experimentally verified our understanding ... and to be able to explain it fluently in several different mathematical languages ... including in particular the language of computational complexity. :)

Of course, *lots* of mathematicians, scientists, and engineers have pursued this same dream---including von Neumann and Feynman---and it has always proved to be a humbling pursuit ... so maintaining a sense of humor and proportion is an absolute necessity.

Recent advances in complexity theory, quantum information theory, and category theory have provided powerful new mathematical tools that are speeding the pace and retiring the risk of this centuries-old pursuit ... for which our appreciation and gratitude as engineers and medical researchers is extended.

The questions that Annie and Molly Fortnow are raising about "trusting Wikipedia" are partly about a world in which Feynman's and von Neumann's comprehensive dynamical and informatic visions are beginning to come true ... that's why they're wonderful questions IMHO ... even though the answers can be discomfiting to contemplate.

20. The details might not sound right but it is likely (at least based on) a true story at a certain level. A very similar edit was made in September of 2009 from a computer apparently located in Illinois. What was actually added was "i like grapes. nerds are cows." and it was removed around two minutes later with a message of "The recent edit you made to the page Grape has been reverted, as it appears to be unconstructive."

21. I don't see how you can NOT use Wikipedia, unless you're willing to ignore one of the top five search results on almost any topic you care to name.

22. E is probably on to something! I bet that the girls just forgot the exact words of their comment. This proves that this post is true! Thanks, E.

23. You can use it to more easily find potentially correct information and then more traditional references. I have seen and fixed a wide variety of errors on Wikipedia pages. Some have been reverted by others ruthlessly even though my correction was correct. While the pages there are often easy to find and use to "learn" something, if it is something that matters to you then you should go to the source which is referenced on the page. If no source is given, trust it as much as something you hear at a cocktail party.

24. Sidles, I am not sure what your point is, but have you looked at the ngram chart for just plain "breakthrough"? According to Google, the word didn't catch on until the 1940's and has experienced an almost exponential growth since then. Is the growth of the phrase "mathematical breakthrough" commensurate with the use of "breakthrough", or has "mathematical breakthrough" grown faster, or maybe even slower?

It's always useful to compare a two-word phrase to the single-word baselines.

25. Anonymous asks: I am not sure what your point is ... have you looked at the ngram chart for just plain "breakthrough"?

My lexical investigations of the word "breakthrough" were intended mainly to amuse ... hence the Mark Twain reference.

The Google ngram incidence for "breakthrough" indeed is very interesting, and so is the chart for the even faster-increasing term "roadmap". Paging through the links that Google supplies to Google Books, we see that "breakthrough" was once a mining and geologic term that was embraced in WWII by military strategists and subsequently embraced widely by the whole STEM community.

Now that we are all jaded by six decades of STEM breakthroughs, the new trendy term for STEM strategists is "roadmap"! :)

A thoughtful, systematic abstraction of the generic elements of viable STEM roadmaps can be found in the International Roadmap Committee (IRC) More than Moore White Paper ... so these light-hearted investigations with Google ngrams can sometimes lead us to serious documents.

26. John Sidles; you write too much. focus on substance rather than length next time. what was your phd in ? literature or science fiction ?

unless you have something great to tell, plz, shorten ur stuff.

27. Your wish is my command, anonymous!

Under Lance's topic The Ideal Conference I have posted a brief roadmap-related question (it is #9) ... this question is adapted from a delightful essay by Scott Aaronson.

Thoughtful responses from all (including anonymous) are welcome, needless to say.

28. I can give an example of my opinion of Wikipedia. Around New Year's 2007 I noticed the Natural Proof page had a request for expert attention, so I answered it. The bulleted excerpt from the paper and most of the following text you see today were already there, so it was mainly a case of filling in detail and adding some references (they let a self-cite stand after I drew notice to it).

However, I expressly stopped short of trying to achieve the level of a one-hour seminar lecture. I notice that someone last summer added a separate definition for natural proof as opposed to property, but that doesn't change the point. I impressed that point on my just-finished Fall 2006 intro-grad-theory class, who I thought did too much running away from my lecture notes and the Homer-Selman text to Wikipedia.