Thursday, July 07, 2011

Scooped by 400 years

This article claims that finding the area under a curve by dividing up the region into rectangles is helpful. Maybe they could take some sort of... limiting process where the rectangles get skinnier. Who knows, they may be able to get an EXACT formula for, say, the area under the curve y=x^2 from x=0 to a. (The Wikipedia entry on the article claims that it rediscovered the trapezoid rule. See here.)
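For readers who want to see the "skinnier rectangles" idea in action, here is a minimal Python sketch. It is my own illustration, not the method of the article in question: the function names `left_rectangle_sum` and `trapezoid_sum` and the choice a=3 are assumptions made for the example. It compares both approximations against the exact area a^3/3 that the limiting process gives for y=x^2.

```python
def left_rectangle_sum(f, a, n):
    """Approximate the area under f on [0, a] using n left-endpoint rectangles."""
    width = a / n
    return sum(f(i * width) for i in range(n)) * width


def trapezoid_sum(f, a, n):
    """Approximate the area under f on [0, a] using n trapezoids of equal width."""
    width = a / n
    interior = sum(f(i * width) for i in range(1, n))
    return width * (0.5 * f(0.0) + interior + 0.5 * f(a))


def square(x):
    return x * x


a = 3.0               # hypothetical upper limit, chosen only for illustration
exact = a ** 3 / 3    # the limit of the rectangle sums for y = x^2 on [0, a]

for n in (10, 100, 1000):
    rect = left_rectangle_sum(square, a, n)
    trap = trapezoid_sum(square, a, n)
    print(f"n={n:5d}  rectangles={rect:.6f}  trapezoids={trap:.6f}  exact={exact:.6f}")
```

For y=x^2 the rectangle sum has a closed form, a^3 (n-1)(2n-1)/(6n^2), which tends to a^3/3 as n grows. Since x^2 is increasing and convex on [0, a], the left-endpoint rectangles undershoot the exact area and the trapezoids overshoot it.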

I was going to begin an April Fools' Day post with the above; however, since the article (which is real) is already out there and being criticized here, here, here, here, and here, I decided not to go in that direction. I have a different angle.

How often do you get a result and then find that someone else already had it? Is this MORE or LESS common in the internet age? There are several competing forces:
  1. With search tools it's EASIER to find what's already out there.
  2. With email or various websites like Stack Exchange it is EASIER to ASK if something is already known.
  3. Some people are gathering up stuff and putting it online, making it EASIER to find, if you know where to look. (Examples: Website of all of Ronald Graham's papers, or Website of some number theory related to VDW stuff. I thought I found a website of secret sharing papers, and when I tried to find it again I hit several sites about different kinds of secrets.) (ADDED LATER: updated site for Ron Graham's papers: here, and a list (but not links) of papers on secret sharing is here.)
  4. Areas are getting more specialized and each uses its own terminology, making it HARDER to know if what you have done is original.
  5. There is so much out there that it is HARDER to see what's already been done.
  6. NOT everything is online. The material that is not online might be HARDER to find than it used to be.
There are other issues. What if your proof is cleaner but essentially the same? What if it truly is independent? What if people come from completely different motivations? What if you read a result, forget that you read it, and later think it's yours (it's not)? I've seen all of these things happen, and in all combinations.


  1. For my own amusement, I once searched the pre-WWII physics literature to find out how much literature John von Neumann had to read to keep current on quantum measurement theory -- it turns out that JvN had to read (at most) two articles per month. Nowadays that number is several articles per day ... very roughly 100X greater.

    Are we thereby 100X smarter than JvN, either individually or collectively? Few would say so ... and yet this flood of literature *does* have advantages.

    Consider for example this month's Bull. AMS survey by Pelayo and Ngoc titled "Symplectic theory of completely integrable Hamiltonian systems" (2011). This article deals with classic topics like the dynamics of a pair of interacting spins ... topics that have been discussed in literally thousands of articles spanning more than a century ... all the way back to Moriarty's Dynamics of an asteroid (1902). Yet Pelayo and Ngoc find plenty of new things to say in their wonderfully interesting survey. What's more, the Bull. AMS publishes comparably interesting articles pretty much every month. Bravo! This flood of integrated understanding is both the glory and the curse of our 21st century.

    That is why (IMHO) we are all of us living in a golden era for mathematics, for science, and (most of all) for engineering. JvN of course was all three ... I often wish that he could have lived to enjoy the abundant opportunities of the present era.

  2. That just shows how far math is from real science.

  3. I agree that it can be hard to find what was already proved, given the number of papers published nowadays. But to my mind this example is of a completely different nature. As stated in the title of the article, the author aims at giving a mathematical method for doing whatever. As a non-mathematician, he should have asked a mathematician about this! If I am going to work on, say, biology, I begin by asking some people who know the field whether I am mistaken or not!

  4. My reaction was pretty much the opposite of the consensus: So what?

    This was published in a DIABETES journal. It is pretty understandable if applied people in different fields end up "re-inventing the wheel" occasionally, since they have no real interest in or expectation of being mathematically original: they just want to solve their problem.

    It would of course be quite different if a "scooped" piece of math research were published in Annals of Math or something like that.

  5. If you are still looking for that list of secret sharing papers, it may have been Stinson's (outdated) Secret Sharing Bibliography:

  6. @Anon#4: The role of research is to "advance human knowledge". How does a paper about something known for 400 years (actually, more than 2200 years, as Archimedes used this method) advance anything? Not to mention that this method is actually taught in high schools. How would one claim to be a scientist if one doesn't know something anyone from a creditable high school would know?

    The fact that the paper is published in a Diabetes journal only makes it worse. Are you gonna trust someone without (slightly advanced) high school knowledge to design your next drug?

  7. How did enough people not take Calculus and yet end up in this area of the medical profession that this got written, reviewed, and published?

  8. This article was written in 1994. The online discussions of it date back at least 5 years. Why not keep it until April 1st?

  9. If commercial journals have their way, point number 3 will be nullified: at least for people who don't have access to a top-notch academic library, most science published in the last century is hidden behind ridiculously overpriced paywalls.

  10. FYI, all of Ron Graham's papers can be found here:
    (The link originally given only covers through 2008)

  11. It is paradoxical that pessimists regard GASARCH's questions optimistically, while optimists regard GASARCH's questions pessimistically.


    Pessimist: Research funding is flat, resources are short, and the planet is overheating. In consequence, there will never be substantially more STEM professionals than there are now, and so all of GASARCH's issues are self-limiting. Problem solved.

    Optimist: On a secure, prosperous and healthy 21st century Earth of 10^10 people, at least three percent of those people will be STEM professionals, and at least three percent of those STEM professionals will work in research-level CSE. If each of these professionals publishes three CSE articles per year, the resulting publication rate will be one article per second ... and so all of GASARCH's issues are utterly intractable.


    Although I am myself a committed optimist, it appears that many (most?) readers of Computational Complexity belong to the pessimist faction ... is this assessment correct? :)
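The optimist's estimate in the comment above is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: the population, percentages, and articles-per-year figures come from the comment, while the function name `articles_per_second` and the 365-day year are assumptions made for the example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def articles_per_second(population, stem_fraction, cse_fraction, articles_per_author_per_year):
    """Publication rate implied by the optimist's assumptions."""
    authors = population * stem_fraction * cse_fraction
    return authors * articles_per_author_per_year / SECONDS_PER_YEAR


# Figures taken from the comment: 10^10 people, 3% STEM, 3% of those in CSE, 3 articles each per year.
rate = articles_per_second(10**10, 0.03, 0.03, 3)
print(f"{rate:.2f} articles per second")
```

Under these assumptions there are about 9 million CSE researchers producing about 27 million articles per year, which works out to a little under one article per second, consistent with the comment's rough figure.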

  12. Count me as an optimist:

    The fact that human knowledge may be expanding at an exponential rate is probably a good thing.

    Organizing and searching the emerging database should keep many a post-doc gainfully employed.

  13. Bill asked: "What if it (your result) truly is independent?"

    This is indeed a subtle issue. I faced it just now when finishing my book on circuit complexity. People from Moscow University supplied me with a lot of papers (in Russian, which I can read) with results that were obtained years and years ago, and then were rediscovered in the West. So what? Are these rediscoveries results? I think they are, because for people in the West these results did not exist at all! But this is a very special case. "Rediscovering" and publishing results that *could* be accessed with a little effort is just a crime.

    Bill also asked: "What if your proof is cleaner but essentially the same?"

    Cleaner and simpler proofs also make a *real* contribution, I think, especially when they use less fancy mathematics than the original ones. One of my favorite examples is Razborov's proof of Hastad's Switching Lemma. There are many more.

  14. Major domains of human activity, like transportation, communication, software,... have their own infrastructure.
    But we lack an "infrastructure" for mathematics (and theoretical computer science), by which I mean a digital formal (logical) library of almost all the interlinked definitions and theorems we have, supplemented with a proof checker, with people compelled to add their results to it.
    There are experimental implementations:
    This is quite comparable to the extensive software libraries.
    Our library could be enhanced by a (somewhat smart) search engine that, given the formal statement of a conjecture or definition, can search for similar results.
    Such a facility could improve our overall performance by 30%, if nothing else.