Monday, April 05, 2010

What Does It Mean to be Published?

I don't remember what prompted it but about a month ago I tweeted
Your paper might appear on Arxiv or ECCC, be widely read and even well cited. But don't think that it is in any way "published".
Suresh responded
Question is, isn't the point of publication to be "widely read and well cited"?
So what is the point of publication? Certainly you want your paper easily read and cited. But also you want a careful peer review leading to a polished version that has the stamp of approval by appearing in some respectable conference or journal. Publishing also acts as a filter, allowing the reader to get some idea of the level of quality of the paper before reading it. Almost any paper can appear on an archive site but it takes more to be published.

Nevertheless if you get major kudos for your archive paper, why bother taking it further? Even if people like your paper now, it may be forgotten years from now. Complain as you will about journal publishers, they are scanning in all the old articles to make them available in a digital age. Papers that years ago appeared as old department technical reports may be lost forever. Nobody can predict what form research papers may take one hundred or even ten years from now. One day those PDF files may no longer be readable. Get your paper really published and you have a much better chance of it surviving far into the future. 

A few years ago, the IEEE saw no reason to scan in old FOCS proceedings thinking that any of the important old papers appeared in better form in some journal. We knew though that if these papers weren't put in digital form, many of them might disappear forever. With some strong pushing by Paul Beame, Bob Sloan and others, those papers are now available on both the IEEE and Computer Society digital libraries.

I wanted to read a copy of Karp's NP-completeness paper, which only appeared in the proceedings of a one-shot workshop in 1972. I ended up going to the library to dig it up. But library books get lost and many young people today don't even know where the library is. Later I found out Luca had scanned the paper for a course he taught. But how long will Luca's Berkeley pages last, and what about all the papers that don't lead to Turing awards?

So publish your papers, best in a journal as well as a conference. Even if you don't think it matters for you in the short run, it can make a big difference for the community long into the future. What good is pushing the boundaries of science if those boundaries get snapped back because work gets lost?

19 comments:

  1. It sounds like you propose two independent reasons to publish in a journal: vetting by referees/stamp of approval, and permanent archival.

    Certainly arXiv.org does not referee papers, but for the second justification, is there some reason to think that the copies of papers maintained by journals will necessarily last longer than those on arXiv.org? We may not be able to read PDF's one day but we also may not be able to read copies of Springer's journals if that company goes away. Of course, using both arXiv.org and a journal strictly increases the odds of survivability, but I don't see that one is necessarily more permanent than the other.

  2. ...and/or push for better digital archiving of departmental technical reports, textbooks, old workshop proceedings, library holdings, and the like.

    I'm not the least bit worried about ArXiv being around in 100 years; there's too much stuff in there, mirrored in too many places, with too many stakeholders, for it all to disappear. When some other document standard replaces PDF, there will be software to convert PDF into the new format. It's the old dry paper stuff rotting in libraries and the modern publisher-owned "intellectual property" that worries me.

    And just to play devil's advocate... Why should I want "the stamp of approval of a respectable conference or journal", as opposed to, oh, I don't know, actual respect? Sure, that's the way we've always done it (except for, you know, when we didn't), but so what? Why buy into a publication system that has so many flaws?

  3. "many young people today don't even know where the library is"

    I would've assumed this was a joke if I hadn't come across it so many times lately. Many young professors don't even know where the library is!

  4. "We may not be able to read PDF's one day but we also may not be able to read copies of Springer's journals if that company goes away. "

    Many people can't read Springer's journals now, due to the paywall!!

  5. On the other hand, us young folks are used to googling stuff, and googling for Karp's paper works quite well (mostly links to Luca's scans).

    Doesn't work all the time though: I've personally had great difficulty tracking down an oft-cited conference paper from 1989. It's even a recurring, fairly major conference.

  6. "But also you want a careful peer review leading to a polished version that has the stamp of approval by appearing in some respectable conference or journal"

    Lance, you seem to be implying here that conference papers are carefully peer reviewed :)

  7. I already can't read journal papers without googling them and finding some alternative source! It is wrong to think that these institutions will be better at long-term archiving than the arXiv.

    Journals come and go, especially those with more favorable access policies. How long will Theory of Computing's archives last?

    The main reason for putting work in conferences is recognition and dissemination. The arXiv is a comparable venue, because the peer review in a conference is too shallow. It has the disadvantage of offering no review at all, but the advantage of offering much greater accessibility, better archiving, and the ability to revise the paper to lead to a polished version.

    The reason for journal versions is not archiving, but rather the chance to go through significant peer review. This is very important, but there might be better ways in the future.

  8. "A few years ago, the IEEE saw no reason to scan in old FOCS proceedings thinking that any of the important old papers appeared in better form in some journal. We knew though that if these papers weren't put in digital form, many of them might disappear forever. With some strong pushing by Paul Beame, Bob Sloan and others, those papers are now available on both the IEEE and Computer Society digital libraries."

    Aren't you admitting the shortsightedness of this institution? The arXiv has been around, archiving and making accessible research, for almost 20 years. It is good that IEEE finally woke up. Why are they still hiding papers behind a paywall? What are their long-term archival policies? The IEEE was happy to let old FOCS papers disappear into history. It has no credibility on this.

  9. Moreover, Paul Ginsparg and the arxiv have been taking concrete action to make the arxiv more permanent - including soliciting contributions from many libraries: http://blogs.nature.com/nautilus/2010/03/nature_physics_calls_for_suppo_1.html

    Archiving for posterity is a hard problem. I don't think anyone has a really good solution right now, and this extends well beyond the arxiv. But it really does sound like your primary argument for journals is their ability to archive. I don't know, but I think journals aren't going to be pleased at having their value limited in such an extreme way ;)

  10. I trust the ArXiv, particularly when it is combined with a distributed mirroring effort such as LOCKSS ( http://www.lockss.org/lockss/Home ) to be around for a long time much more than I trust all the supposedly "archival" journals to be around forever.

    Freely available content is almost impossible to remove from the Internet. Putting a paper up anywhere makes it likely to be mirrored by archive.org. Putting it on the ArXiv makes it even more likely to be widely archived and copied, and when combined with LOCKSS, it seems like you would need to assume a scenario involving the destruction of the Internet and all electronic media (but no corresponding destruction of paper or of journal companies) in order to make journals the safer place for archiving things.

  11. Does IEEE have a plan to ever release its papers from behind its paywall? I just tried to download a paper from FOCS 1991, but it is still not available to the public. Why not?

    The arXiv has been much more successful at establishing funding sources that allow it to actually distribute its archived papers.

  12. I agree with the rest. There is no point in publishing in journals any more, unless you are interested in having a slick-looking paper copy on your shelf. The disadvantages of publishing are many; as someone else has pointed out, the worst one is transferring the rights to a company interested only in making a profit.

    I think arXiv is doing a very good job of keeping papers around, much better than Springer. For the referee process, conferences are doing relatively well. Adding an approval and comment option to arXiv could make it much better: if a paper is really worth being published and referred to later, it will be read by at least one specialist in the area, and that person can just point out errors if there are any. How many specialists do you think have read Karp's paper? On the other hand, asking specialists in an area to referee papers that are not that important, and that no one is going to refer to, is just a waste of resources.

    The comment about IEEE is very pointed. I can assure you that a for-profit company would never keep these papers available if it decided at some point that providing these services was no longer profitable. Trusting them to keep everything around is absurd.

    There are some reasons to publish in journals, but not in those that want the copyright on your work and get paid for providing it to readers. Let me give you an example: I published a paper in one of the most famous Springer journals, which has been around for decades. The editor from Springer missed a number of very simple typos, which a friend of mine pointed out to me a few weeks ago.

    I won't go into the amount Springer is charging university libraries for access to the same papers their own researchers have written; Don's letter explains it clearly.

    Let me summarize. arXiv is doing a good job of keeping papers available and can be trusted more than for-profit companies to keep them for the next century.
    For vetting, there is a problem, but the solution is not necessarily publishing in journals. Finally, if we want people to write more readable papers, the solution is not asking them to publish in journals, but asking them to write better papers.

  13. "For the referee process, conferences are doing relatively well. "

    What measure are you using to evaluate how well conferences do in terms of refereeing?

    I'm sure we've all had experiences where a conference referee makes good comments or points out an error/ambiguity. But when you read a paper in a conference proceedings, you have no idea if a referee actually read this paper carefully. Sometimes, referees do a good job, but sometimes they do not. How do you know?

  14. I'll join the pile-on in agreeing that I trust arXiv much more to archive papers long term than any journal or conference. Plus, these archives are available to anyone, not just those with extremely expensive institutional journal subscriptions.

  15. I and many researchers I know consider ECCC or arXiv to require a more refined submission state than a conference (usually a full version with all the proofs written out nicely). A conference provides an okay stamp of approval, and no real guarantee of correctness. In some of these papers not even the authors have checked correctness, though I think in many cases the authors police themselves by having a proof in some state written somewhere. At least all full version papers on ECCC and arXiv have had one person (an author) check correctness of the proofs.

    Now consider this: what kind of stamp of approval can you really give without even checking correctness? This causes conferences to actually get in the way of correctness checking. The author has already received her due and is not anxious to spend more time proving herself wrong. Extended abstracts are often never extended to full versions. And it leaves the rest of the community in a bit of a lurch when errors are suspected.

    As it stands, I don't see publishing in conferences as substantially advancing the values Lance describes beyond what ECCC and arXiv already do. To fix these problems, conferences could require a full version of the paper be available at the time of submission or publication.

    Journals are inaccessible to many, but do actually check correctness and have a stamp of approval. Open (or nearly open) journals provide a lot of the advantages that people here are looking for. One minus is that the review procedure is still pretty arduous, and they still need to have some financial support to ensure their continued existence and archiving.

  16. I agree with most of the comments supporting arxiv publications.

    However, since we don't live alone, we should also be aware that in most other fields journals are the main means of respectable publication (and books in some fields). Therefore, if you want to argue with a dean or other people in university administration about the quality of TCS publications (which often happens in the tenure and promotion process), having all papers in arxiv wouldn't bring much.

    Also, since I frequently hear comments on this blog about the culture of maths publications: in most fields of mathematics, journals are almost the only means of respectable publication.

  17. "Therefore, if you want to argue with a dean or other people in university administration about the quality of TCS publications (which often happens in the tenure and promotion process), having all papers in arxiv wouldn't bring much."

    Unless you're Perelman?

    If you *are* well-known/cited and all your papers are on arxiv, doesn't that speak well for you?

  18. Putting something on arxiv is far more permanent than putting it in a journal, since libraries are cancelling subscriptions right and left, but anyone can download arxiv papers.

    For best results, use open-access journals. Get the best of both worlds.

  19. I read somewhere that paper lasts longer than other storage media. So, let's say some galactic menace uses an EMP to attack our electronic storage: we will still have our low-tech journals.

    -- Josef
