Monday, June 15, 2009

Conference Proceedings should have final version in Journal! (Guest Post)

(Guest Post by Samir Khuller.)

I have noticed that even though in the 80's and 90's a large number of FOCS, STOC, and SODA papers eventually appeared in journals (David Johnson even kindly edited a book that gave pointers to the journal versions of FOCS and STOC papers), the percentage appears to have declined significantly. (I have not done a systematic analysis, but when I look for papers I often find that no journal version appeared.)

Most conference papers (even at the top conferences) are not reviewed extremely carefully for correctness. The PC has a limited amount of time and a large volume of papers to review.

What is the reason for this decline?

Are we so desperate to publish papers that we do not want to do a thorough job writing up the proofs after "staking our claim"?

It's very frustrating when you are reading a paper and details are omitted or missing. Worse still, sometimes claims are made with no proof, or even with proofs that are incorrect. Are we not concerned about correctness of results any more? The reviewing process may not be perfect, but at least it's one way to have the work scrutinized carefully.

Now that proceedings are on CDs, do we still need a strict 10-page limit on the conference version? Is this limit there simply to encourage people to submit to journals? If so, I am not sure it's working.

What can we as a community do to ensure that results get properly reviewed and published in full in journals after they are published at a conference?

23 comments:

  1. Have you considered journals with submission deadlines?

    When there is no deadline, it's all too easy to postpone submission indefinitely.

  2. There's an easy solution to this problem. SIGACT could simply enact a policy that conference publications older than 5 years should not be considered when evaluating candidates for awards and promotions.

  3. 1) "It seems as though there are more conf papers that don't make it into journals than there used to be." AH, nostalgia for an age that perhaps never was. Someone (Samir?) should do a study of whether this is really true. I do know one anecdotal counterexample: most of the early papers on Communication Complexity were never put in journals (I should study this more carefully).

    2) IF people are putting less into journals, the reason may be the ease of putting things on the web.

  4. Putting things on the web does not mean that they were carefully reviewed. The journal review process is a reasonably good one, and we should not dump it in a hurry. It has its problems of course (primarily that it takes a year or more to get reviews back!) but it has worked for a very long time. The web is a good distribution mechanism and perhaps can be adopted in place of print journals.

  5. I think the main benefit of having a full version is that other people know that at least one person (i.e. the author) has gone through all the details of the proof. I guess we all know how many surprises await when one sits down and starts filling the little details of a simple lemma that "must be true".

  6. A conference paper is not a scientific publication, in the sense that it is not falsifiable: often, there is not enough detail to check the proofs. On the other hand, even if a journal paper is not necessarily correct, at least it should be written in such a way that you can pinpoint a fatal mistake if there is one.

    In the long run, it will be harmful to us to stick to this system where conference publications have so much weight; it makes us look like fake scientists.

  7. It makes sense to require that conference papers include the full proofs, as in FOCS. Keep the page limitation, and let the reviewers only look at the first ten pages. Forcing the authors to write out full proofs in appendices before submitting to a conference should hasten the appearance of full versions of papers. On the other hand, since proofs often evolve considerably in preparing the final version, the official conference proceedings would still include only the body of the paper. The online version could have everything or not, depending on the author's preference, but a third party would know that she could politely ask for the submitted version including all proofs.

    If there is a decline in journal versions, then I would guess it is because the primary purpose of journals has evolved to become locking away content rather than promoting it. It is not surprising that authors are not enthusiastic about this. If you want people to read your papers, you had better put them all on your home page, such that Google can find them, because the journal will do its best to keep people away. We need more open-access journals.

  8. There are probably more people, more conferences and more papers in CS these days than in the 80s. But I do not know if the journals have grown at the same rate. I know of cases where the delay after acceptance and before appearance is too long. I generally do not see any discussion when people talk about lack of journal papers.

    I have two questions to people here:
    One, is there a trade-off in the quality of journals as more journals are introduced? And two, as people improve or build on results of other papers, is it not some sort of acknowledgment of correctness? And if there is substantial improvement in the field, isn't there less of an incentive to publish journal versions of older results?

  9. "I generally do not see any discussion when people talk about lack of journal papers."

    I meant, I do not see any discussion of the state of the journals when...

  10. "There's an easy solution to this problem. SIGACT could simply enact a policy that conference publications older than 5 years should not be considered when evaluating candidates for awards and promotions."

    You can make all the policies you want, but you can't control how search committees actually work. How many departments would deny a strong researcher tenure, or not make an offer to their top candidate, for the sole reason that their best papers never appeared in a journal? It's just not going to happen (even though it would help fix some of the community's problems).

  11. I think younger researchers like me seldom have the motivation to write a journal version of a paper, simply because of the long turn-around time. I, for one, have no journal publications, and I don't know how that affects my record. This does not mean I don't do proofs; I usually submit the proofs in an appendix, and make a technical report with any such details before I commit the final version. As some commenter points out, the key is to document the effort to write out a (formal) proof; any author worth his salt will be able to catch mistakes merely by putting in this effort.

    My point is that for most results, a proof checked by the author is as good as a proof checked by several reviewers. Of course, I'm not talking about results that settle long-standing problems here, where you may gain something by removing the bias.

    Another idea that we sometimes explore in the programming languages community is to have machine-checked proofs, which more or less close the debate; there is a lot of recent effort in automating the tedious but uninteresting parts of such proofs and providing other tools that can make proof assistants more efficient to use, but these will take some time to mature.

  12. "You can make all the policies you want, but you can't control how search committees actually work. How many departments would deny a strong researcher tenure, or not make an offer to their top candidate, for the sole reason that their best papers never appeared in a journal?"

    Having been in a math dept with close connections to TCS, I have seen this happen multiple times. It depends on the orientation and culture of the department one is working in. So one (especially younger people in tenure-track positions) should be on the lookout for this journal vs. conferences trap.

  13. I agree with the sentiment that journal articles are nice to have. On the other hand:
    - Not every paper needs a journal version. For example, how about a paper which is improved 6 months later? Should journal versions of both appear?
    - How about papers whose proofs fit into the proceedings version? (This is often possible for papers published in LNCS.) Some journals have a policy about requiring additional material.
    - Finally, as other commenters have mentioned, there is the problem of long turnaround time. Imagine how much worse the problem would get if the number of journal submissions is increased!

  14. I guess I have some comments to add to some of the issues raised.

    1) Often results are "improved", but the methods used in the improvement could be completely different. This does not mean that the original article is not interesting simply because the result was improved. The techniques developed could apply elsewhere. (As a side remark, the fact that a result was improved does not mean that the original method was completely correct.)

    2) Often authors are unable to include full proofs in print due to the page limit (when you format conference papers in the LNCS style a lot needs to be cut from the paper). However, we do not require a full version to be available before accepting the paper. It's important to get the full paper out quickly.

    3) Even if a conference paper includes full proofs, they have not been verified by a third party. This is one of the most important jobs of a journal reviewer. Even the famous proof of Fermat's Last Theorem needed fixing, and the errors were only found in the reviewing process.

    However, our community seems not to care much that claims are being made and years later no proof has appeared.

    The situation is not that much better in some other areas outside theory. While doing some experimental algorithmic work, I wrote to the authors asking for the data sets for claims in a published paper so that we could run our algorithm on the same data set - they said that they could not make the data available as it was "company data". All we wanted were the graphs output from some compiler analyzing code. This seemed a bit odd to me; that for published work we cannot check any claims made in the paper since the claims were based on experimental evidence.

  15. "The situation is not that much better in some other areas outside theory. While doing some experimental algorithmic work, I wrote to the authors asking for the data sets for claims in a published paper so that we could run our algorithm on the same data set - they said that they could not make the data available as it was "company data"."

    In that case their paper should not have been accepted! Even in medicine and biotechnology, journals are starting to require that raw data be publicly available. Surely we should do the same.

  16. "This seemed a bit odd to me; that for published work we cannot check any claims made in the paper since the claims were based on experimental evidence."

    Aren't there any other data sets you could use? Sure, it's nice to use the same one as in the original paper, but if that paper is any good, then the techniques must be applicable to more than one data set. In fact, investigating the applicability to another data set is more valuable (and not much harder) than simply verifying the results from the original data set. This issue seems to be at most a mild annoyance for experimental algorithms papers. (The real issue is when the original paper was about the data rather than the algorithm.)

    "In that case their paper should not have been accepted!"

    There are a lot of data sets for which privacy of personal data or confidentiality of business data is a real concern. It's not realistic to announce that research depending on such data should be unpublishable. In an ideal world, authors could at least make available suitably anonymized or censored versions of the data. (You'd have to take it on trust that the changes were insignificant, but you already have to take on trust that the data set wasn't fake in the first place.) However, this is itself a profoundly difficult problem. For example, as AOL discovered, it's next to impossible to anonymize search query logs. We're better off keeping some data sets secret than pressuring authors to provide a necessarily poor anonymization.

  17. Samir, ad #3: What about STOC/FOCS/SODA papers with all proofs in proceedings that are submitted to journals and then rejected, because journals (especially IEEE but also ACM) want to see some new contributions and don't accept almost identical versions as ones in proceedings?

  18. Response: I apologize for the delay - I don't read this blog, or any blog, daily!
    Most CS theory conferences expect that the work will be published in journals, and some state this explicitly. I have not heard of a paper being turned down because it was substantially identical to a prior conference version that was published.

    Almost all of my own work has appeared in journals, and the journal papers are definitely more polished, and sometimes have new results and observations as well as updated citations. However, my *impression* is that adding new results to a conference publication is not strictly required to submit to a journal (at least this seems to be the case for our community).

  19. ACM now has (or had) a self-plagiarism policy where, as I understand it, a substantial part of the paper submitted to their journal should be different from the conference version.
    (I recall I signed that a short time ago. However, I cannot find it now on the net. Did they change it?)

  20. IEEE has an even more restrictive policy than ACM.

    Also: what about a paper shorter than 10 pages? Should I have it in STOC/FOCS/SODA proceedings in an incomplete form to make sure that ACM or IEEE will accept it as a journal paper? It would be silly.

  21. I am glad you brought that question up, since I was wondering if the ACM policy you mentioned was really going to force some authors to omit material from the conference version so that they could still publish in a journal. Perhaps these new policies are the reason fewer authors are sending papers to journals(?) -- however, I feel that it's not desirable for a field when there are claims with no detailed proof ever appearing (after being carefully reviewed both for correctness and readability).

  22. Just for the record: could someone quote the exact statement of this "self-plagiarism" policy? Here is the copyright form for ACM journals:


    http://www.acm.org/publications/CopyReleaseJrnlFinal-6.2.09.pdf

  23. That copyright form is new as of this month. I cannot find the older one. However, a look at
    http://tods.acm.org/30PercentRulePolicy.html
    shows the policy.
