Thursday, March 21, 2013

The Slot Machine Theory of Paper Submissions

Why do scientists publish so much? There is the whole "publish or perish" thing, but that doesn't explain the large increase in the quantity of publications. With the focus on important publications and measures like H-indices, a simple conference publication often won't help someone's career. Yet we continue to publish as much as we can. Why?

When you play a slot machine and you win, you get lights, music, sounds of coins coming out of the machine. If you lose, nothing. Lots of positive feedback if you win with no negative feedback if you don't.

If you submit a paper and it gets accepted into a conference, you feel excited. Excited to see your name on the list of accepted papers. Excited to update your CV and to give that talk or poster that only the few whose papers were accepted get to give.

If your paper is rejected, nothing. You don't list rejected papers on your CV. Nobody will even know you submitted the paper. And you can just take that paper and submit it to another conference. Lots of positive feedback if your paper is accepted with no negative feedback if it isn't.

Some people might argue the analogy doesn't work since slot machines are arbitrary and random. Those people have never seen a program committee in action.

13 comments:

  1. Lance, I'm sorry to hear that gambling problems are so widespread in our community. There are resources for people who find themselves addicted to the blinking rejection/acceptance emails (perhaps conference submission pages should have a hotline posted for serial submitters who want help?).

    > Yet we continue to publish as much as we can. Why?

    There is something to be said for the theory that people write and submit so many papers because of irrational emotional reasons. But I disagree with the slot machine version of the theory, for a few reasons:

    1)
    > With a focus on important publications and measures like H-indices,
    > a simple publication in a conference often won't help someone's career.

    For better or for worse, I don't think that's the case. Depending on where you are in your career (and depending on who is reading your CV), an extra publication or two can make a big difference in your job prospects -- think of grad students applying for postdocs or faculty jobs.

    Since many papers involve grad students and postdocs, this factor comes into play a lot, I suspect.

    2) When you start a research project, it's hard to tell whether it will turn into something major. Often you take on a question, solve (part of) it, and realize that you have a paper that is not Godel-prize material but still nice. What do you do with it? Your post suggests you should not even bother to write it up, but why not get something out of your efforts so far? (And also, why not tell the world about what you have learned? Maybe someone else will turn it into something more interesting.)

    3) I hate slot machines. (But I continue to write papers, even occasionally ones that are not Godel prize material.)

  2. 1) Maybe we should have rejected papers on the CV :-)

    2) When you work on a problem and make progress, you might THINK that you won't get further, so you submit. It gets turned down. Then you DO make some more progress (perhaps with a new co-author). So you submit again and don't think of it as the same paper. This can go on for a while. OR it can even get in and you then submit an improvement. So TIMING can be an issue.

  3. > If your paper is rejected, nothing.

    Clearly, you are a better man than I. For most people, rejection hurts, Umeshisms notwithstanding.

  4. When I saw the title of this post I thought it was going to be about how random and arbitrary many program committee decisions are, and how that encourages us to keep re-submitting the same rejected papers to more conferences in hopes of a luckier outcome.

  5. Another important reason that Lance forgot to mention is the pressure to publish coming from the funding model in the US. To get a grant it is important to have a "fair" number of "recent" publications. A grant is for a specific "project," which means one can get more money by having more projects, which means more publications. And so on. There are of course advantages to this model as well, but I believe it drives the publication culture to some extent.

  6. I'm with Adam -- there are a variety of reasons to write papers, and "slot-machines" doesn't seem like the right analogy to me.

    To start, I think you're muddling the question a bunch. It seems like the question underlying your post is not "Why do we write small-result papers?" but "Why do we do so much small-result research?" Often, once we do the small-result research, we write it up and publish it as an act of completion, so we can feel done with it and move on. That's a fine thing; what may be a small result may lead someone to a larger result later on, so better to write it down than not, I'd think, although one might argue it's not the best use of time.

    As to why we do small-result research, again, there are lots of reasons, some pointed out here. Training of students is one. (And again, that's a good reason for going through the writing process as well.) Not knowing ahead of time whether a result will be "big" or not is another. The desire to make some sort of steady (if small) continuous progress rather than aim for paradigm-breaking jumps every time out comes into play.

  7. It is actually good that people make their results available for others. It hurts more when you do something and later learn that someone already did it decades ago but never published it. So writing these results up and making them available for others is not a bad thing.

    The question is why people submit these small results to good conferences. Why not just post them on arXiv? When we invest in something we normally want to get as much out of it as we can. That is normal. Part of the problem is that if these works are not published in a good conference there is no positive feedback. It is an all-or-nothing situation: either we publish in a good conference and get the stamp on it, or we get nothing; they don't count. It is not just the lack of negative incentives for rejected papers, it is also the lack of positive incentives and appreciation for good and reasonable but not awesome research. We often forget that most results are combinations of several small ideas that appeared in works that were not paradigm-breaking. We tend to think of paradigm-breaking results as wonders made by one or a few authors in a vacuum, without much attention to the small ideas they borrowed from various places and people.

    In this situation, it is expected that people will try the only way of getting positive feedback. Negative feedback for rejected works might reduce such submissions, but a fairer solution is to change the game so it is not an all-or-nothing situation. For example, posting small but nice results on arXiv should count and have a reasonable positive effect on people's careers.

    As usual, this is not an issue for awesome ground-breaking work, nor for work of very low quality. The issue is the gray area between them, where deciding what to accept and what to reject is influenced significantly by factors other than the quality of the work and therefore looks random from the outside. Further punishing these rejected papers, which had chances similar to those of the accepted ones, is not a good solution.

  8. How old is the H-index concept? Five years?

    What is a "good" H-index? Imagine that I have done a solid dissertation and published one rock solid paper each year of being an assistant professor that is cited by dozens of people each year. I am coming up for tenure review. My H-index doesn't hit double-digits. My competition has been publishing 5 papers a year, many of which cite others he has written. Even without anyone ever citing one of his papers other than him, he has an H-index of 12.

    I have better results. He has more papers and a better H-index. Who looks better to colleagues who don't know my area? To the Dean's office? The President's office?
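
    (For concreteness, a quick sketch of the arithmetic in Python, with citation counts invented purely for illustration; recall that the H-index is the largest h such that h of your papers each have at least h citations.)

    ```python
    def h_index(citations):
        """Largest h such that at least h papers have >= h citations each."""
        ranked = sorted(citations, reverse=True)
        h = 0
        for rank, c in enumerate(ranked, start=1):
            if c >= rank:
                h = rank
            else:
                break
        return h

    # Citation counts below are hypothetical, made up for this illustration only.
    # Careful author: one strong paper per year for six years, each well cited.
    careful = [80, 62, 55, 41, 37, 30]
    # Prolific self-citer: 30 papers, each cited by up to 12 of the author's own later papers.
    prolific = [12] * 18 + list(range(11, -1, -1))

    print(h_index(careful))   # 6  -- capped by the number of papers, however well cited
    print(h_index(prolific))  # 12 -- double digits without a single outside citation
    ```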

  9. The real problem is not small results or repeated submission. The real problem is the culture of computer science that insists that we waste hundreds of hours on deciding what the "good" papers that "deserve acceptance" are at "prestigious" conferences. Why don't we act like mathematicians, who hold conferences where pretty much everyone gets a turn to speak (unless the results are true crackpottery, and sometimes even then)? After all, good papers will get cited and be influential whether or not they appear at "prestigious" conferences. Nowadays the real action is on the arxiv anyway, isn't it?

  10. Nice post, Lance.

    I think there is room for publication models to address some of these issues. One approach is to have, say, all submissions appear on arXiv or some other central repository, with reviews then also appearing alongside these papers. This has weaknesses (just keep revising until all the obvious "weaknesses" in a fundamentally lame idea are band-aided over), but in general computers plus the internet offer opportunities not yet being fully exploited. I should note, of course, that ideas like this are already in circulation (http://yann.lecun.com/ex/pamphlets/publishing-models.html) and are being experimented with.

    My own view (perhaps too obvious/simplistic) is that the "randomness" is an inevitable side effect of the level of competition. Basically there are a few things that are "automatic ins" for various reasons (interest in the topic, technical quality, go ahead and add your favorite reasons), but most things are in the gray area (I'm imagining a Gaussian: most of the mass is effectively in a block near the median), and no one can really tell whether the literature is better off with one or the other. Same deal with faculty hiring and basically anything else with such low acceptance rates. As such, all we can really hope for is that the definite-ins are not discarded, and the gray area will slowly sort itself out.

    I do think the system could use constant overhauls/vigilance. I think it is good to worry that small results are over-emphasized, and also that the system is gamed in far too many ways.

  11. It seems like a high publication count is needed to get hired -- or even interviewed -- for academic positions (though obviously it is not sufficient). Can someone provide a single example of someone who was recently hired at a top school who didn't have a large number of publications? At CMU for example, it looks like pretty much everyone giving job talks has 15+ papers from grad school alone.

  12. > If your paper is rejected, nothing.

    Maybe true if you are a long-tenured professor. But if you are a grad student, you invested 12 months of your life into doing the work, and the clock is ticking until you need to graduate and move on. Putting nothing on your CV doesn't help that happen.

    > With a focus on important publications and measures like H-indices

    I find this somewhat contradictory: everyone I know who has a high H-index publishes lots of papers. If you write something that is really important, there are plenty of good reasons to write a flurry of offshoot papers. They may not be groundbreaking or difficult, but they will help guide at least a subset of your prior audience.

  13. I am a graduate student who moved to CS from pure maths, and I don't see one good reason to publish papers that are incremental. Maybe my voice is insignificant next to the big names who have already commented above me, but I feel responsible to the community to make my point. Personally, I prefer submitting to journals rather than conferences. I am fine having 3 good publications rather than 15 insignificant ones.

    I feel that if the hiring process is directed so much by the number of papers rather than the number of significant papers, the computer science departments at such schools need to do some introspection. I will cite the story of how Harold Stark got tenure at MIT. His graduate thesis was on one of Gauss's conjectures. He did not have many results in the 4-5 years after graduating, but then he proved the conjecture and got tenure at MIT.

    2. On the point of papers getting rejected or accepted: we also need to ask why papers with subtle errors get accepted. This is again due to the "incentive" of getting a paper accepted at a top venue. Frankly speaking, a PC cannot give that much time to proofreading 40-50 pages. I believe that, as a community, we should understand our responsibility to submit papers only when we are thoroughly convinced that every step of the proof is correct. The PC should just be delegated to weigh whether our work is good enough to be among the 30-odd papers that end up being accepted.

    One personal experience: at one conference, I was discussing with two tenure-track professors (say X and Y) one of their coauthored papers. Incidentally, X and Y were wrong about the major contribution of their own work, and when I showed them the paper, the response of one of them was, "I don't remember what I worked on at that time." The paper was published online in 2010 and I was talking to them in 2012. If you cannot remember what you worked on two years back, think twice about whether you really worked on it!

    That sums up my frustration, because as a graduate student I need to do background reading covering many such papers.
