Thursday, February 01, 2018

Flying Blind

Many computer science conferences have adopted innovations such as a rebuttal phase, multi-tiered program committees, and a hybrid journal/conference model with submission deadlines spread through the year. Not theoretical computer science, which hasn't significantly changed its review process at the major conferences since allowing electronic submissions in the early 90s, apart from an ever-growing program committee, now at 30 for FOCS.

Suresh Venkatasubramanian ran a double-blind experiment for ALENEX (Algorithm Engineering and Experiments) and laid out an argument for double-blind reviewing at broader theory conferences, to limit the biases that come with knowing the authors of a paper. The theory blogosphere responded with posts by Michael Mitzenmacher, Boaz Barak and Omer Reingold, and a response by Suresh. I can't stay out of a good conference discussion, so here goes.

Today, major orchestra auditions happen behind a screen, with musicians even told to remove their footwear so sounds won't give away their gender. At the other extreme, the price of a piece of art can increase dramatically if it is discovered to be the work of a famous artist, even though it is the same piece of art. Where do research papers lie? A paper is more a work of art than a violinist in a symphony.

Knowing the authors gives useful information, even beyond trusting them to have properly checked their proofs. Academics establish themselves as a brand through their research, and you learn to trust that when certain people submit to a conference you know what you are getting, much as you would be more likely to buy a new book from an author you love.

Suresh rightly points out that author names can and do produce biases, often against women and against researchers at lesser-known universities. But we should educate program committee members and trust that they can mitigate their biases. We could completely eliminate biases by choosing papers at random, but nobody would expect that to produce a good program.

Having said all that, we should experiment with double-blind submissions, because everything I said above could be wrong and we won't know unless we try.

17 comments:

  1. A proof should NOT be seen as a work of art. It is a PROOF. If it is correct, then it DOES NOT MATTER who worked it out. A paper in COMPUTER SCIENCE THEORY should stand on its own.

    Please do not sit in your ivory tower, wearing manly shoes, poo-poo'ing the biases, yammering on about known authors bringing more than trust, throwing out insincere strawman "options" like choosing papers at random, and tossing in a "we won't know unless we try" at the end to make it seem like you are really willing to change.

    Between known names getting elevated and conferences costing thousands to attend the ivory tower seems to keep itself very secure.

    ReplyDelete
    Replies
    1. My experience is that many advanced proofs are works of art, similar to paintings (more than music), to the point that one can often recognize the author from the proof style itself.

      I also have secondhand experience (i.e., what others have told me) of the world of classical music. That world acts in some ways like the academic community, with some important exceptions, one of which is having a culture of bias. In the music world, who you know really matters (partly by design). Specifically, for a first-round orchestra audition (the foot in the door), the current sentiment is that name recognition and connections should cease to matter.

      As a general rule, comparisons of the academic and musical worlds are interesting, but should explicitly take into account both similarities and differences.

      Delete
  2. You say: "Suresh rightly points out that having authors names can and do produce biases, often against women and researchers at lesser known universities. But we should educate the program committee members and trust that they can mitigate their biases."

    Lance, can you point me to the studies or any evidence showing that educating and trusting a PC (or people in general) is an effective strategy for mitigating biases? I've apparently missed the scientific evidence backing that approach.

    I'd of course also like to hear, given that we have distributed PCs that either work online or meet for one or two very busy days to make decisions, what form that "education" would take -- who would administer it, pay for it, etc.

    Thanks! Mike

    ReplyDelete
    I'd like the TCS community to consider zero-blind refereeing. Upload all papers to a forum like openreview.net, with everyone's name fully visible (authors, reviewers, etc.). I think that will lead to more rigorous and carefully written submissions, more civility, more honest reviewing, and less pettiness overall. Conference PCs can then simply look at the most recent six months' worth of papers on the forum and invite whomever they like to give oral presentations, keynotes, posters, lightning talks, PhD forums, industry perspectives, etc. at the conference venue. They can obviously take into account the level of interest expressed in each paper; when they make decisions using information that everyone has access to, we can expect fairer and more meaningful decisions -- after all, their integrity and reputations are also on the line. The world needs more (validated) theory papers, not fewer; an open and inclusive forum might be the best way to go.

    ReplyDelete
    Replies
    1. I'm not sure if this is a workable idea (are there communities that currently work this way successfully?), but it sounds like an interesting idea. I wonder if it could be done at a single-conference level? It's certainly an interesting option to think about.

      Delete
    2. For years now I have been interested in seeing a publication venue where the referee reports are public. I have had good papers rejected without any real justification, including cases where the referees admitted to not reading the paper carefully... A colleague who used to work at a prestigious Ivy League institution says he can tell the difference: papers accepted while he worked there were weaker than the ones he gets rejected nowadays.

      Openreview.net is really interesting.

      Delete
    3. I think we might want to be careful about this. Izabella Laba has an excellent article explaining how public forums with visible author identity tend to turn into open season on female authors (this is why she doesn't participate in MathOverflow). I think there's a considerable amount of evidence from the current state of our public discourse that requiring identities doesn't necessarily help. Moreover, anonymity actually protects members of underrepresented groups when they want to make statements in public.

      https://ilaba.wordpress.com/2011/03/28/why-im-not-on-mathoverflow/

      More generally, I'd really like to emphasize that issues of institutional bias are very, very subtle (which is not to say that they're not real or profoundly impactful). While it's good to hear suggestions of what one might do to mitigate this, it's also important to realize that almost certainly anything one suggests has been studied before, and there is a wealth of scholarship on what works and doesn't.

      Delete
    4. I think this is actually a horrible idea, for a number of reasons, although it sounds nice in theory.

      First, no junior person would then ever critically review the work of a senior person - too many are willing to be vindictive, and junior people are dependent on getting good letters during job searches and for promotion. I'd have been terrified to ever review anything for certain people in my field who are known to have a temper - and if I wasn't sure, I wouldn't have reviewed it, just to be careful.

      In addition, women and minorities are significantly less active in most online venues, and there is documented evidence that they get very different reactions when posting things online in general. In general, if you're not senior and extremely confident in online interactions (which tends to be much less true for women and minorities), it is not worth posting things publicly - too many consequences and ways it can be held against you, and not worth the hassle of the resulting attacks.

      Delete
  4. In Science and Math, the results are supposed to matter to the complete, total, and utter exclusion of the author. In judging papers with knowledge of their authors, you risk letting deficits of rigor slide on account of personality. Further, you put people into positions of power that can be abused. And you stifle younger researchers who find themselves pressured to add non-contributing senior peers to their papers because of the value of their names; if the paper is actually good, it inflates the value of that non-contributing senior peer's name to the point where their junior peers can never eclipse them.

    If you want to have reviewers keep knowing the authors' names, then stop pretending to be Science and Math and accept that you are acting as a religion, a loose collective of cults in which some people are more right than others.

    ReplyDelete
  5. I haven't reviewed for CS in quite a while, but I do review for CSCL (computer-supported collaborative learning) and ICLS (the International Conference of the Learning Sciences), and they employ blinded reviewing. My experience is that it works. Full blinding is not really possible if you know the literature well, since you can sometimes figure out which research group a paper came from just from the topic. A distinctive writing style can also give an author away.

    ReplyDelete
  6. By far, most of the people arguing against double-blind have been tenured males, usually white, at prestigious institutions. Exactly the subset of scientists who stand to "lose" the unfair advantages they have enjoyed for so long. There is zero evidence against double-blind, and lots and lots of it in its favor.

    ReplyDelete
    Replies
    1. Notice that most of the comments in defence of double-blind are made by white (probably even heterosexual) males :(

      Delete
  7. There is a culture of making research results publicly available very early, via posting on ECCC or arXiv. Can someone please explain to me how double-blind reviewing is consistent with this culture? Should authors submitting to such conferences be forbidden to post their work (listing their names and affiliations) on such sites?

    In the current conference program committee culture, there are many steps taken to remove conflicts of interest, by having PC members not participate in any discussions of papers submitted by colleagues, former students, recent co-authors, etc. How are such conflicts identified when double-blind reviewing is used?

    I am a tenured white male at a fairly good university, and on the frequent occasions when I've had papers rejected, it doesn't feel as if I'm receiving preferential treatment, but of course I'm not a good judge of such things. I wonder if there have been any statistical studies showing that submissions by various groups (female, male, Asian, Caucasian, etc.) from the same institution have significantly different acceptance rates. I haven't noticed any of these groups getting preferential treatment ... but then again, perhaps I wouldn't notice. In contrast, I have noticed that submissions from different institutions have very different acceptance rates, but this is consistent with my perception that some institutions have more research activity than others.

    -- Eric

    ReplyDelete
    Replies
    1. Eric, to answer the question in your second paragraph, you might check out my posts that Lance links to. I'll emphasize that a large fraction of CS conferences are double blind, and there are many many standard processes that are implemented to facilitate conflict of interest management and the review process. This is common enough that most standard conference management software (CMT, Easychair, hotcrp, softconf and so on) have systems in place to deal with this (semi)-automatically.

      Delete
    2. If you check out the list of STOC 2018 accepted papers, it seems that a large number of them are not publicly posted (on arXiv or anywhere else) and have no talk abstracts available -- in fact, no abstracts at all!

      So I think for a large percentage of papers, double blind makes sense. Anonymity need not be obligatory; authors could be allowed to post their papers whenever they wish.

      Also, missing anywhere in this conversation is that during the straw poll at the SODA 2018 business meeting, the vote was around 35-11 for double blind. Why can't the overwhelming opinion of the community be taken into account?

      Delete
  8. Science is not a democracy. If a work is really good, coming from anybody anywhere, then it will eventually triumph, surpassing all possible barriers, against all opposition, no matter where its authors are from.

    ReplyDelete
  9. I agree with the other Bruno: review reports, acceptance and rejection letters should be public. This makes the submitter and the reviewer more careful.

    My personal experience in TCS is that people try hard to review fairly. But will this still be true in 20 years? I think one should try to make the reviewing system as robust as possible.

    ReplyDelete