Thursday, October 23, 2014

Guest Post by Dr. Hajiaghayi: A new way to rank departments

(This is a guest post by MohammadTaghi Hajiaghayi. His name is not a typo; his first name really is MohammadTaghi.)

Because we believe the methods U.S. News uses to rank CS departments in theoretical computer science (and in general) lack transparency and well-defined measures, my Ph.D. student Saeed Seddighin and I have worked for several months to provide a ranking based on a real, measurable method: the number of TCS papers produced by the top 50 US universities. To make this possible, we gathered information about the universities from various resources. You may see the ranking and our exact methodology here.
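To give a rough idea of the kind of score the ranking is built on, here is a minimal sketch in Python. The toy (school, paper) records and the aggregation below are only an illustration, not our actual pipeline; the exact conference list, author-to-school mapping, and methodology are described in the page linked above.

    # Minimal sketch of a paper-count ranking (illustration only).
    from collections import Counter

    # Hypothetical (school, paper_id) records, as if extracted from a DBLP-style
    # database for each school's current faculty.
    records = [
        ("MIT", "soda14-001"),
        ("UMD", "soda14-001"),   # a paper credits every school with a coauthor on it
        ("MIT", "stoc14-017"),
        ("UMD", "focs14-042"),
        ("UMD", "soda14-099"),
    ]

    # Score each school by its number of credited papers and sort, highest first.
    scores = Counter(school for school, _ in records)
    for rank, (school, score) in enumerate(scores.most_common(), start=1):
        print(f"#{rank} {school}: {score} papers")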

Indeed, we also have some initial rankings based on similar measures for computer science in general, which we plan to release soon (we are still double-checking, or even triple-checking, our data and analysis because of several factors). The CS theory ranking is our initial release, intended to gather feedback at this point.

Please feel free to give us feedback (hajiagha@cs.umd.edu).

48 comments:

  1. It seems like there needs to be some kind of normalization with respect to the size of the group. It's funny that UMD is ranked 6th on the list!

    Replies
    1. It's clear that a bigger group must be ranked better than a small group.

  2. This is a rather interesting effort to bring transparency to rankings. Whatever biases this measure has, they are out there in the open. For one, it favors a large department with so-so researchers over a small group of superstars. It is also highly sensitive to very productive people being omitted. For example, I noticed that Erik Demaine is not listed in the Brown database, which alone removes about an extra 100 points from MIT's total. Similar omissions elsewhere mean that a department could easily be listed 5-10 places below its true rank.

  3. This appears to be counting the career total of papers for the current faculty. That's a bit of an odd metric, rather heavily dependent on whether the senior theorists at a given school have yet retired. Papers in the last five or the last ten years would probably tell you a lot more about the current state of the group.

  4. The hunt for *the* quantitative evaluation continues... but maybe, just maybe, we need more Mulmuleys or Renegars or Aaronsons and fewer Hajiaghayis. This quantitative approach won't help with that.

    Replies
    1. Totally agree.

    2. A superlike for that. Is there a way to devalue a researcher who always writes papers with 5 other coauthors? I wish, I wish.

    3. You guys got it wrong. The question here isn't "are these the perfect rankings" but rather "are they preferable to the other ones out there." Once we are given tools to lower the impact of size and concentrate more on recent output, there is no doubt in my mind that these rankings would be superior to those of the US News & World Report, which admittedly set a low bar themselves.

    4. Anonymous at 12:51 / 1:34 PM - an easy quantitative way to achieve that would be to say that each researcher gets a lifetime limit of, say, 150 papers. Whatever scientific discoveries one wants to make can certainly fit in 150 papers.

      If such a policy were in place, Dr. H would be very close to striking out by now, at least according to DBLP. :)

    5. To Anonymous 1:34 PM, October 23, 2014, who said:
      "A superlike for that. Is there a way to devalue a researcher who always writes papers with 5 other coauthors? I wish, I wish."
      Devalue? The role of publishing a paper is not just to strengthen your resume. People collaborate to come up with new ideas and techniques to advance our knowledge.

    6. Exactly who is the superior house contractor: the one who brings in the appropriate specialist for each phase of the construction, or the lone wolf, jack of all trades, master of none, who builds the house alone, badly, and at much greater effort?

      Nowadays, as a rule of thumb, people who always publish alone are not grade-A scientists. Of course there are well-known exceptions to this rule, but that is what they are: exceptional cases.

      For the regular folk like the rest of us, publishing in teams is optimal in terms of quality and amount of scientific output.

    7. Yes, devalue. Collaboration is very important. But when someone consistently writes 100 papers with 5 other coauthors, the noble goal of collaboration is lost. It often becomes sharing credit for others' work. Most people know who does the real work and who does not. I do not need to argue about it here. That is one reason why this ranking makes very little sense.

    8. Wow, you are stating right out loud that five-author papers are dishonest? Either you have some magic powers of perception, or you have simply reduced yourself to making up facts verging on the slanderous.

      Just because you don't know how to collaborate with four other people doesn't mean others don't. Or do you care to enlighten us which of Arora, Lund, Motwani, Sudan, and Szegedy didn't deserve to be there?

    9. Anonymous 7:10 PM:
      Master of none? That's some nerve you have, man. Academia is not a contracting business, as you want to make it. Learn to look UP to people who work hard. And don't lump everyone together with "the rest of us".

      Anonymous 9:31 PM:
      So you have found one paper in this whole list (http://en.wikipedia.org/wiki/G%C3%B6del_Prize). Go and check how many other significant works were done by Arora, Sudan, Lund, Motwani and Szegedy. Collaboration is of course good, but hiding behind four coauthors for an entire career is not. Just because you don't know how to solve a problem by yourself doesn't mean others don't.

    10. To "Yes, devalue..." Have you done enough collaboration in your life ? I am not sure. Do you really think if people know somebody does not do his/her share still continue collaborating with him/her?

  5. To Anonymous 12:51 PM: Please try to be constructive. This is an actual measure; surely you can define another measure that is reasonable and base your findings on that.

  6. Finally, what we have been waiting for -- a ranking that puts UMD above Stanford, Cornell, and Harvard for CS theory! :-)

  7. Why is it interesting to rank departments? Surely there are better ways to spend one's intellectual effort (and one's students' effort) than investigating such banal questions.

  8. To Lee: Indeed the ranking is very important for hiring faculty, recruiting Ph.D. students, job placement of Ph.D. students, grants, etc.

  9. I think that ranking schools is tasteless, but for the record the ranking is wrong in my opinion: each paper should be credited 1/n, where n is the number of authors.

    Otherwise, we can get absurd results, like a school with 10 faculty members writing a single STOC paper with 10 authors being ranked the same as a school with one faculty member producing 1 STOC paper.

    Or 100 schools cooperating to write a single SODA paper with 100 coauthors each being ranked the same as a school with one faculty member producing 1 SODA paper.
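    To make the 1/n proposal concrete, here is a minimal sketch in Python comparing raw counting with fractional credit; the records below are made up, and the real ranking may aggregate per-school credit differently.

      # Raw paper counts vs. 1/n fractional credit (made-up data, illustration only).
      from collections import defaultdict

      # (school, faculty member, number of authors on the paper)
      records = [
          ("School A", "alice", 1),   # solo paper: full credit under either rule
          ("School B", "bob",   10),  # 10-author paper: 1/10 credit under the proposal
          ("School B", "carol", 10),  # same 10-author paper, another coauthor at School B
      ]

      raw = defaultdict(int)
      fractional = defaultdict(float)
      for school, _faculty, n_authors in records:
          raw[school] += 1                      # current measure: +1 per (faculty, paper)
          fractional[school] += 1 / n_authors   # proposed measure: +1/n per (faculty, paper)

      print(dict(raw))         # {'School A': 1, 'School B': 2}
      print(dict(fractional))  # {'School A': 1.0, 'School B': 0.2}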

  10. To Anon 2:17 pm, actually the ranking is not important at all, and in any case has no scientific significance, obviously.

    Replies
    1. You state your opinion in rather absolute terms, with no evidence to back it up. People, rightly or wrongly, rely on rankings at various times. For that reason alone, rankings are important.

  11. This is like a press release from the Soviet Government - it achieves the exact opposite of the intended effect.

    Specifically, the ranking claims to improve upon US News. However, simply sorting departments by paper count is probably one of the few ways to produce a ranking worse than US News.

  12. Seems like the UMD controversies continue--first Hal Daumé with "we don't need algorithms" (http://nlpers.blogspot.com/2014/10/machine-learning-is-new-algorithms.html), and now Hajiaghayi. Why announce a naive ranking which has obvious omissions (EC, APPROX, ITCS/ICS, ...), data errors (Minnesota listed twice), and gives papers published in the seventies and eighties equal weight with recent ones? What is the purpose of such a futile endeavor? I have a request--please do not publish the full ranking for computer science, or put some real effort into it if you insist.

  13. Anon 2:33, do you not see the improvement? It is very similar to US News, except that it corrects US News's mistake of not making UMD a top school. This ranking at least gets it partly right and makes UMD #6. I'm sure future iterations will improve and get UMD up to #1.

  14. Also, UMD gets all 27 of Hajiaghayi's SODA papers, even though he wrote 23 of them before he joined UMD!

    Replies
    1. You mean to say all 27 of Hajiaghayi's SODA papers, including the 13 with 4 or more authors, and all with at least 3 authors ;)
      I totally agree with Anonymous 2:28.

  15. We academics spend all our time measuring and ranking other people, be it students with exams, faculty candidates, tenure committees, program committees, or refereeing. All of those measures have well-known and agreed-upon flaws, yet we carry on using them without much opposition.

    Yet every time someone comes up with another measure, the community jumps into an uproar to point out every minor thing the new ranking methodology gets wrong, while overlooking the dramatic flaws of established methods. For example, the h-index, while admittedly flawed in well-known ways, was, at the end of the day, a contribution to the set of tools we have at hand to evaluate an academic career. It is a welcome addition to straight paper counts, reference letters, prestige, interviews and citation counts, all of which have their foibles.

    Replies
    1. Yes, add that to your toolkit. Create a ranking of schools based on each of these measures, compare the rankings, check how they evolve over time, and write an interesting KDD paper. Just don't try to make a big deal out of it.

  16. If UMD being #6 bothers you, check http://www.cse.ucsd.edu/node/2652
    Who makes news based on an erroneous ranking!

  17. His 27 papers in total would count as 8.32 if you gave 1/n credit to an author for writing a paper with n authors. Is that the same as writing 8.32 single-author papers? Of course not, but IMHO it is a better measure.

  18. To Anonymous 2:28 PM: Actually, I do not think the ranking would change that much just by assuming "each paper should be credited 1/n, where n is the number of authors". Look at SODA'15. There are few papers with one or two coauthors; the majority have between 3 and 5 coauthors, and when you work with big numbers these small changes cannot affect the result that much. OK, your new measure can change those whose scores are close, but so what? We know by definition that if scores are close (just look at the scores), their ranks are permutable. So I think your point does not make that much sense after all.

    Replies
    1. Do you have the numbers to back up your assertions? Just for sampling, on the first and last page of accepted papers, we have

      1 author = 6
      2 authors = 6
      3 authors = 13
      4 authors = 2
      5 authors = 4

      This is a sampling, and I counted very quickly, so there may be mistakes, but it doesn't look like you are right.
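      For what it's worth, applying the 1/n rule to just this sample (which by itself says nothing about how school ranks would shift) shrinks the total from 31 raw papers to a fractional score of about 14.63:

        # Raw count vs. 1/n credit for the sampled SODA'15 author counts above.
        sample = {1: 6, 2: 6, 3: 13, 4: 2, 5: 4}   # number of authors -> papers sampled

        raw = sum(sample.values())
        fractional = sum(papers / n_authors for n_authors, papers in sample.items())

        print(raw)                   # 31
        print(round(fractional, 2))  # 14.63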

    2. Exactly. Both measures are highly correlated so it does not make much difference which one you use.

      This has been studied before and someone quite conveniently and serendipitously posted such a graph today.

  19. Indeed, Dr. Hajiaghayi has done a reasonable job with a reasonable result. I do not see the point of personal attacks made anonymously. I think such people should be brave enough to at least mention their names, so that their credibility can be considered as well (or just be filtered by the blog moderators).

    Replies
    1. I don't think these are personal attacks - more like humorous chuckling. The substantive issue is that the ranking:

      1) Does not take paper quality into account at all.
      2) Misses a normalization factor - the number of coauthors on each paper.
      3) Misses a second normalization factor - the size of the group.

      As a result, you get a ranking that is hugely biased toward quantity over quality. Yet in the end it is offered to us as a ranking of the quality of academic departments.

    2. Why are you writing this as an anonymous poster? That just proves anonymity has its merit. People can be more honest, and the moderators do reject comments if they are abusive or off-topic. I have not seen any such comment here.

      When I cast my vote, I do not write my name on it. Reviewers' comments do not come with their names.

  20. Not including papers by students and postdocs is another huge omission. That they write many of the best papers in theory is well known; see the FOCS 2014 best paper (which has only student authors) as a recent example.

  21. Sick:
    Hey, I have a great idea. Let us just base our ranking on counting papers. Wow, how novel!

    Sicker:
    Oh, by the way, this paper-generator metric works so great, especially for me and my department. Fantastic! Stanford and Cornell, eat that... And Gödel, with three papers in all, you are out of luck, man, in this magical world of "Big Data".

    Sickest:
    Oh, let's pat ourselves on the back for being so great and awesome with just the right press release. After all, it is all just marketing, right? What does the ranking matter?

    Really, folks, when this came out I first thought it was an April Fools' joke. Then I quickly realized that it was still just October!

  22. Here's an anecdote that might be relevant here:

    A few years back, at a discrete math conference, I overheard some middle-of-the-road mathematicians nitpicking about the unusual research style of one of the superstars in the field, who had just finished presenting a solution to a decades-old conjecture.

    Needless to say, their criticisms came across as nothing more than petty jealousy.

    And no, it wasn't Erdős, but it could well have been, since people also used to mumble under their breath about various aspects of his output, including that many of his papers were collaborations with other authors (gasp!) at a time when most papers in math were single-authored (how dare he collaborate).

    Replies
    1. You are missing the point here. No one wants to condemn collaboration. Collaboration is a good thing. But the author of a single-author paper has put in significantly more effort than each author of a five-author paper of similar quality. And this hard work does not go unnoticed in the theory community. It may sound corny, but reputation is something that you earn. The universities that have such well-respected researchers should rank higher than those that do not. But this ranking overlooks that normalization factor, so some people have far more impact on these ranks than other, more deserving ones, which is grossly unfair.

  23. I agree with the idea of introducing quantitative measures, but just using the number of papers doesn't make much sense to me. Once we go quantitative, we should really take it as far as possible :)

    For instance, we can start counting the major awards in theory as a rough measure of the quality of papers:

    * Nevanlinna Prize, Fulkerson Prize = 10
    * Gödel Prize = 7
    and so on ...

    At least the ranking by Dr. H looks better than the US News ranking (which is clearly politically motivated; two *rich* and powerful schools get ranked much higher than they should be). But I would be really interested in seeing a more refined measure that does not simply rely on bean counting.
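    To make the suggestion concrete, a toy version of such an award-weighted department score could look like the sketch below; the weights follow the example above, and the award list for the hypothetical department is invented.

      # Toy award-weighted score (weights from the example above; data invented).
      AWARD_WEIGHTS = {
          "Nevanlinna Prize": 10,
          "Fulkerson Prize": 10,
          "Gödel Prize": 7,
      }

      # Awards held by the current faculty of a hypothetical department.
      department_awards = ["Gödel Prize", "Gödel Prize", "Fulkerson Prize"]

      score = sum(AWARD_WEIGHTS.get(award, 0) for award in department_awards)
      print(score)  # 7 + 7 + 10 = 24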

  24. I think it would be interesting to look at the number of papers for each school in the last 5-7 years to see how the ranking changes over time.

  25. To Anonymous 7:53 PM: I agree about adding more measures and then taking, say, a weighted average. However, the weights and measures should be well thought out as well. At the same time, compare Rank 1 and Rank 1+1/2 (Rank 1 gives more weight to quality; that is, quality is already considered by Dr. H). The ranking really does not change that much except for close schools, which are close anyway (note that since this ranking comes with scores, you can compare two schools and see, for example, how much stronger school A is than school B).

  26. Here are my two cents: publications might indeed be more robust than awards for ranking purposes. For conferences, there is often a large PC and many reviewers who select the papers. What about awards? Often a very small set of people from high-ranked departments select people from the same high-ranked departments (so those rankings just get boosted).

  27. (I hope it's OK to post non-anonymously on this thread.)
    I think a scientific approach would try to define what the goal of a ranking is, and in what sense this ranking is "better" than the US News ranking, apart from being more "transparent" (for which one could also rank in alphabetical order).

    For example, one potential way to compare two rankings is to choose a random pair of universities A, B such that A is ranked much higher than B in the first ranking and much lower in the second ranking, and then gather data on the results of "head-to-head competitions" between A and B (e.g., for grad students or faculty) and see with which ranking the data agrees. For example, one could compare UMD (#6 in this ranking, below #17 in USN) with Harvard (#21 in this ranking, #7 in USN).

    However, without such data, the best approach I can think of to understand which ranking is better is simply to "eyeball" the top 10 list of each one and see which makes more sense to me. I am including these lists below for the US News vs the proposed ranking so people can make their own judgement.

    Boaz Barak

    US News ranking
    #1 University of California - Berkeley (Berkeley, CA)
    #2 Massachusetts Institute of Technology (Cambridge, MA)
    #3 Stanford University (Stanford, CA)
    #4 Princeton University (Princeton, NJ)
    #5 Carnegie Mellon University (Pittsburgh, PA)
    #6 Cornell University (Ithaca, NY)
    #7 Harvard University (Cambridge, MA)
    #8 Georgia Institute of Technology (Atlanta, GA)
    #9 University of Washington (Seattle, WA)
    #10 University of Texas - Austin (Austin, TX)

    Proposed ranking
    #1 Massachusetts Institute of Technology
    #2 Carnegie Mellon University
    #3 University of California - Berkeley
    #4 Princeton University
    #5 University of California - San Diego
    #6 University of Maryland - College Park
    #7 Cornell University
    #8 Georgia Institute of Technology
    #9 Stanford University
    #10 Duke University

  28. Eye-opener. So this is how academia conducts discussion, personal attacks, and criticism - hiding behind the mask of anonymity. (Ironically, so is this comment.)

  29. I've probably let the comments go a bit too far already so I'm closing comments on this post.
