## Wednesday, September 29, 2010

### NRC Rankings

The NRC "rankings" of graduate programs were released yesterday. I put up a Google spreadsheet of the CS rankings. Phds.org will also generate rankings using various criteria.

You don't get an actual ranking from the NRC; rather, you get two ranking ranges: a statistics-based R-ranking ("R" for regression) and a reputation-based S-ranking ("S" for survey). For example, Northwestern CS has a 90% chance of being ranked between 42nd and 73rd (R-ranking) and between 26th and 69th (S-ranking).

Even with these wide ranges, there are a number of problems with the CS rankings: no citation information was used, the data is five years old, and much of the information is inaccurate (as the Pontiff complains). A simple sanity check on any ranking of CS departments: the top four should be some permutation of MIT, Berkeley, Stanford and Carnegie-Mellon. Congrats to Princeton for breaking this check.

Why no citation information? The NRC originally used ISI data, which didn't cover most conference proceedings, the venues computer scientists consider their main outlets for publication. After some discussions with the CRA, the NRC acknowledged this problem and decided to ignore CS citation data, citing lack of time to find a different approach.

Yesterday the CRA released a statement on the reliability of the CS rankings.
> CRA is pleased that the NRC acknowledges there are errors in the data used to evaluate computer science departments and that, in the words of NRC Study Director Charlotte Kuh, "There's lots more we need to look at for computer science before we really get it right."
Schools are taking the opportunity to promote their rankings. Northwestern Engineering boasts how well they did with some nice charts. We're certainly not the only ones.

Did the NRC fulfill its main mission of giving a valuable alternative to the US News rankings? US News uses a "wisdom of crowds" approach, simply surveying department and graduate program chairs. The NRC tried a scientific approach, which led to years of delays, many complaints about methodology and accuracy, and the lack of a true ranking. After all the measures we can feed into a formula, the one thing that draws students and faculty to a department is its reputation. Complain as you will about US News, they do try to capture that one statistic.

Meanwhile, what do I say to prospective graduate students who cite the low Northwestern CS numbers? I could mention the problems in the methodology, point to the CRA statement, or say the numbers are based on data that predates most of the theory group. More likely I will fall back on that trite but true statement: you shouldn't choose a graduate program for its numbers, but for its people.

#### 58 comments:

1. Thanks, this is great.

Just as a tip, on the spreadsheet if you go to the Tools menu in Google Docs and click Freeze Columns->Freeze 1 column and Freeze Rows->Freeze 1 row, the spreadsheet will be much easier to read when people scroll.

2. There are other rankings:
CS Overall: http://academic.research.microsoft.com/CSDirectory/org_category_24.htm
Algorithm & Theory: http://academic.research.microsoft.com/CSDirectory/org_category_1.htm

3. 1) I view these rankings as large equivalence classes. I want to say that the top X are all close enough that we shouldn't worry about who is 1 vs 2 vs ... vs X, and the next Y are all close enough. Not quite sure what X and Y are (and Z and ...).

2) IF you are going to grad school and you KNOW you want to work in area A, then find out which schools are really good in area A.

3) We all want a high ranking so we attract good students. We all want to attract good students so we will get a high ranking. When does it end?

4. The economics ranking places Wyoming on top of Harvard in terms of publications, and the math ranking places MIT just a hair in front of Penn State.

5. The statistics rankings list the top 4 as:
1) Stanford
2) Harvard
3) Iowa State
4) Berkeley

6. Want to do electrical engineering? Don't go to MIT (13th) or CMU (18th). Better to attend UC-Santa Barbara (5th) or Purdue (9th) instead.

7. Oh, but you want to study applied math! I urge you to shy away from MIT (12th). Instead, may I suggest the University of Arizona (5th)!

8. Ah, how I love the smell of smug, condescending, and anonymous MIT attitudes in the morning. Seems the rankings are bringing them out quite nicely.

9. I'm tired of the complaints. Department chairs, if you want accurate rankings then release accurate data. You can't keep your data secret and then complain that your department isn't evaluated fairly.

Scientists should be open with their information, but key information for prospective graduate students is kept under lock and key.

10. Just survey grad students and postdocs in an area and ask them at which university they would want to work after they finish. Combine this with publication records in the top journals/conferences in that area and you will get a pretty good list of top departments.

11. You can also include average earnings of a graduate in an area from each school.

12. These rankings took about 15 years to produce. Someone should seriously be fired, since they don't even pass simple sanity checks in many fields.

13. Some formal whining about the rankings by UW: http://www.cs.washington.edu/nrc/

14. It's clear that the NRC folks don't even understand the basic computer science principle of "Garbage in, Garbage out". How can we expect them to rank Computer Science programs?

I hope that if sufficiently many CS departments lodge a formal complaint (I doubt the big winners, such as Princeton will cooperate, but they should), NRC will have the good sense to withdraw these rankings. But I may be hoping against hope.

15. I think that the mathematics rankings (which are probably more important to TCS than the CS ones) are in the right ballpark. The only mildly surprising fact is the relatively low rank of the University of Chicago (which might have some explanations that I am not aware of). It is remarkable how far MIT has fallen across the board compared to its 1995 rankings. But this was probably expected. NYU is a big winner, but with three Abel Prize winners and one Chern Prize winner on the faculty, this is not so surprising either.

16. Really? My impression of where PhD applicants in math enroll if admitted is in order:
1) Harvard
2) MIT
3) Princeton
4) Stanford/Chicago/Berkeley

The NRC rankings suggest that a student admitted to Chicago, Caltech, Columbia, UCLA, Cornell, Brown, etc. should turn them all down in favor of Penn State.

17. Really? My impression of where PhD applicants in math enroll if admitted is in order:

Well, your impressions are based on, well, your impressions. The NRC rankings are based on some hard data. It's likely that Penn State has improved a lot recently (which I believe is true). PhD applicants should and will take note. Despite all the whining, the fact of the matter is that this new ranking will slowly but surely change the pecking order in the choice of graduate schools (never mind the personal impressions).

For example, the University of Washington at Seattle has a history of offering low salaries on the grounds that Seattle is such a desirable place to live (at least in math). Well, the chickens have come home to roost, as they say.

18. "NRC will have the good sense to withdraw these rankings."

This would not be a good outcome. Rankings are very useful. They are difficult to produce because of obstructionist departments. This is not the NRC's fault.

19. It is not the NRC's fault that it thinks UW has 90 faculty members and sends 0% of its students to academia? It's not the NRC's fault that it is unaware of any source for computer science conference citations?

20. Thanks for the tip Anon 1. I did as you suggested.

21. Criticize the results of the study all you want. Sure, there are imperfections in the data collection process and methodology, resulting in rankings that don't seem to make sense in some cases. However, one thing I, as a scientist and researcher, have tremendous respect for is the NRC committee's honesty and integrity. See, for example, these quotes from
http://www.insidehighered.com/news/2010/09/29/rankings

"The advance briefing for reporters covering Tuesday's release of the National Research Council's ratings of doctoral programs may have made history as the first time a group doing rankings held a news conference at which it seemed to be largely trying to write them off."

"When one of the reporters on a telephone briefing about the rankings asked Ostriker and his fellow panelists if any of them would "defend the rankings," none did so."

Such honesty is very refreshing, compared to the lies and spinning coming out of many organizations these days.

22. "Wisdom of models" (i.e., ensemble learning) would suggest taking the average of the NRC, US News, and perhaps a third ranking based on citations in ACM DL.

23. Time to send your top undergrads to Santa Barbara. They can have a beach party every day and be at the #12-ranked place.

Never mind that UCSB has never produced a worthwhile PhD in their entire history.

24. Can we learn from this? Why did the rankings not work? What can we, or the people who make rankings, do in order to get rankings that do work?

25. Lesson No. 1:

Don't output a random permutation.

26. Can we learn from this?
Why did the rankings not work? ...

This is a typical strategy for discrediting findings not to one's liking. Every department should do some soul-searching. Departments that did well should identify their strengths and continue to build on them. Those that slipped should seriously discuss internally what caused the slippage. Bad-mouthing the NRC serves no purpose; like it or not, the new rankings are here, and they are what they are. The only constructive path for those who are unhappy with their rankings is to try to improve in the next round.

27. "The NRC rankings are based on some hard data."

No, they're not. The NRC rankings are based on wildly inaccurate statistical extrapolations of badly collected and obsolete data.

For instance, to count the number of faculty with grants, the NRC could have called up the grants and contracts office at every university and asked them for the exact numbers. Instead, they polled a small "random sample" of the faculty, and extrapolated from there.

Data about awards per faculty, strangely one of the highest-weighted factors in both the survey and regression rankings, is skewed by the bizarre perception that membership in the national societies (NAS, NAE, and AAAS) is five times as prestigious as a Turing Award or a Nevanlinna Prize. ACM Fellows and IEEE Fellows don't count at all.

The NRC correctly gave up on counting citations, but still screwed up counting publications. Rather than scraping a standard database like DBLP or the ACM Digital Library (which both have significant gaps), they extracted publication counts from individual faculty CVs. Never mind that CVs are not in a standard format, some people list conference and journal papers together and some don't, and not everyone keeps their CV up to date or even posts it at all.

The placement data doesn't even pass a basic smell test. The top five departments for placing PhDs into academic positions are Georgia State (even though the actual percentage is missing), UC Riverside (59%), U Southern Mississippi (56%), Vanderbilt (50%), and Kentucky (48%). MIT (25%), Berkeley (25%), CMU (22%), and Stanford (19%) aren't even in the top 50.

To put it simply, the NRC data is absolute garbage. Any conclusions drawn from that data are meaningless.

(My department did pretty well.)

28. No amount of "strength" or "slipping" could have placed Caltech at 64 and UCSB at 12. The only reasonable place for these rankings is the trash can.

29. Not just TCS, but the whole North American STEM enterprise, is grossly failing "The 1958 Test".

The "1958 Test" is easy to administer: put a 1958 and a 2010 Scientific American article side-by-side. Rate them by relative quality and opportunities presented by the science/ technology/ engineering/ mathematics (STEM), and by the number and quality of the STEM job ads. Use whatever measure of quality you prefer.

The 1958/2010 quality ratio is so strikingly large (by any and all measures of quality) as to help one appreciate that debates about the NRC rankings amount to debates about deck-chair placement on a vessel that has been slowly ... slowly ... yet inexorably ... settling in the water for fifty years.

The reason the "1958 Test" is not more widely discussed is that (IMHO, yet fairly obviously) rather few people retain substantial hope that this long-term decline can be reversed, and even fewer have specific ideas for how to reverse it.

30. LSU > Yale, Rice, and NYU????

31. Caltech 64th in CS? This is a joke!

32. Especially considering that Caltech is full of superstars and Santa Barbara is full of duds.

33. UCSB - theory dudes (not duds!):

Subhash Suri
Wim van Dam
Oscar H. Ibarra

34. Anon, you are comparing an elite, internationally renowned school with the lowest rung of the UC ladder. What else did you expect?

35. If the NRC doesn't withdraw these rankings soon enough, they will go down as the biggest duds from this mess.

36. This is hilarious: http://valis.cs.uiuc.edu/~sariel/cgi-bin/deranker

37. "It is not the NRC's fault that it thinks UW has 90 faculty members and sends 0% of its students to academia? It's not the NRC's fault that it is unaware of any source for computer science conference citations?"

Somebody at UW obviously thought they would play games with the NRC and inflate their faculty count. It backfired. I think UW is to blame.

"The placement data doesn't even pass a basic smell test. The top five departments for placing PhDs into academic positions are Georgia State (even though the actual percentage is missing), UC Riverside (59%), U Southern Mississippi (56%), Vanderbilt (50%), and Kentucky (48%). MIT (25%), Berkeley (25%), CMU (22%), and Stanford (19%) aren't even in the top 50."

I suspect this is just a statistical anomaly. With many very small departments, graduating few students, some are quite likely to have extraordinary placement rates. Larger departments are much lower, though 25% seems too high to me.

38. Start an ACM ranking. Set up a committee, identify the important factors for ranking, design a mechanism to obtain the data about those factors accurately, announce the ranking. It should not be that hard, right?

39. My university is not even listed: Polytechnic

40. ICS acceptances posted on website.

41. Somebody at UW obviously thought they would play games with the NRC and inflate their faculty count. It backfired. I think UW is to blame.

How do you square that with UW's statement, in which UW states that their ranking was hurt by the NRC inflating their faculty count? http://www.cs.washington.edu/nrc/

Just so I understand, your line of reasoning is that UW purposely inflated their own faculty count, then turned around and protested that the NRC inflated their faculty count? Do you also find it a bit too convenient that no UW CSE faculty member was in the twin towers on 9/11? Or that none of them have ever been in a photograph together with Elvis?

42. The numbers for my department from the original spreadsheet seem pretty accurate to me. I'm talking less about the 'overall' ranking, and more about the stats on faculty, productivity, and funding. If I had seen these stats when I was applying to grad school, I would have chosen any of my other options (more or less similarly ranked).

So I'd encourage anyone applying to grad school to look carefully at these fields. In particular: is the department too small? is there a good balance between the number of senior and junior faculty? do most students get research assistantships (if they don't, be warned the department will play it down!)? What is the research output of the department? etc.

43. I suspect this is just a statistical anomaly. With many very small departments, graduating few students, some are quite likely to have extraordinary placement rates.

That might well explain a placement rate of 50% (i.e., n out of 2n for some small n), but 59% is more difficult to understand. The smallest case that legitimately rounds to 59% is 10/17, so there must have been at least 17 students counted from UC Riverside. It could still be a small sample size issue, but I doubt it.

In fact, there is an online alumni list (at http://www1.cs.ucr.edu/people/phd/) which shows lots of people working in industry. It's not clear how to interpret it, since it looks like the information is not complete.

So I'd guess the placement figures are somewhere between misleading and total nonsense.
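[Editor's note: the 10/17 claim in the comment above can be checked by brute force. This is a quick sketch; the function name is mine, not part of any NRC methodology.]

```python
def smallest_fraction_for_percent(target: int) -> tuple[int, int]:
    """Return (n, d) with the smallest denominator d such that
    n/d, expressed as a percentage, rounds to target%."""
    d = 1
    while True:
        for n in range(d + 1):
            if round(100 * n / d) == target:
                return n, d
        d += 1

n, d = smallest_fraction_for_percent(59)
print(f"{n}/{d} = {100 * n / d:.2f}%")  # → 10/17 = 58.82%
```

So at least 17 students must have been counted at UC Riverside, confirming the comment's arithmetic.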

44. Statistics we like are facts. Statistics we don't like are incorrectly collected.

45. The NRC rankings are here to stay. For one thing, the departments that have gone up in the rankings are sure to use them to show their programs in a good light. This alone will lend credibility to the rankings in the eyes of would-be graduate students. If a university as a whole shows an across-the-board decline in the rankings, then I believe that says something significant. Of course, there will be outliers in any such statistical exercise. But on the whole the ranking does reflect reality.

Yes, as the previous anon put it so aptly: "Statistics we like are facts. Statistics we don't like are incorrectly collected." Couldn't agree more.

46. put ur thru thous to become her beloved advisee.
--

47. Most of the big universities have so many famous but deadwood professors that I would prefer to work with someone more active at a smaller university. Also, at least in CS, research positions depend only on your track record and not on the school name. If some of the universities and big-shot professors want to stay in denial, they can. But it is not going to change the truth.

48. Economists have an entire journal devoted to this class of questions ... the class that includes thorny-and-dovetailed questions like "Who are the best economists?" and "What is the best kind of economics?"

The journal was originally called Post-Autistic Economics Review, but nowadays it is called Real-World Economics Review.

Over on Shtetl Optimized I posted a brief essay on the many striking parallels between the issues that the post-autistic economics movement sought to address, and the issues with which the TCS community is presently grappling.

In essence, the essay argues that economists are roughly one decade in advance of TCS, in thinking through these thorny issues and acting constructively to remediate them.

49. Is Caltech really that strong in CS? Last I looked they have about 8 faculty members listed. I know they have some areas of strength such as quantum computing and comp vision but they have little representation in entire subfields of CS.

I imagine that it is competitive to get into that program, but I think its US News ranking is grossly inflated by the general prestige of the university; it probably should be ranked around 20 instead of near or in the top 10 as it usually is.

50. Based on my own experience at Caltech (absurdly at #64, despite its historical status and current strength in quantum computing, which began with Caltech's Feynman) and the University of Massachusetts at Amherst (#24), this is insane. Granted, I was officially in the Physics, Math, Astronomy, and English Literature degree programs at Caltech, not CS, and in the "Computer & Information Science" department at UMass before it changed its name to Computer Science, but I still see this as insane, based on those two data points alone.

51. Is Caltech really that strong in CS? Last I looked they have about 8 faculty members listed.

"Last you looked"? I looked now and they have 16: http://www.cs.caltech.edu/people.html.

52. Caltech is very small (16 faculty members) compared to the powerhouses like Stanford (45), MIT (75), Berkeley (61), or CMU (189). What is bothersome is that it's not clear whether the rankings are supposed to represent the department's "total academic output" or its "marginal academic output". It is clear that the total amount of CS knowledge (weighted by its importance) that Caltech contributes each year is less than that contributed by CMU.

But if I am a student deciding between Caltech and CMU for grad school, and they each have a single faculty member doing what I want to do, is CMU that much better? Maybe a little since there are some general advantages to a large department, but it certainly won't be 189/16 times better; those 189 faculty can't all be my advisor. It's the "quality per faculty" that matters for a Ph.D. student.

Similarly, if you look at total research dollars spent as a very rough proxy for total "research output" (this is for the whole school, not just CS, since I don't know where to find that info), Caltech spent $270M in 2006 (ranking them 61st among all schools), compared to Stanford with about $700M (ranked 8th; surprisingly, CMU is only at $212M). This gives each of Caltech and Stanford about $16M per CS faculty member. So if I am the NSF and I want to decide who is most likely to produce the highest marginal output for the next $2M of grant money, then Caltech and Stanford look roughly equal, even though Stanford comes out far ahead of Caltech in all the "cumulative" ways of measuring research productivity. Obviously this ignores distinctions between departments.
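[Editor's note: the per-faculty arithmetic in this comment can be reproduced directly. The figures below are the comment's own quoted numbers (whole-school spending, CS faculty counts) and are not independently verified.]

```python
# Research dollars per CS faculty member, using the comment's figures:
# whole-school research spending in $M (2006) and CS faculty counts.
spending_millions = {"Caltech": 270, "Stanford": 700}
cs_faculty = {"Caltech": 16, "Stanford": 45}

for school, spend in spending_millions.items():
    per_faculty = spend / cs_faculty[school]
    print(f"{school}: ${per_faculty:.2f}M per CS faculty member")
```

Both come out near $16M per faculty member, matching the comment's "roughly equal" conclusion.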

The larger point is that universities seem to be rewarded for growing ever and ever larger without measuring the marginal payoff of growth and comparing it to the marginal cost of growth. It is similar to the fact that an advisor who takes on ten Ph.D. students, three of whom go on to have successful careers, is seen as a better advisor than one who takes on two students, both of whom have successful careers. This is a ludicrous situation, but it is the reality we have created for ourselves with our current standards for promotion, etc.

53. the harder you push, the more beloved advisee you become.

54. A Request for Lance (2:07 AM, October 04, 2010)

lance,

for christ sake stop advertising for wolfram ... wth is all this about. just coz u receive spam adverts from that company, doesn't mean you have to pollute the weblog/twitter with ... one additional reason not to tune into here anymore.

try to go free software or open source next time.

55. "Just so I understand, your line of reasoning is that UW purposely inflated their own faculty count, then turned around and protested that the NRC inflated their faculty count? Do you also find it a bit too convenient that no UW CSE faculty member was in the twin towers on 9/11? Or that none of them have ever been in a photograph together with Elvis?"

Did you read UW's statement? It says, "Due to difficulty in interpreting NRC's instructions, NRC was provided with an incorrect faculty list for our program." That's a perfect use of the passive voice, to try to dodge responsibility. It is clear that UW is at fault here.

56. Crowdsourced alternative ranking. Vote here: http://www.allourideas.org/compscirankings

57. "Never mind that UCSB has never produced a worthwhile PhD in their entire history."

I see that Purdue's CS department head got his PhD from UCSB, and I am sure there are more examples.