Sunday, August 07, 2005

The New Research Labs

I am just finishing the last of three west coast trips this summer. I went to the Complexity conference, two universities (U. Wash and Caltech), three ballparks, a Bat Mitzvah and a film shoot. I also visited the Holy Trinity of internet companies: Microsoft Research (both in Silicon Valley and Redmond), Yahoo! Research (Pasadena) and Google (Mountain View), the last of which seemed more like a summer camp than a corporation.

Microsoft, Yahoo and Google all deal with large amounts of data and need to look at a number of CS-related issues, often requiring good theoretical techniques, in areas like search, auctions on search words, recommender systems, spam filtering and much more. While the research labs of the 80's and 90's (AT&T, Bell Labs, IBM, Bellcore/Telcordia, NEC and others) have pared down their research groups, Microsoft, Yahoo and Google are currently hiring many computer scientists, from programming positions to pure theory researchers. For example, take the two IBM theorists who organized the last Complexity conference: Sivakumar just joined Google and Ravi Kumar went to Yahoo Research, which by the way is now headed by theorist Prabhakar Raghavan. You can see how important researchers are to these companies in the recent fight between Microsoft and Google over Kai-Fu Lee (whom I know best because his Othello program beat my Othello program in a 1989 tournament).

Corporate research labs go in cycles, from where they need new ideas in a developing field and build up strong research groups to the point where they have basic commodities (think long-distance phone calls) and need to cut back research groups to remain competitive. Hartmanis and Stearns developed complexity at the GE Research Labs in Schenectady and soon after both left for academic positions, and a few years after that IBM and AT&T built up their theory groups. Will Microsoft, Yahoo and Google eventually find less need for theoretical research? Probably, but for now we once again see theoretical computer scientists needed by companies setting the future of computing.


  1. Yes, the way Microsoft Research, Google (and now that you mention it, Yahoo!) are hiring researchers, they are clearly up to something big.

    While the labs of the 80's and the 90's focused on communication and operations research issues, the research du jour in the "modern" research labs seems to be search technology. I wonder how many of these researchers (especially those that join Google and Yahoo!) actually invest a significant amount of their time doing research rather than mundane coding.

    The labs of the 80's and 90's (most notably Bell Labs and IBM Research) are on their way down, with several excellent researchers on their way out. The main reason seems to be the push by these research labs to make their research more "relevant" to the company. Makes complete sense from the company's point of view, but hurts computer science in the long run. It's a sad development, just considering the great achievements that have come out of these places.

  2. One difference is that the people from the old research labs did/do keep publishing regularly in Theory conferences. Microsoft Research has that feature too, but are there any papers coming out of either Yahoo or Google? You just visited there: do you think that there is anyone there actually still currently doing research in Theory?

  3. I was one of the people who left the theory group to join Google. There seems to be some misinformation about this subject, and I was glad to see Ron comment on the situation at IBM.

    Let me first say that IBM is still a great place to work, and I greatly enjoyed my years there. I had the freedom to do basic research, as well as a good environment to do research that targets "real world" applications. I think the statement that IBM Research is in decline is not true. The company is certainly changing, but it's not the first time in the history of the company. Previous changes arose from the emergence of the PC, the Internet, and now it's the emergence of services. The fact that IBM has sustained some basic research through these changes is a testament to their commitment to research.

    It's also false to say that people felt pushed by the need to make their research more relevant. I don't think that was really a factor. If that was the case then we would have gone to universities. They seem like optimal places to isolate yourself from relevance :P

    The statement about "mundane coding" is also strange. For some people, *all* coding is mundane. Most of the coding that we do is for exploratory research involving data, or to make a demo of an algorithmic idea, or to drive it into a product. It's not like we were being asked to write code in something that was irrelevant to our research.

    Most of us end up spending some time in our careers doing things that are not writing papers. If you work as a professor, then you're likely to spend a certain amount of time teaching mundane undergraduate courses. Most of us also end up having to referee a crappy paper once in a while. These things are not much fun, but we do these things out of a sense of obligation to our employer or our community. We could all produce more papers if we neglected these activities, but it would be harmful to the field as a whole.

    I think the most interesting issue is how to define the nature of research in an industrial lab, and the relationship of theoretical computer science to the rest of computer science. The statement that "relevance" is harmful to computer science research seems kind of silly to me. Most of us strive for "impact", but we use different metrics to assess what that means. Just look at the difference between algorithm design and complexity. All that really matters is that your metrics align with those of your employer, and that you be personally satisfied with your own metrics.

    As for measuring the paper output of Google or Yahoo!, I think you should look to the future rather than the past. In the early days of Microsoft there wasn't much research activity either...