In the Spring of 2018 the US News and World Report should release their latest rankings of US graduate science programs including computer science. These are the most cited of the deluge of computer science rankings we see out there. The US News rankings have a long history and since they are reputation based they roughly correspond to how we see CS departments though some argue that reputation changes slowly with the quality of a department.

US News and World Report also has a new global ranking of CS departments. The US doesn't fare that well on the list and the ranking of the US programs on the global list are wildly inconsistent with the US list. What's going on?

75% of the global ranking is based on statistics from Web of Science. Web of Science captures mainly journal articles where conferences in computer science typically have a higher reputation and more selectivity. In many European and Asian universities hiring and promotion often depend heavily on publications and citations in Web of Science encouraging their professor to publish in journals thus leading to higher ranked international departments.

The CRA rightly put out a statement urging the CS community to ignore the global rankings, though I wished they made a distinction between the two different US News rankings.

I've never been a fan of using metrics to rank CS departments but there is a relatively new site, Emery Berger's Computer Science Rankings, based on the number of publications in major venues. CS Rankings passes the smell test for both their US and global lists and is relatively consistent with the US News reputation-based CS graduate rankings.

Nevertheless I hope CS Rankings will not become the main ranking system for CS departments. Departments who wish to raise their ranking would hire faculty based mainly on their ability to publish large number of papers in major conferences. Professors and students would then focus on quantity of papers and this would in the long run discourage risk-taking long-range research, as well as innovations in improving diversity or educating graduate students.

As Goodhart's Law states, "when a measure becomes a target, it ceases to be a good measure". Paradoxically CS Rankings can lead to good rankings of CS departments as long as we don't treat it as such.

## Thursday, November 16, 2017

## Monday, November 13, 2017

### Can you measure which pangrams are natural

A Pangram is a sentence that contains every letter of the alphabet

The classic is:

The quick brown fox jumps over the lazy dog.

(NOTE- I had `jumped' but a reader pointed out that there was no s, and that `jumps' is the correct word)

which is only 31 letters.

I could give a pointer to lists of such, but you can do that yourself.

My concern is:

a) are there any pangrams that have actually been uttered NOT in the context of `here is a pangram'

b) are there any that really could.

That is- which pangrams are natural? I know this is an ill defined question.

Here are some candidates for natural pangrams

1) Pack my box with five dozen liquor jugs

2) Amazingly few discotheques provide jukeboxes

3) Watch Jeopardy! Alex Trebek's fun TV quiz game

4) Cwm fjord bank glyphs vext quiz

(Okay, maybe that one is not natural as it uses archaic words. It means

``Carved symbols in a mountain hollow on the bank of an inlet irritated an

eccentric person' Could come up in real life. NOT. It uses every letter

exactly once.)

How can you measure how natural they are?

For the Jeopardy one I've shown it to people and asked them

``What is unusual about this new slogan for the show Jeopardy?''

and nobody gets it. more important- they believe it is the new slogan.

So I leave to the reader:

I) Are the other NATURAL pangrams?

II) How would you test naturalness of such?

Pinning down `natural' is hard. I did a guest post in 2004 before I was an official co-blogger, about when a problem (a set for us) is natural, for example the set all regular expressions with squaring (see here).

The classic is:

The quick brown fox jumps over the lazy dog.

(NOTE- I had `jumped' but a reader pointed out that there was no s, and that `jumps' is the correct word)

which is only 31 letters.

I could give a pointer to lists of such, but you can do that yourself.

My concern is:

a) are there any pangrams that have actually been uttered NOT in the context of `here is a pangram'

b) are there any that really could.

That is- which pangrams are natural? I know this is an ill defined question.

Here are some candidates for natural pangrams

1) Pack my box with five dozen liquor jugs

2) Amazingly few discotheques provide jukeboxes

3) Watch Jeopardy! Alex Trebek's fun TV quiz game

4) Cwm fjord bank glyphs vext quiz

(Okay, maybe that one is not natural as it uses archaic words. It means

``Carved symbols in a mountain hollow on the bank of an inlet irritated an

eccentric person' Could come up in real life. NOT. It uses every letter

exactly once.)

How can you measure how natural they are?

For the Jeopardy one I've shown it to people and asked them

``What is unusual about this new slogan for the show Jeopardy?''

and nobody gets it. more important- they believe it is the new slogan.

So I leave to the reader:

I) Are the other NATURAL pangrams?

II) How would you test naturalness of such?

Pinning down `natural' is hard. I did a guest post in 2004 before I was an official co-blogger, about when a problem (a set for us) is natural, for example the set all regular expressions with squaring (see here).

## Thursday, November 09, 2017

### Advice for the Advisor

A soon-to-be professor asked me recently if I could share some ideas on on how to advise students. I started to write some notes only to realize that I had already posted on the topic in 2006.

Have students work on problems that interest them not just you. I like to hand them a proceedings of a recent conference and have them skim abstracts to find papers they enjoy. However if they stray too far from your research interests, you will have a hard time pushing them in the right directions. And don't work on their problems unless they want you to.Computer science evolves dramatically but the basic principles of advising don't. This advise pretty much works now as well as it did in 2006, in the 80's when I was in the student or even the 18th century. Good advising never goes out of style.

Keep your students motivated. Meet with them on a regular basis. Encourage students to discuss their problems and other research questions with other students and faculty. Do your best to keep their spirits high if they have trouble proving theorems or are not getting their papers into conferences. Once they lose interest in theory they won't succeed.

Feel free to have them read papers, do some refereeing and reviewing, give talks on recent great papers. These are good skills for them to learn. But don't abuse them too much.

Make sure they learn that selling their research is as important as proving the theorems. Have them write the papers and make them rewrite until the paper properly motivates the work. Make them give practice talks before conferences and do not hold back on the criticism.

Some students will want to talk about some personal issues they have. Listen as a friend and give some suggestions without being condescending. But if they have a serious emotional crisis, you are not trained for that; point them to your university counseling services.

Once it becomes clear a student won't succeed working with you, or won't succeed as a theorist or won't succeed in graduate work, cut them loose. The hardest thing to do as an advisor is to tell a student, particular one that tries hard, that they should go do something else. It's much easier to just keep them on until they get frustrated and quit, but you do no one any favors that way.

Of course I don't and can't hand out a physical proceedings to a student to skim. Instead I point to on-line proceedings but browsing just doesn't have the same feel.

Looking back I would add some additional advice. Push your students and encourage them to take risks with their research. If they aren't failing to solve their problems, they need to try harder problems. We too often define success by having your paper accepted into a conference. Better to have an impact on what others do.

Finally remember that advising doesn't stop at the defense. It is very much a parent-child relationship that continues long after graduation. Your legacy as a researcher will eventually come to an end. Your legacy as an advisor will live on through those you advise and their students and so on to eternity.

## Monday, November 06, 2017

### The two fears about technology- one correct, one incorrect

When the luddites smashed loom machines their supporters (including Lord Byron, Ada Lovelaces father) made two arguments in favor of the luddites (I am sure I am simplifying what they said):

- These machines are tossing people out of work NOW and this is BAD for THOSE people. In this assertion they were clearly correct. (`lets just retrain them' only goes so far).
- This is bad for mankind! Machines displacing people will lead to the collapse of civilization! Mankind will be far worse off because of technology. In this assertion I think they were incorrect. That is, I think civilization is better off now because of technology. (If you disagree leave an intelligent polite comment. Realize that just be leaving a comment you are using technology. That is NOT a counterargument. I don't think its even IRONY. Not sure what it is.)
- (This third one is mine and its more of a question) If you take the human element out of things then bad things will happen. There was a TV show where a drone was going to be dropped on something but a HUMAN noticed there were red flowers on the car and deduced it was a wedding so it wasn't dropped. Yeah! But I can equally well see the opposite: a computer program notices things that indicate its not the target that a person would have missed. But of course that would make as interesting a story. More to the point- if we allow on computers to make decisions without the human elemnet, is that good or bad? For grad admissions does it get rid of bias or does it reinforce bias? (See the book Weapons of Math Destruction for an intelligent argument against using programs for, say, grad admissions and other far more important things.)

When Deep Blue beat Kasparov in chess there were some articles about how this could be the end of mankind. That's just stupid. For a more modern article on some of the dangers of AI (some reasonable some not) see this article on watson.

It seems to me that AI can do some WELL DEFINED (e.g., chess) very well, and even some not-quite-so-well-defined things (Nat Lang translation) very well, but the notion that they will evolve to be `really intelligent' (not sure that is well defined) and think they are better than us and destroy us seems like bad science fiction (or good science fiction).

Watson can answer questions very very well, Medical diagnosis machines may well be much better than doctors. While this may be bad news for Ken Jennings and for doctors, I don't see it being bad for humanity in the long term. Will we one day look at the fears of AI and see that they were silly--- the machines did not, terminator-style, turn against us? I think so. And of course I hope so.

## Thursday, November 02, 2017

### Matching and Complexity

Given a group of people, can you pair them up so that each pair are Facebook friends with each other? This is the famous perfect matching problem. The complexity of matching has a rich history which got a little richer in the past few months.

For bipartite graphs (consider only friendships between men and women), we have had fast matching algorithms since the 1950's via augmenting paths. In the 1965 classic paper, Path, Trees and Flowers, Jack Edmonds gives a polynomial-time algorithm for matching on general graphs. This paper also laid out an argument for polynomial-time as efficient computation that would lead to the complexity class P (of P v NP fame).

After Razborov showed that the clique problem didn't have polynomial-size monotone circuits, his proof techniques also showed that matching didn't have polynomial-size monotone circuits and Raz and Wigderson show that matching requires exponential size and linear depth. Because of Edmond's algorithm matching does have polynomial-size circuits in general. NOTs are very powerful.

Can one solve matching in parallel, say the class NC (Nick's class after Pippenger) of problems computable by a polynomial number of processors in polylogarithmic time? Karp, Upfal and Wigderson give a randomized NC algorithm for matching. Mulmuley, Vazirani and Vazirani prove an isolation lemma that allows a randomized reduction of matching to the determinant. Howard Karloff exhibited a Las Vegas parallel algorithm, i.e., never makes a mistake and runs in expected polylogarithmic time.

Can one remove the randomness? An NC algorithm for matching remains elusive but this year brought two nice results in that direction. Ola Svensson and Jakub Tarnawski give a quasi-NC algorithm for general graph matching. Quasi-NC means a quasipolynomial (2

Matching is up there with primality, factoring, connectivity, graph isomorphism, satsfiability and the permanent as fixed algorithm problems that have played such a large role in helping us understand complexity. Thanks matching problem and may you find NC nirvana in the near future.

For bipartite graphs (consider only friendships between men and women), we have had fast matching algorithms since the 1950's via augmenting paths. In the 1965 classic paper, Path, Trees and Flowers, Jack Edmonds gives a polynomial-time algorithm for matching on general graphs. This paper also laid out an argument for polynomial-time as efficient computation that would lead to the complexity class P (of P v NP fame).

After Razborov showed that the clique problem didn't have polynomial-size monotone circuits, his proof techniques also showed that matching didn't have polynomial-size monotone circuits and Raz and Wigderson show that matching requires exponential size and linear depth. Because of Edmond's algorithm matching does have polynomial-size circuits in general. NOTs are very powerful.

Can one solve matching in parallel, say the class NC (Nick's class after Pippenger) of problems computable by a polynomial number of processors in polylogarithmic time? Karp, Upfal and Wigderson give a randomized NC algorithm for matching. Mulmuley, Vazirani and Vazirani prove an isolation lemma that allows a randomized reduction of matching to the determinant. Howard Karloff exhibited a Las Vegas parallel algorithm, i.e., never makes a mistake and runs in expected polylogarithmic time.

Can one remove the randomness? An NC algorithm for matching remains elusive but this year brought two nice results in that direction. Ola Svensson and Jakub Tarnawski give a quasi-NC algorithm for general graph matching. Quasi-NC means a quasipolynomial (2

^{polylog}) number of processors. Nima Anari and Vijay Vazirani give an NC algorithm for matching on planar graphs.Matching is up there with primality, factoring, connectivity, graph isomorphism, satsfiability and the permanent as fixed algorithm problems that have played such a large role in helping us understand complexity. Thanks matching problem and may you find NC nirvana in the near future.

Subscribe to:
Posts (Atom)