Thursday, September 11, 2008

Another Dutch Defense

Yesterday I attended my fifth Dutch thesis defense. The Dutch defense is unlike most any other country with a formal ceremony much like an American wedding. I wrote a detailed description of the Dutch defense when I attended Hein Röhrig's defense in 2004. This year I marched as a full professor but not as an actual opponent. I just really enjoy the ritual so lacking in American defenses.

The defender this time around was Steven de Rooij, a student of Paul Vitányi and Peter Grünwald. Peter specializes in learning theory and Paul, of course, studies and co-wrote the book on Kolmogorov complexity, an algorithmic measure of information and randomness. True to form Steven's thesis brought together both areas in some exciting ways. His thesis Minimum Description Length Model Selection looks at the formalization of Occam's razor that a model with the shortest description consistent with the data is typically a good one. Steven examines several models with an eye toward accuracy and practicality.

In a few weeks I start teaching a course on Kolmogorov complexity at Northwestern, a course I've taught quite successfully a couple of times at Chicago. The idea is so simple: the amount of information or randomness in a string is the size of the smallest program that generates that string. This simple idea leads to a rich theory with some very neat applications and is just one of my favorite topics.

1 comment:

  1. Lance, you might consider teaching your students the Kolomogorov explanation of why randomness appears in quantum mechanics.

    Namely, quantum mechanics allows so many possible experimental outcomes, that (by Chaitin-Kolomogorov theory) most of experimental datasets necessarily *appear* to be random.

    To take a homely example from our own MRFM experiments, our data records consist of (typically) 10^16 binary-valued clicks in a cantilever interferometer. The number of possible experimental records is so large (2^(10^16)) that perforce, most of our MRFM experimental records are algorithmically incompressible.