Thursday, September 30, 2021

Being the Chair

If you have Netflix and are interested in the academic world, I recommend The Chair, a six-episode dramatic series starring Sandra Oh as a new English department chair at a "lower-tier Ivy League university". The series takes many artistic liberties and compresses much into a short time period, but it gets a lot about academia right: the tension between faculty and the administration with the chair caught in the middle, the need to create majors that attract students, faculty past their prime teaching the same courses the same way for decades, faculty who get themselves in a hole and keep digging, alumni donors controlling academic decisions, pressure to build a diverse faculty, faculty feeling underappreciated and getting outside offers, and a wonderful exposition of how the field has changed over the past thirty years, delivered to someone who had dropped out before finishing their PhD to take on a different career.

When I served as department chair at Georgia Tech, I dealt with most if not all of the issues above, though not all at the same time. I also had some challenges that today's English departments don't face, such as handling enrollments that more than doubled while I was barely able to hire more faculty than were departing. Not that I would trade that in a second for the existential crisis English departments are going through.

When I left Georgia Tech after seven years, I had outlasted every other current chair in the Colleges of Computing, Science and Engineering. I'm not sure what this says about me or about Georgia Tech.

Being chair is the most challenging job in academia. The faculty technically report to you, but you aren't their boss in any traditional sense--they came to academia for the freedom to work on what they want, and they won't give it up. It's virtually impossible to fire anyone with tenure. The joke goes that a chair needs two umbrellas: one to block the stuff coming from the administration toward the faculty, and the other to block the stuff from the faculty heading toward the administration. Since I left, things have gotten much uglier in the University System of Georgia, which has no mask or vaccine mandates, and I'm glad I'm not the chair who has to deal with that.

This all sounds like I'm discouraging anyone from becoming a department chair, and it certainly isn't a job for everyone, but it can be a very rewarding one. You can help shape the future of the department through the faculty you hire and the vision you set, and you can create an environment that helps your faculty and students succeed.

Sunday, September 26, 2021

My academic lineage and more interesting facts that come out of it

I got my PhD from Harvard in 1985 with advisor Harry Lewis.

Harry Lewis got his PhD from Harvard in 1974 with advisor Burton Dreben (Dreben was in the Philosophy department and did logic). Burton Dreben never got a PhD (more on that later). So I thought my lineage stopped there. A while back I was in an email conversation with Harry and for some odd reason Galileo came up.

He then emailed me the following:

----------------

Did you know you were descended from Galileo, via Newton? See below. The data is from the Math Genealogy project (see here). As you know, Dreben had no PhD, but it would certainly be fair to call Quine his advisor anyway. And, in fact, the Math Genealogy project lists Quine as Dreben's advisor. By starting with Dreben and clicking backwards I found the following:

In the list below everyone was advised (in some form) by the person below them.

William Gasarch, Harvard 1985

Harry Lewis, Harvard 1974

Burton Dreben, Harvard 1955

WVO Quine, Harvard 1932

AN Whitehead, Cambridge 1884

Edward John Routh, Cambridge 1857

William Hopkins, Cambridge 1830

Adam Sedgwick, Cambridge 1811

Thomas Jones, Cambridge 1782

Thomas Postlethwaite, Cambridge 1756

Stephen Whisson, Cambridge 1742

Walter Taylor, Cambridge 1723

Robert Smith, Cambridge 1715

Roger Cotes, Cambridge 1706

Isaac Newton, Cambridge 1668

Isaac Barrow, Cambridge 1652

Vincenzo Viviani, Pisa 1642

Galileo Galilei, Pisa 1585

--------------------------------------

A few observations

1) Dreben was a philosophy professor at Harvard without a PhD. How? He was a Junior Fellow, a position for brilliant people, some of whom were made professors without the burden of going through the PhD-getting ritual. Andrew Gleason was a professor of Math at Harvard without a PhD--also a Junior Fellow (he solved Hilbert's 5th problem, which surely helped). Tom Cheatham was a CS professor at Harvard who did not have a PhD but was not a Junior Fellow. I do not know how he managed that. Things are more formal now, and more people have PhDs, so I suspect it is much rarer to be a professor without a PhD. Harvard still has the Junior Fellows Program, but even they have PhDs now. If someone solved P vs NP as an undergrad, I suspect they would be hired as a professor even though they do not have a PhD. That's one way for a theorist to get out of taking graduate systems courses.

2) Note that Galileo and Viviani were at Pisa, but then comes a long line of people from Cambridge. In those days schools hired their own. Is this good or bad? They know what they are getting, but you could have an old-boys network blocking fresh new talent, and you may get stuck in your ways. Nowadays, at least in America, it is uncommon to stay at the school where you got your PhD.

3) The shift from Pisa to Cambridge might be part of a more general phenomenon: the intellectual center for science shifting from Italy to England. What caused this? Amir Alexander, in his book Infinitesimal: How a Dangerous Mathematical Theory Shaped the Modern World (see my review here), speculates that the Catholic Church's rejection of infinitesimals was the cause. I suspect that letting non-scientists interfere with science was the cause (a lesson for us all).

4) Lance blogged about his lineage here. He has Gauss and Euler as ancestors.

5) To honor the myths about my two most famous academic ancestors, Galileo and Newton, I am going to travel to Italy and have Darling drop two apples of different weights off the Leaning Tower of Pisa and see if they hit my head at the same time.


Thursday, September 23, 2021

Why Conferences?

An undergrad thesis from North Carolina State University tackles the question of why computer science uses conferences as its main and most prestigious publication venues. The author, Elijah Bouma-Sims, gives a synopsis, with some interesting follow-up conversation, in this Twitter thread.

The upshot is that the conference culture grew organically early in computing and took hold as the field grew. My personal, non-scientific theory is that a technology not available to earlier fields, namely jet airplanes, allowed CS to have national and international meetings that researchers could regularly attend. Before that, conferences in more established fields like math were held either locally (AMS sectional meetings) or less often (the ICM, held every four years), traditions that continue to this day.

Covid has temporarily suspended fully on-site conferences, and new technologies allow us to have virtual meetings. It's still not clear what the new normal for conferences will be. I hope we get to a model with more virtual meetings and rarer in-person meetings that people make more of an effort to attend: conferences focused on networking instead of publications.

The culture of conference publications has been slowly changing. Many subfields in CS, though not so much theory, have moved to a hybrid model where papers are submitted to a journal and those accepted are invited to be presented at a conference.

Conferences used to be the first place you would hear about new results, but that's no longer the case. Papers posted on arXiv get noticed, and Google Scholar doesn't distinguish citations to an arXiv paper from citations to any other publication venue.

Now you don't even need a presentation or a paper, just a promise of one. How many of you are excited about linear-size locally testable codes based on a talk announcement alone?

Sunday, September 19, 2021

The New Jeopardy Champion is "A CS grad student from a school in New Haven"

As of Sept 17, Matt Amodio has won 23 straight games on Jeopardy and over $800,000 in regular play. The following website is not quite up to date, but it's close: here. Of course, that website will change.

1) They refer to him as "A CS grad student from a school in New Haven". My first thought was probably Yale, but whatever it is, they should just say it. I looked it up and it is Yale. So why aren't they just saying "A CS grad student from Yale"? If someone works for an airline they do not tell you which airline, probably to avoid giving that airline free publicity. But I would think a school is different. And I remember (perhaps incorrectly) that they DO say what school someone teaches at or is a student at.

(ADDED LATER: A colleague of mine who was on Jeopardy (he lost his only game) tells me that YES, you are NOT ALLOWED to say the company you work for. He was from Riverdale Park, MD, which might make some people think there is a Univ of MD at Riverdale Park. He also told me that when he was on the show the following happened: on one of the shows taped before he played, Alex was curious which LA-area restaurant somebody worked at (to see if he had eaten there--he hadn't), and sure enough, they edited the name of the restaurant out.)

2) Longest streak: Ken Jennings, 74. Also most money in regular play: roughly $2.5 million.

    2nd longest: James Holzhauer, 32. Also 2nd most money in regular play: roughly $2.4 million.

    3rd longest: Matt Amodio, 23. Also 3rd most money in regular play: roughly $0.8 million.

3) I do not think Matt will move into second place in either of these categories. He bets big on the Daily Doubles and it has paid off, but either (a) he will miss one and it will lead to a loss, or (b) he will just not get the Daily Double and will be up against a very good opponent. Item (b) happened to James H, and the person who beat him did have a good enough win streak to be in the Tournament of Champions. I wonder if they try to stop a long streak by picking really good opponents. I also wonder if they can even tell who will be a really good opponent.

4) Matt has played in front of (or will, counting tomorrow) six hosts: Robin Roberts, LeVar Burton, David Faber, Joe Buck, Mike Richards, and Mayim Bialik. Six is a record which I suspect won't be broken, except possibly by Matt himself if he also plays in front of Ken Jennings (for the rest of 2021 the hosts will be Mayim B and Ken J, see here).

5) Matt works in AI. When he gets his PhD and is on the job market, will his Jeopardy success help him, hurt him, or neither?

6) James H and Matt A are both very good at calculating how much to bet. I think Ken J is not quite as good, but still good. Generally the players on Jeopardy are not that good at that aspect. I had the chance to ask a champion (not any of those three) why that was, and she said that most people get into it because of the trivia aspect, not the betting aspect. I wonder if, just as players now study lots of facts to prep, they will also learn how to bet better (see the sketch after item 7 below).

7) Ken J as host is a bit odd in that, if he says (as Alex T sometimes did) That category looks hard, I won't believe him. I also have this suspicion that when a contestant gets something wrong Ken might be thinking what a moron; however, (a) by all accounts Ken is a nice guy, and (b) I might be projecting.
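Returning to item 6: to make the betting arithmetic concrete, here is a minimal sketch (my own illustration, nothing official from the show, and the function name is mine) of the best-known wagering rule, the Final Jeopardy cover bet: the leader wagers just enough to stay ahead even if the runner-up doubles their score.

    # A toy sketch of the standard Final Jeopardy "cover bet" (my own
    # illustration): the smallest wager that guarantees the leader wins
    # if they answer correctly, even if the runner-up bets everything.
    def leader_cover_bet(leader: int, second: int) -> int:
        # Need leader + wager > 2 * second, so wager = 2*second - leader + 1.
        return max(0, 2 * second - leader + 1)

    print(leader_cover_bet(20000, 12000))  # 4001, since 20000 + 4001 > 24000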

Sunday, September 12, 2021

Review of A Blog Book based on the Less Wrong Blog

There is a blog called lesswrong. Many people contribute to it (how many people must contribute to a website before it stops being called a blog and starts being called... I'm not sure what?). The theme is rationality. They (I'm not sure who they are) have made a best-of collection from lesswrong, which is named

A Map that Reflects the Territory (available here)

Actually, it's not one book, but five mini-books. I quote the titles and paraphrase the first sentence of each:

Epistemology:  How we come to know the world.

Agency: The ability to take action in the world and control the future.

Coordination:  The ability of multiple agents to work together.

Curiosity: The desire to understand how the world works.

Alignment: The problem of aligning an AI's thoughts and goals with our own.

I have written a review of the book. The book was my first exposure to the blog, aside from occasionally reading about the blog, probably on Scott's blog.

I am posting this both to complexityblog and to lesswrong, though for lesswrong I will have a different intro, since they know lesswrong but might not know complexityblog.

My review is here.

I would appreciate intelligent comments and suggestions, which I will use to improve the review.

Thursday, September 09, 2021

The Death of Expertise

Four years ago I tried to catch up with deep learning, and this summer I aimed to catch up again. Who would've thought 2017 would be ancient history?

I watched the lectures in the latest MIT course, played with GPT-3 and Codex, and read the new Stanford manifesto on what they call foundation models: models trained so they can perform on a wide range of tasks instead of a single goal. We've seen machine learning solve protein folding and detect cancer from x-rays better than radiologists, not to mention effectively solving many of the traditional AI problems (language translation, voice and face recognition, game playing, etc.). Watching Codex generate computer code from plain English is a game changer: far from perfect, but this technology is in its infancy.
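To give a flavor of code-from-plain-English, here is a minimal sketch using the OpenAI Python library of that era; the engine name and the prompt are my own assumptions for illustration, since Codex was still in limited beta.

    import openai

    openai.api_key = "YOUR-API-KEY"  # placeholder; Codex required beta access

    # Ask Codex to complete a function from an English description.
    # "davinci-codex" was the engine name during the 2021 beta (an
    # assumption worth checking against the current API docs).
    response = openai.Completion.create(
        engine="davinci-codex",
        prompt="# A Python function that returns the n-th Fibonacci number\ndef fib(n):",
        max_tokens=64,
        temperature=0,
    )
    print(response.choices[0].text)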

From what I can tell, the main technological advances focus on learning when we don't have labeled data such as new techniques to transfer knowledge to new domains and using generative models to provide more data to train ML algorithms.

The trend that worries us all is that deep learning algorithms seem, in the long run, to do better if we limit or eliminate prior human knowledge from the equation. The game-playing algorithms now train from just the rules alone (or even just the outcomes). We do use some knowledge (words come in some linear order, faces have hierarchical features), but not much more than that. Human expertise can help as we start solving a problem, even just to know what good solutions look like, but then it often gets in the way.

When we no longer use a skill we tend to lose it, like my ability to navigate from a good map. If we eliminate expertise we may find it very difficult to get it back.

There's a more personal issue: people spend their entire careers building their expertise in some area, and that expertise is often a source of pride and a source of income. If someone comes along and tells you that your expertise is no longer needed, or worse, that it is irrelevant or gets in the way, you might feel protective and, under the guise of that expertise, tear down the learning algorithm. That attitude will just make us more irrelevant. Better to use our expertise to guide the machine learning models to overcome their limitations.

You can't stop progress, but you can shape it.

Sunday, September 05, 2021

Guest Post on Solving (or trying to) Poly Diophantine Equations by Bogdan Grechuk

(Guest post by Bogdan Grechuk)


Motivated by a Mathoverflow question (here), I recently became interested in solving polynomial Diophantine equations, that is, equations of the form

P(x_1,...,x_n) = 0

for some polynomial P with integer coefficients. Because there are many such equations, I decided to ask a computer to help me. Our conversation is presented here.


Highlights of the conversation: the height of a polynomial over Z in many variables is what you get when you make all of the coefficients positive and plug in 2 for every variable. For example, the height of

x^3 - y^4 + 5

is

2^3 + 2^4 + 5 = 8 + 16 + 5 = 29.
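As a quick sanity check on the definition, here is a minimal sketch (my own, not code from the paper; the function name is mine) that computes the height with sympy:

    # Height of a polynomial over Z: make all coefficients positive and
    # substitute 2 for every variable, i.e. sum |c| * 2^(total degree)
    # over the monomials.
    from sympy import Poly, symbols

    def height(expr, variables):
        p = Poly(expr, *variables)
        return sum(abs(c) * 2**sum(monom) for monom, c in p.terms())

    x, y = symbols('x y')
    print(height(x**3 - y**4 + 5, [x, y]))  # 2^3 + 2^4 + 5 = 29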

Note that, for all h, there are only a finite number of polynomials in many variables over Z with height h. With the help of my friend the computer, we looked at the equations with h=0,1,2, and so on, and tried to determine which ones have any integer solutions. As expected, the first equations were trivial, but at about h=22 we started to meet quite interesting equations that we needed Mathoverflow's help to solve. The project is currently at h=29, with only one remaining open equation of this height. To find out more, read the conversation with the computer, my Mathoverflow question here, or my paper here. I INVITE you to join me in this quest!

Thursday, September 02, 2021

The hierarchy and GapP

There is a great but little-known theorem from the early 90s by Seinosuke Toda and Mitsunori Ogihara (buried as Lemma 2.3 in their paper) that shows the polynomial-time hierarchy is randomly low for Gap-P.

Let M be a non-deterministic polynomial-time machine. #M(x) is the number of accepting paths of M on x. #P is the set of functions f such that f(x) = #M(x) for an NP machine M. 

Gap-M(x) is the difference between the number of accepting and rejecting paths of M on x. Unlike #M(x), Gap-M(x) can be negative. Gap-P is the set of functions f such that f(x) = Gap-M(x) for an NP machine M. Equivalently, Gap-P is the set of differences of two #P functions. Gap-P inherits all the nice closure properties of #P while also being closed under negation and subtraction.
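One closure property used below is closure under products. Here is a toy sketch (my own illustration, not from the paper; all names are mine) modeling a machine as just the list of its path outcomes:

    # Toy model: a nondeterministic machine is the list of its path outcomes,
    # +1 for an accepting path, -1 for a rejecting path; the gap is the sum.
    from itertools import product
    from math import prod

    def gap(paths):
        return sum(paths)

    # "Product machine": guess one path of each machine independently; the
    # combined path's value is the product of the component values.
    def product_machine(machines):
        return [prod(combo) for combo in product(*machines)]

    m1 = [+1, +1, -1]          # gap = 1
    m2 = [+1, -1, -1, -1]      # gap = -2
    assert gap(product_machine([m1, m2])) == gap(m1) * gap(m2)

The gap of the product machine is the product of the gaps, which is what makes an expression like 1-∏(1-#ψ_i) below a legitimate Gap-P function.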

Theorem (Toda-Ogihara): Let q be a polynomial and f be a function in Gap-P^PH. There is a function g in Gap-P and a polynomial p such that for all x, if you choose a binary string r randomly of length p(|x|), then f(x) = g(x,r) with probability at least 1 - 2^(-q(|x|)).

In other words, with randomness, the PH oracle for Gap-P disappears.

Let me show you how the proof works for a specific f in Gap-P^PH, namely the indicator function for SAT: f(φ) = 1 if φ is satisfiable and 0 otherwise.

Recall Valiant-Vazirani showed how to randomly reduce a formula φ to a set of formulas ψ_1,…,ψ_k such that

  1. If φ is not satisfiable then for all i, ψ_i will not be satisfiable.
  2. If φ is satisfiable then with high probability, for some i, ψ_i will have exactly one satisfying assignment.
Picking k > 4|x|q(|x|) reduces the error in (2) to under 2^(-q(|x|)). Let g(φ,r) use r to choose ψ_1,…,ψ_k and output the Gap-P function 1-∏_i(1-#ψ_i), using the closure properties of Gap-P, where #ψ_i is the number of satisfying assignments of ψ_i.
  • If φ is not satisfiable then for all i, #ψ_i = 0, so g(φ,r) = 0.
  • If φ is satisfiable then with high probability, for some i, #ψ_i = 1, and thus g(φ,r) = 1.
Note that if r is a bad random sequence and φ is satisfiable, g(φ,r) might be an exponentially large integer, positive or negative (for example, if every ψ_i had exactly three satisfying assignments, g(φ,r) would be 1-(-2)^k). It's right most of the time, but when it's wrong it's terribly wrong.
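For the curious, here is a brute-force toy sketch (my own illustration, not the actual Valiant-Vazirani reduction, which works on formulas rather than explicit solution sets; all names are mine) of the isolation idea: intersecting a solution set with random affine constraints over GF(2) leaves exactly one survivor with 1/poly(n) probability.

    # Toy sketch of Valiant-Vazirani-style isolation (my own illustration):
    # keep only the solutions satisfying k random GF(2) affine constraints
    # <a, s> = b, for a random number k of constraints.
    import random

    def isolate(solutions, n):
        k = random.randint(1, n)  # guessing roughly log2(#solutions)
        constraints = [([random.randrange(2) for _ in range(n)], random.randrange(2))
                       for _ in range(k)]
        return [s for s in solutions
                if all(sum(a * x for a, x in zip(av, s)) % 2 == bv
                       for av, bv in constraints)]

    # Five "satisfying assignments" among 4-bit strings:
    sols = [(0,0,1,1), (0,1,0,1), (1,0,0,1), (1,1,1,0), (1,0,1,0)]
    trials = 10000
    hits = sum(len(isolate(sols, 4)) == 1 for _ in range(trials))
    print(hits / trials)  # empirically bounded away from zero, as the analysis promises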