Thursday, September 23, 2021

Why Conferences?

An undergrad thesis from North Carolina State University tries to tackle the question as to why computer science has used conferences as its main and most prestigious publication venues. The author Elijah Bouma-Sims gives a synopsis with some interesting follow up conversation in this Twitter thread.

The upshot is that the conference culture grew organically early in computing and just took hold as the field grew. My personal non-scientific theory is that technology not available to earlier fields, namely jet airplanes, allowed CS to have national and international meetings that researchers could regularly attend. Before that conferences in more established fields like math were held either locally (AMS sectional meetings) or less often (ICM held every four years), traditions that continue to this day.

Covid has temporarily suspended fully on-site conferences, and new technologies allow us to have virtual meetings. It's still not clear what will be the new normal for conferences. I hope we get to the model where we have more virtual meetings and rarer in-person meetings that people make more of an effort to attend. Conferences focused on networking instead of publications.

The culture of conference publications has been slowly changing. Many subfields in CS, though not no much theory, have moved to a hybrid model where papers are submitted to a journal and those accepted are invited to be presented at a conference.

Conferences used to be the first place you would hear about new results but that's no longer the case. Papers posted on arXiv get noticed and Google Scholar doesn't distinguished citations to an arXiv paper differently from any other publication venue. 

Now you don't even need a presentation or a paper, just a promise of one. How many of you are excited about linear-size locally testable codes based on a talk announcement alone?

Sunday, September 19, 2021

The New Jeopardy Champion is a `A CS grad student from a school in New Haven'

 As of Sept 17, Matt Amodio has won 23 straight games in a row on Jeopardy and won over $800,000 in regular play. The following website is not quite up to date, but its close: here. Of course, that website will change. 

1) They refer to him as A CS grad student from a school in New Haven. My first thought was probably Yale, but whatever it is, they should just say it. I looked it up and it is Yale. So why aren't they just saying A CS grad student from Yale?  If someone works for an airline company they do not tell you which airline- prob to avoid giving that airline free publicity. But I would think a school is different. And I remember (perhaps incorrectly) that they DO say what school someone teaches at or is a student at. 

(ADDED LATER: a colleague of mine who was on Jeop (he lost his only game) tells me that YES, you re NOT ALLOWED to say the company you work for. He was from Riverdale Park, MD which might make some people think there is a Univ of MD at Riverdale Park . He also told me that when he was on the show the following happened:  On One of the shows of the game before I  played, Alex was curious which LA area restaurant somebody worked at (to see if he had eaten there--- he hadn't), and sure enough, they edited the name of the restaurant out.)

2) Longest streak: Ken Jennings: 74. Also most money in reg play: roughly 2.5 Mill

    2nd longest: James Holzhauer: 32. Also  2nd most money in reg play: roughly 2.4 Mill

    3rd longest: Matt Amodio: 23. Also  3rd most money in reg play: roughly 0.8 Mill

3) I do not think Matt will move into second place on any of these categories. He bets big on the daily doubles and it has paid off but either (a) he will miss and it will lead to a loss, or (b) he will just not get the daily double and be against a very good opponent. Item (b) happened to James H- and the person who beat him did have a good enough win streak to be in the Tournament of Champions. I wonder if they try to stop a long streak by picking really good opponents. I also wonder if they can even tell who will be a really good opponent. 

4) Matt has played in front of (or will- counting tomorrow) six hosts: Robin Roberts, LeVar Burton, David Faber, Joe Buck, Mike Richards, and Mayim Bialik. Six is a record which I suspect won't be broken, except possibly by Matt himself if he also plays in front of Ken Jennings (the rest of 2021 will be Mayim B and Ken J as hosts, see here.

5) Matt works in AI. When he gets his PhD and is on the job market will his Jeopardy success help him, hurt him, or neither?  

6) James H and Matt A are both very good at calculating how much to bet. I think Ken J is not quite as good but still good. Generally the players on Jeop are not that good at that aspect. I had the chance to ask some a champions (not any of those three) why that was and she said that most people get into because of the trivia-aspect, not the betting aspect. I wonder if just as players now study lots of facts to prep, they will also learn how to bet better. 

7) Ken J as host is a bit odd in that, if he says (as Alex T did sometimes) That category looks hard I won't believe him. I also have this suspicion that when a contestant gets something wrong Ken might be thinking what a moron; however, (a) by all accounts Ken is a nice guy, and (b)  I might be projecting. 

Sunday, September 12, 2021

Review of A Blog Book based on the Less Wrong Blog

There is a blog called lesswrong. Many people contribute to it (how many people must contribute to a website before it stops being called a blog and starts being called... Not sure what?). The theme is rationality. They (not sure who they are) have made a best-of collection from lesswrong which is named

A Map that Reflects the Territory (available here)

Actually, its not one book, but five mini-books. I quote the titles and paraphrase the first sentence of each:

Epistemology:  How we come to know the world.

Agency: The ability to take action in the world and control the future.

Coordination:  The ability of multiple agents to work together.

Curiosity: The desire to understand how the world works.

Alignment: The problem of aligning the thoughts and goals.

I have written a review of the the book. The book was my first exposure to the blog, except sometimes reading about the blog, probably on Scott's blog.

I am posting this to both complexity blog and to lesswrong, though with lesswrong I will have a different intro since they know lesswrong but might not know complextyblog. 

My review is here.

I would appreciate intelligent comments and suggestions, which I will use to improve the review.

Thursday, September 09, 2021

The Death of Expertise

Four years ago I tried to catch up with deep learning and this summer I aimed to try to catch up again. Who would've thought 2017 is ancient history.

I watched the lectures in the latest MIT course, played with GPT-3 and Codex, read the new Stanford manifesto on what they call foundation models, models trained that can preform on a wide range of tasks instead of a single goal. We've seen machine learning solve protein folding, detect cancer from x-rays better than radiologists, not to mention effectively solving many of the traditional AI problems (language translation, voice and face recognition, game playing, etc.) Watching Codex generate computer code from plain English is a game changer. Far from perfect but this technology is in its infancy.

From what I can tell, the main technological advances focus on learning when we don't have labelled data such as new techniques to transfer knowledge to new domains and using generative models to provide more data to train ML algorithms.

The trend to worry us all is that deep learning algorithms in the long run seem to do better if we limit or eliminate previous human knowledge from the equation. The game playing algorithms now train from just the rules alone (or even just the outcomes). We do use some knowledge: words come in some linear order, faces have hierarchical features, but not much more than that. Human expertise can help as we start solving a problem, even just to know that good solutions look like, but then it often gets in the way.

When we longer use a skill we tend to lose it, like my ability to navigate from a good map. If we eliminate expertise we may find it very difficult to get it back.

There's a more personal issue--people spend their entire careers creating their expertise in some area, and that expertise is often a source of pride and a source of income. If someone comes along and tells you that expertise is no longer needed, or even worse irrelevant or that it gets in the way, you might feel protective, and under guise of our expertise tear down the learning algorithm. That attitude will just make us more irrelevant--better to use our expertise to guide the machine learning models to overcome their limitations. 

You can't stop progress, but you can shape it.

Sunday, September 05, 2021

Guest Post on Solving (or trying to) Poly Diophantine Equations by Bogdan Grechuk

(Guest post by Bogdan Grechuk)


Motivated by Mathoverflow question here I have recently became interested in solving Polynomial Diophantine equations, that is, equations of the form

P(x1,...,xn)=0

for some polynomial P with integer coefficients. Because there are many such equations, I have decided to ask a computer to help me. Our conversation is presented here.


Highlights of the conversation: the height of a polynomial over Z in many variables is what you get when you make all of the coefficients positive and plug in x=2. For example, the height of

x3 - y4 + 5

is
23 + 24 + 5 = 8 + 16 + 5 = 29.

Note that, for all h, there are only a finite number of polynomials in many vars over Z with height h. With the help of my friend the computer we have looked at the equations with h=0,1,2,... and so on, and tried to  determine which ones have any integer solutions. As expected, the first equations were trivial, but at about h=22 we have started to meet quite interesting equations for which we needed help from  Mathoverflow to solve. The project is currently at h=29, with only one remaining open equation of this height.  Read the conversation with the computer, or my mathoverflow question

here
or my paper:

here
to find out more. I INVITE you to join me in this quest!

Thursday, September 02, 2021

The hierarchy and GapP

There is a great but little-known theorem from the early 90's by Seinosuke Toda and Mitsunori Ogihara (buried as Lemma 2.3 in their paper) that shows the polynomial-time hierarchy is randomly low for Gap-P.

Let M be a non-deterministic polynomial-time machine. #M(x) is the number of accepting paths of M on x. #P is the set of functions f such that f(x) = #M(x) for an NP machine M. 

Gap-M(x) is the difference between the number of accepting and rejecting paths. Unlike #M(x), Gap-M(x) could be negative. Gap-P is the set of functions such that f(x)=Gap-M(x) for an NP machine M. Equivalently Gap-P is the difference of two #P functions. Gap-P inherits all the nice closure properties of #P while being closed under negation and subtraction.

Theorem (Toda-Ogihara): Let q be a polynomial and f be a function in Gap-PPH. There is a function g in Gap-P and a polynomial p such that for all x, if you choose a binary string r randomly of length p(|x|), f(x) = g(x,r) with probability at least 1-2-q(|x|).

In other words, with randomness the PH as an oracle of Gap-P disappears.

Let me show you how the proof works for a specific f in GapPPH, namely the indicator function for SAT: f(φ) = 1 if φ is satisfiable and 0 otherwise.

Recall Valiant-Vazirani showed how to randomly reduce a formula φ to a set of formula ψ1,…,ψk such that

  1. If φ is not satisfiable then for all i, ψi will not be satisfiable
  2. If φ is satisfiable then with high probability for some i, ψi will have exactly one satisfying assignment.
Picking k > 4|x|q(|x|) will reduce the error in (2) to under 2-q(|x|). Let g(φ,r) use r to choose ψ1,…,ψk and output the Gap-P function 1-∏(1-#ψi) using the closure properties of Gap-P where #ψi is the number of satisfying assignments to ψi.
  • If φ is not satisfiable then for all i, #ψi=0 and g(φ,r) = 0.
  • If φ is satisfiable then with high probability for some i, #ψi=1 and thus g(φ,r) = 1.
Note that if r is a bad random sequence and φ is satisfiable, g(φ,r) might be an exponentially large integer, positive or negative. It's right most of the time but when it's wrong it's terribly wrong.