Wednesday, October 09, 2024

Fall Jobs Post 2024

In the fall, I write a jobs post predicting the upcoming CS faculty job market and giving suggestions and links. In the spring I used to crowdsource a list of where everyone got jobs but have since outsourced the crowdsource to Grigory Yaroslavtsev. Let's start with a few announcements.

FOCS in Chicago will have a Graduating Bits on Sunday, October 27 from 12-2 PM. If you have job openings for postdocs and researchers, the FOCS organizers are collecting them here. The CATCS also maintains a Theoretical Computer Science Jobs posting site. You are also free to add pointers to theory-related job listings as comments to this post. More generally in CS, there is the CRA Database of Job Candidates and the CRA and ACM job postings.

MohammadTaghi Hajiaghayi is organizing a virtual theory jobs panel in November including yours truly. I'll post details here and on social media when it's finalized.

If you are a bit more senior, the Simons Institute is looking for a new director.

Last year I suggested AI (which, by the way, just won two Nobel Prizes) wasn't dramatically affecting the CS faculty job market yet, but

Many see programming, rightly or wrongly, as one of the first careers that AI will displace, which may reduce enrollment in the future, as offshoring fears drove CS enrollment down 20 years ago

It didn't take long. In two years we have gone from nearly all of our CS graduates getting jobs in the field to many of them struggling to get internships and jobs at the top companies, if at all. If the past is any guide, a weak tech job market leads to fewer majors, which leads to fewer slots for CS faculty. We'll start to see these trends this year, and they will accelerate quickly if the tech job market doesn't recover.

Areas related to data, such as Artificial Intelligence, Data Science and Cybersecurity, will draw the most interest. Best if you can tie your research to those areas, or at least show that you are open to teaching in them.

Have a well-designed website with all your job materials and links to your papers. Make sure your Google Scholar, LinkedIn and GitHub (if relevant) sites are accurate and up to date. Make a short video describing your research to a general CS crowd. Take advantage of all the links above. Network at FOCS if you can make it. And start early. 


Sunday, October 06, 2024

Emil Post Anticipated (more than anticipated) Godel and Turing

(Thanks to James De Santis for pointing out the article that inspired this post on Post. The article is pointed to later in this post.)

What is Emil Post known for? I know of him for the following:

a) Post's Problem: Show that there is an r.e. set A that is strictly in between Decidable and Halt under Turing reductions. He posed the problem in 1944. It was solved in 1956 by the priority method, simultaneously invented by Friedberg and Muchnik. (My students wondered who posted to the web first and if the other one could have seen it there.)

b) He invented Post Tag Systems, a natural problem that is undecidable. Or a model of computation. Depends on how you look at it. (A small simulator sketch appears after this list.)

c) The Kleene-Post Theorem, which produces a set A that is strictly in between Decidable and Halt. It did not solve Post's Problem since A is not r.e. The proof used forcing and may have been the first, or close to the first, use of forcing. This was published in 1954.
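To make (b) concrete, here is a minimal sketch of a tag system simulator (my Python illustration, not Post's formulation), run on the deletion-number-3 system with rules 0 → 00 and 1 → 1101 that is often cited as Post's original example of tag:

```python
def run_tag_system(word, rules, deletion, max_steps=100):
    """Simulate a Post tag system: read the first symbol, append that
    symbol's production to the end of the word, then delete `deletion`
    symbols from the front; halt when the word gets too short."""
    for _ in range(max_steps):
        if len(word) < deletion:
            return word, True                   # halted
        word = word[deletion:] + rules[word[0]]
    return word, False                          # still running after max_steps

# Post's much-studied example: its long-term behavior resisted analysis,
# which hints at why tag systems can serve as a model of computation.
word, halted = run_tag_system("10010", {"0": "00", "1": "1101"}, deletion=3)
print(halted, len(word))
```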

In summary, I thought of Post as being a recursion theorist. 

The more I read about Post the more I realize that calling him a recursion theorist is wrong, but in an interesting way. Back in the 1950s I am sure Emil Post was called a logician. The over-specialization that produces Recursion Theorists, Model Theorists, Set Theorists, and Proof Theorists was about a decade away. In fact, people who worked in logic often also worked in other parts of math. (Post worked on Laplace Transforms. See the Post Inversion Formula as part of the Wikipedia entry on Laplace Transforms.)

I recently read the article Emil Post and His Anticipation of Godel and Turing, which shows that Post really did have some of the ideas of Godel and Turing at about the same time, and possibly before they did. I will discuss briefly what he did and why it is not better known; the article is worth reading for more on this.

What did Post Do? 

Part of Post's Thesis (1920) was showing that the Propositional Logic in Russell-Whitehead (RW) was consistent and complete.

He tried to show that all of RW was consistent and complete. He got RW down to a normal form; however, he realized that rather than reducing the complexity he had just localized it. In 1921 he noted the following (I quote the article):

a) Normal Systems can simulate any symbolic logic, indeed any mechanical system for proving theorems.

b) This means, however, that all such systems can be mechanically listed, and the diagonal argument shows that the general problem of deciding whether a given theorem is produced by a given system is unsolvable.

c) It follows, in turn, that no consistent mechanical system can produce all theorems.

Wow- that sounds like the outline of a proof of Godel's Incompleteness theorem!

Why Didn't Post Publish This? 

In 1921 Post suffered his first (of many) attacks of manic-depression. He was unable to get an academic job until 1935. By the time he could have written a paper, his ideas were already known. Note that the items about Post I knew are all post-1940.

The Story is Interesting Because It's Boring

There are no interesting conflicts between mathematicians in this story. No Newton vs Leibniz rap battle (see here), no plagiarism. Just bad timing and bad luck. So this story is not even worth exaggerating.

But I find that interesting. Timing and Luck play a big role, perhaps bigger than is commonly thought.






Wednesday, October 02, 2024

Favorite Theorems: Gradient Descent

September Edition

Who thought the algorithm behind machine learning would have cool complexity implications?

The Complexity of Gradient Descent: CLS = PPAD ∩ PLS by John Fearnley, Paul Goldberg, Alexandros Hollender and Rahul Savani

Let's unpack these classes, subclasses of TFNP, where for every input we know there is an easily verifiable solution and we are looking at the complexity of finding it. PPAD is the class famous for having finding a Nash Equilibrium as a complete problem; see this post for details.

PLS is the class of problems where we look for a local minimum. Finding a global minimum is NP-complete--think vertex cover. Finding a local minimum is often easier but still could be hard if you are optimizing over an exponential set of values.

CLS is a bit trickier to define: basically you are finding an approximate local minimum of a function mapping three real variables to one real value.

The authors show that gradient descent is complete for PPAD ∩ PLS even if you only use two input variables. Since gradient descent is in CLS, the equality follows. 
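To make the object of the theorem concrete, here is a minimal sketch of gradient descent stopping at an approximate stationary point (my illustration on an arbitrary example function, not the paper's construction):

```python
import numpy as np

def gradient_descent(grad, p, step=0.1, eps=1e-8, max_iters=100_000):
    """Follow the negative gradient until it is tiny, i.e., until p is an
    eps-approximate local minimum / stationary point."""
    for _ in range(max_iters):
        g = grad(p)
        if np.linalg.norm(g) <= eps:   # an easily verifiable solution
            return p
        p = p - step * g
    return p

# Example: f(x, y) = (x^2 + y^2)/2 has gradient (x, y).
print(gradient_descent(lambda p: p, np.array([1.0, -2.0])))  # converges near (0, 0)
```

The hardness result concerns exactly this kind of search when the function is presented succinctly as a circuit and the accuracy is part of the input: a proposed solution is easy to verify, but finding one is as hard as anything in PPAD ∩ PLS.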

More in my 2021 post. On that post, Paul Goldberg (one of the authors) commented

The paper is a fine example of the humorous stereotype of complexity theorists proving a problem "hard" when meanwhile the problem is being routinely solved in practice in the real world.

Nevertheless it's a neat complexity result and now officially one of my favorite theorems.

Sunday, September 29, 2024

Progress on R(5). Will there be more?

 (I had a post a while back requesting people to submit open problems in Luca Trevisan's honor with deadline Oct 1. I am extending that to Oct 14, but that is a HARD deadline. See my original post which I have updated, here.)

And now back to our regularly scheduled program.

 ====================================

Breaking News: \(R(5) \le 46 \)

I know this since 46 people have emailed me this link here.


(ADDED) Recall that \(K_n\), the complete graph on \(n\) vertices, has an edge between every pair of vertices.

Recall that \(R(5)\) is the least \(n\) such that

For all 2-colorings of the edges of \(K_n\) there is a set of 5 vertices such that all of the edges between them are the same color.

Here is a history of what is known about R(5). It is not complete. 

Proofs of points 1 and 2 below are on my slides here.

1) Let \(R(a,b)\) be the least \(n\) such that

For all 2-colorings of the edges of \(K_n\) there is either a RED \(K_a\) or a BLUE \(K_b\).

Note that

\(R(2,b)=b\)

\(R(a,2)=a\)

It is well known that \(R(3,3)=6\).

It is also well known that \( R(a,b) \le R(a,b-1) + R(a-1,b) \).

From this one can derive \( R(5,5) = R(5) \le 70 \).

2) One can also show that if \(R(a,b-1)\) and \(R(a-1,b)\) are even then

\( R(a,b) \le R(a-1,b) + R(a,b-1) -1 \).

From this one can derive \(R(5,5)=R(5)\le 62\). (A small computational sketch of these derivations appears after the history below.)

3) In 1989 Exoo showed \(R(5) \ge 43 \). That is, Exoo gave a 2-coloring of \(K_{42}\) with no monochromatic \(K_5\). I was unable to find his paper online; however, there is a modern exposition of his paper here.

4) In 1997 McKay and Radziszowski showed \(R(5) \le 49\). The paper is here. This used some clever math and lots of computer time. This paper also has a more complete history of R(5) up to 1997 than I have in this post. (Radziszowski also maintains a dynamic survey of small Ramsey numbers here.)

5) In 2017 Angeltveit and McKay showed \(R(5) \le 48 \). This used some clever math and lots of computer time. The paper is here. That is the arxiv version. The journal version is behind a paywall; however, if you want to reference it and need the journal information, the link is here.

6) In 2024 Angeltveit and McKay showed \(R(5) \le 46 \). This used some clever math and a big computer search. The paper is here. (ADDED LATER - they used Glucose, a SAT Solver.)
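As promised, here is a small sketch (mine, not from the papers above) that mechanically applies the recurrence from point 1, optionally with the parity refinement from point 2 applied to the derived bounds, and reproduces the bounds of 70 and 62:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def upper(a, b, parity):
    """Upper bound on R(a,b) via R(a,b) <= R(a-1,b) + R(a,b-1), with base
    cases R(2,b)=b and R(a,2)=a and the known value R(3,3)=6."""
    if a == 2: return b
    if b == 2: return a
    if (a, b) == (3, 3): return 6
    x, y = upper(a - 1, b, parity), upper(a, b - 1, parity)
    # Point 2: save 1 when both quantities on the right are even.
    return x + y - (1 if parity and x % 2 == 0 and y % 2 == 0 else 0)

print(upper(5, 5, False))  # 70, from point 1 alone
print(upper(5, 5, True))   # 62, with the parity refinement
```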

COMMENTS ON ALL THIS

1) It is widely believed that R(5)=43. 

2) I have been asked if AI or SAT Solvers will help. I asked Radziszowski, Angeltveit, and McKay, and they all thought NO. There is just no way around ALL those possibilities.

Lance: Getting 46 via an extensive computer search using "30 years of CPU time." Hard to believe AI and SAT Solvers won't play some role in future advances. 

Bill: Some problems are just really hard for SAT Solvers. Getting \(R(5)\le 45\) may take 3000 years of CPU time. So it may not happen for a while, or ever.

Bill: How about a wager? If \(R(5)=43\) is proven by 2040 then you win; otherwise I win.

Lance: Let's make it an exact value for R(5). "Widely believed" doesn't always mean "true". What do we win?

Bill: Bragging rights! And the loser buys dinner.

(ADDED LATER - A commenter left a comment which I take to be asking the following:

Commenter: Since SAT Solvers did play a role in the new result, why are you skeptical that they will play a role in getting further improvements, and perhaps in finding R(5)?

Bill: Even with SAT Solvers, it is taking more and more time. Do you want in on the bet? If so then email me privately. 

)

3) IF there is some clever math that will help a lot, then an AI or a human may find it. But I am skeptical this will happen.

4) I am surprised the bound went from 48 to 46 without a paper about 47. 

5) Has any nice math come out of trying to find R(5) (and other concrete values)? 

a) YES- the work on R(k) for large k has led to nice math.

b) NO- the results above on R(5), and the math for other concrete values, have largely been clever and computational, but nothing that generalizes that much.

6) One of the Anonymous commenters pointed this out: here


Thursday, September 26, 2024

LeetCode and AI

I often do LeetCode problems for fun. This site mainly provides short coding problems for students and others to train for the kinds of questions that come up in technical job interviews. I use the site to keep up my programming skills, and it often requires clever algorithms and data structures. The Daily Question is like a Wordle for recreational programmers. Try this problem, which asks you to create a data structure for sets with insert, delete and get-random-element operations in expected constant amortized time.
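If you want a hint before trying it: the standard approach (a sketch of the usual idea, not an official solution) pairs a dynamic array with a hash map, and deletes by swapping with the last element:

```python
import random

class RandomizedSet:
    """Set with insert, remove and get-random-element, each expected O(1):
    a list gives O(1) random access, a dict maps each element to its slot."""
    def __init__(self):
        self.items = []   # the elements, densely packed
        self.index = {}   # element -> its position in self.items

    def insert(self, val) -> bool:
        if val in self.index:
            return False
        self.index[val] = len(self.items)
        self.items.append(val)
        return True

    def remove(self, val) -> bool:
        if val not in self.index:
            return False
        # Overwrite val's slot with the last element, then drop the tail.
        i, last = self.index[val], self.items[-1]
        self.items[i], self.index[last] = last, i
        self.items.pop()
        del self.index[val]
        return True

    def get_random(self):
        return random.choice(self.items)  # uniform over current elements
```

The swap-with-last trick is the whole puzzle: it keeps the list densely packed, so deletion never leaves a hole and random access stays uniform.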

I have to turn off GitHub Copilot, otherwise it will give me the solution before I even finish typing the function name. There are so many solutions to these problems out there and in the training sets for LLMs.

A student asked me last week why he should do LeetCode problems if AI can solve them all. I responded that doing the problems (and, more importantly, CS homework) gives you the skills to understand code and algorithms. In your future jobs you'll encounter problems AI may not solve fully, or correctly, or efficiently, and having those skills will allow you to solve the kinds of problems AI alone can't tackle.

But is this the right answer as AI continues to improve? Ideally we want to create students who transcend AI instead of being replaced by it. For that they need to fully understand programming and computing and be smart enough to know when and when not to outsource that skill to AI. That's the challenge for CS departments: teaching students how to use AI to make themselves better computer scientists without relying on it. It's a hard balance in a technology where we can't predict AI's capabilities at the time these students graduate.

Monday, September 23, 2024

I thought I knew what pizza was...

 On Page 75 of 

The Existential Theory of the Reals as a Complexity Class: A Compendium

by Marcus Schaefer, Jean Cardinal, Tillmann Miltzow

(see here for the paper) 

I came across the following definition:


a pizza is a mass distribution (measure) on \( [0,1]^2 \) that can be computed for polygonal subsets using arithmetic circuits.

Okay. Can I add green peppers and mushrooms?

There are many words in English that we use for math terms. How close are their English-meaning and their Math-meaning?

Greedy and Girth are pretty close.

Pizza, not so much. 

Forcing is a bit odd: when we do a forcing argument we are (say) creating a model of set theory where CH fails. I don't think the model minds that, whereas the term forcing in English usually means the person being forced does not like being forced.

Dynamic Programming sounds like the programmer is typing really fast.

Divide and Conquer- that matches pretty well.

Games. There are some math papers about actual games people play. And there are also math papers that inspired Darling to say Math Games are NOT Fun Games! I blogged about that a few times, most notably here.




Wednesday, September 18, 2024

Embracing the Future of Manufacturing


Last week I walked around the International Manufacturing Technology Show with 89,000 other participants at McCormick Place in Chicago. I went to see how AI was affecting manufacturing, and it wasn't quite the way I was expecting.

The sheer scale of physical machinery is overwhelming: from elaborate manufacturing equipment and numerous robots to water jets performing precision cuts on metal. You also notice the international aspects--the big booths belonged to German, Japanese and Taiwanese companies, not so much American ones. The US leads the world in many things, but manufacturing technology is not one of them.

There was one exception--the three big cloud providers: Google, Microsoft and AWS (Amazon). All, especially Microsoft, have dedicated cloud services for manufacturing. All were touting their AI services for manufacturing, from predictive maintenance to Edge AI to generative AI diagnosing machine failures. It was less clear if the audience was listening.

When I walked around a Health Tech show last year, one could feel the focus on data and the excitement about AI even just a few months after ChatGPT went public. At IMTS the feeling was different. The most exciting "new technology" was letting supervisors get stats and adjust the machines remotely from the beach, with exhibitors offering retrofits so old machines could do the same. One exhibitor showed me an app where a line worker could show a video to a distant supervisor to get some help (they need a new app for that?). I suggested that soon the supervisor could be replaced by AI in the app and they gave me a funny look.

The most interesting use of AI came from a small Oregon-based company, Machine Research, which uses AI to help small manufacturers create bids from proposals. It's a small operation--just seven employees, basically only two developers working on the cloud.

Like academia, the manufacturing industry is slow to change. Replacing old machines and products is expensive--you can't just do a software update. Both presidential candidates are pushing to increase manufacturing in the US, but unless we can take advantage of our technical strengths, embracing the future and Industry 4.0, we'll find it hard to compete. Tariffs and financial incentives will only take us so far.

Sunday, September 15, 2024

How will ChatGPT affect Homework?

LANCE: I gave my final exam for my ugrad theory course (regular, context-free, P, NP, decidable, undecidable) to the new ChatGPT o1, which claims to reason about math, to see how it would do.

BILL: How did it do?

LANCE: Impressive--beat most of my students. Let's just say I could write a pretty good letter of recommendation.

BILL (arrogantly): I am a great problem maker, so this won't affect me. I ask problems in Grad Ramsey that ChatGPT would crash and burn on.

LANCE: Okay big-shot, email me some HWs from Ramsey and we'll see how ChatGPT does on them.

BILL: Okay! 

[Bill emails Lance the first two HW assignments and predictions on how ChatGPT will do.

Bill predicts that it will do well on the standard problems, but crash and burn on unusual problems that (key!) are not on the web--concepts that Bill invented just for his course. Bill is mostly correct in these predictions.]

LANCE: But how did Chatty do compared to the students who took the course?

BILL: About the same, with one caveat: when ChatGPT gets something wrong it's really weird--it has the look and feel of a proof but what it outputs is something no human would ever write. So it does badly on getting partial credit. And one other thing: when asked to do a proof A CERTAIN WAY or SIMILAR TO WHAT I DID IN CLASS, it does badly.

LANCE: I wonder if the same holds for Ramsey courses taught at other universities. Wait, is there even anyone else who teaches Ramsey theory?

BILL: Very funny. In your theory course I assume students can't use ChatGPT on an exam, so this is not really a problem. But for HW it might be. It also brings up the analog of the question

do we allow students to use slide rules on an exam

LANCE: Slide rules? Your age is showing.

BILL: I am only 2 years older than you. Or is it 3? I forget these things.

LANCE: Yes, I would allow students to use ChatGPT as long as they acknowledge they did so, write up their own solution, and take responsibility for any mistakes. Problem is they will just cut and paste.

BILL: Let's break this down course by course:

GRAD RAMSEY:  The students want to be there and are very good. My hope is that 

1) For standard problems they won't need ChatGPT

2) For unusual problems ChatGPT won't help. And I am good at making up unusual problems.

3) In any case, if they use it and they UNDERSTAND the answer and put it into their own words, I am okay with that.

HONORS DISCRETE MATH: Similar to Ramsey though I am a bit nervous.

UNDERGRAD THEORY (similar to Lance's course): I would like to think it's similar to Ramsey Theory. However, I may be wrong.

ALL COURSES:

There is a more profound problem here that applies to all courses: the students may confuse getting the answer from ChatGPT with understanding. 

LANCE: ChatGPT is a hell of a lot better at solving problems than it was last week. Maybe ChatGPT won't help on the unusual problems today, but that might not last for long. Perhaps the solution is to just give up on assessments and stop grading altogether. Students can learn if they want to and waste their money if they don't.

BILL: That kind of change in how we educate is above my pay grade. You are a Dean, so it may be at your pay grade.

LANCE: Nope, even deans have their limits.