Wednesday, March 29, 2023

Alan Turing, The Opera

 

Last Thursday I attended the world premiere of The Life and Death(s) of Alan Turing, a new production from Chicago Opera Theater, composed by Justine Chen with a libretto by David Simpatico.

The opera takes a mostly chronological trip through Turing's life in seven scenes, focusing less on the computer science and more on his struggle with his homosexuality and his prosecution. Turing fiddles with an old computer throughout the opera. The opera ends with a different take on his death (spoiler alert): he attempts to upload his consciousness to escape his chemically castrated body.

Between the scenes, a chorus chants random numbers and words as they float on a scrim, in what the composer calls "chat clouds".

These "chat clouds" transport the listeners with a sonic approximation of internet chatter, filled with information that carries them to the next moment. The aural depiction of these "chat clouds" was inspired by the animated film and television adaptations of Masamune Shirow's cyberpunk manga Ghost in the Shell. Another sonic influence was Walt Disney's fantastical Snow White, one of Alan's greatest obsessions.

I found them reminiscent of the knee plays from Philip Glass's Einstein on the Beach. I really enjoyed these sequences, though another academic I ran into during intermission felt otherwise.

Overall I enjoyed the music and the opera, particularly the courtroom scene in which Turing delivers an impassioned defense that turns out to be all in his head. The cast did well across the board, especially Jonathan Michie, who sang Turing.

Librettist David Simpatico (wearing jacket), Composer Justine Chen (in gown)
conductor Lidiya Yankovskaya behind her and Jonathan Michie (in red robe)

One can't help comparing this opera to The (R)evolution of Steve Jobs, which I saw in Atlanta last year. Both operas chart a metaphysical journey of a computing giant through various important moments of his life. During a Q&A I asked Simpatico about the two and he said he purposely avoided the Jobs opera so as not to affect how he wrote this one. Probably for the best.

Sunday, March 26, 2023

The SIGACT Book Review column list of books it wants reviewed

I am posting this for Nick Tran, who is the current SIGACT Book Review Editor (before him it was Fred Green for about 6 years, and before him it was me (Bill Gasarch) for 18 years. Nobody should have the job for more than 6 years. No TV show should go on for more than 6 years. The 6-year rule probably has other applications.)

Nick asked me to post the list of books that need reviewing. It is most of this post. 

If you spot one you want to review then email him (email address below) the name of the book you want to review and your postal address so he can send it to you or have it sent to you. Here are his specs:

Reviews of recently published or bucket-list books of interest to the TCS community are welcome. Manuscripts (NOTE FROM BILL `manuscripts'? Really? Sounds like the kind of thing you would FAX or postal mail) should be between 3 and 6 pages and include a brief introduction, a detailed content summary, an assessment of the work, and a recommendation to the book's targeted audience. 

Nick's email is ntran@scu.edu

The books are: 

ALGORITHMS

Knebl, H. (2020).  Algorithms and Data Structures: Foundations and Probabilistic Methods for Design and Analysis. Springer.

Roughgarden, T. (2022). Algorithms Illuminated: Omnibus Edition. Cambridge University Press.


MISCELLANEOUS COMPUTER SCIENCE

Amaral Turkman, M., Paulino, C., & Müller, P. (2019). Computational Bayesian Statistics: An Introduction (Institute of Mathematical Statistics Textbooks). Cambridge University Press.

Nakajima, S., Watanabe, K., & Sugiyama, M. (2019). Variational Bayesian Learning Theory. Cambridge University Press.

Hidary, J. D. (2021). Quantum Computing: An Applied Approach (2nd ed.). Springer.

Apt, K. R., & Hoare, T. (Eds.). (2022). Edsger Wybe Dijkstra: His Life, Work, and Legacy (ACM Books). Morgan & Claypool.

Burton, E., Goldsmith, J., Mattei, N., Siler, C., & Swiatek, S. (2023). Computing and Technology Ethics: Engaging through Science Fiction. The MIT Press.


DISCRETE MATHEMATICS AND COMPUTING

O’Regan, G. (2020). Mathematics in Computing: An Accessible Guide to Historical, Foundational and Application Contexts. Springer Publishing.

Rosenberg, A. L., & Trystram, D. (2020). Understand Mathematics, Understand Computing: Discrete Mathematics That All Computing Students Should Know. Springer Publishing.

Liben-Nowell, D. (2022). Connecting Discrete Mathematics and Computer Science (2nd ed.). Cambridge University Press.


CRYPTOGRAPHY AND SECURITY

van Oorschot, P. C. (2020). Computer Security and the Internet: Tools and Jewels (Information Security and Cryptography). Springer.


COMBINATORICS AND GRAPH THEORY

Golumbic, M. C., & Sainte-Laguë, A. (2021). The Zeroth Book of Graph Theory: An Annotated Translation of Les Réseaux (ou Graphes)—André Sainte-Laguë (1926) (Lecture Notes in Mathematics). Springer.

Beineke, L., Golumbic, M., & Wilson, R. (Eds.). (2021). Topics in Algorithmic Graph Theory (Encyclopedia of Mathematics and its Applications).  Cambridge University Press.


PROGRAMMING ETC.

Nielson, F., & Nielson, H. R. (2019). Formal Methods: An Appetizer. Springer.

Sanders, P., Mehlhorn, K., Dietzfelbinger, M., & Dementiev, R. (2019). Sequential and Parallel Algorithms and Data Structures: The Basic Toolbox. Springer.


MISCELLANEOUS MATHEMATICS

Kurgalin, S., & Borzunov, S. (2022). Algebra and Geometry with Python. Springer.


 

Thursday, March 23, 2023

A Strange Hiring Season

In my fall jobs post, I had trouble predicting this season's CS faculty job market, but I didn't expect this: strong supply and strong demand, something we haven't seen since the early '80s when the then-young CS departments were ramping up.

We have a confluence of two forces that have both strengthened since my post back in November: The layoffs and hiring freezes at the major internet companies (Alphabet, Amazon, Apple, Meta and Microsoft) contrasted with major excitement in machine learning. I wrote the jobs post before ChatGPT was released and Amazon and Meta have since announced even more layoffs.

ML-mania is leading to very strong demand from undergrad and grad students for computing degrees. Across the US we have a record number of open faculty slots in computing as universities try to meet that need.

Meanwhile, PhDs who might have gone to industry are going on the academic job market instead. Also some tech researchers who have been laid off or spooked by the layoffs are considering academic jobs.

Between these two forces we will likely have a record number of faculty hires this year but we may see fewer senior people switching universities because good departments can fill their needs with junior faculty.

There is a mismatch of areas. There is big demand in CS departments to hire in machine learning because that is where the student interest is, but ML is not where the big companies are cutting back. If you are on the market this year, position your research as relevant to learning, or at least make clear that you are willing to teach ML courses.

By the way, this post is for computer science. Count yourself extremely lucky if you can get a tenure-track job in a non-computing field.

Sunday, March 19, 2023

New Upper Bound on R(k). WOW!

\(R(k)\) is the least \(n\) such that for all 2-colorings of the edges of \(K_n\) there is a monochromatic \(K_k\)

(so there are k vertices such that the coloring restricted to the edges between those k vertices is constant)


Here is some history. If I miss a reference or a result, let me know in the comments


\(R(k) \le 2^{2k} = 4^k \) seems to be folklore and is a very old result. The proof could be shown to HS students. I have. Or to 9-year-olds who are good at math. I have.


\(R(k)\le {2k-2 \choose k-1}\), which is approximately \(\frac{4^k}{\sqrt k}\), was shown by Erdős and Szekeres in 1935. The proof could be shown to HS students. I have. Or to 9-year-olds who are good at math. I have.
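(For readers who want to play with that bound, here is a small Python sketch; the function names are mine. The Erdős–Szekeres proof rests on the recurrence \(R(a,b) \le R(a-1,b) + R(a,b-1)\) with \(R(1,b)=R(a,1)=1\), and the recurrence solves to exactly the binomial coefficient.)

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def es_bound(a, b):
    """Upper bound on the off-diagonal Ramsey number R(a, b)
    from the Erdos-Szekeres recurrence
    R(a,b) <= R(a-1,b) + R(a,b-1), with base case 1."""
    if a == 1 or b == 1:
        return 1
    return es_bound(a - 1, b) + es_bound(a, b - 1)

# The recurrence solves to the binomial coefficient C(a+b-2, a-1),
# so the diagonal case gives R(k) <= C(2k-2, k-1).
for k in range(2, 15):
    assert es_bound(k, k) == comb(2 * k - 2, k - 1)
```

(Note the recurrence is tight for small k: es_bound(3,3) = 6 and indeed R(3) = 6.)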


Thomason in 1988 got

$$R(k) \le \frac{4^k}{k^A\sqrt{\log k}}$$

for some A. This paper is behind paywalls, so it will be lost to history, and I cannot comment on whether the proof is easy or hard. (If you know of a link to it, let me know and I will provide it.)

Conlon in 2006 got the denominator to be super-polynomial. We omit the result but the paper is here. The proof used the method of quasi-randomness.

Sah  (his paper is here) optimized Conlon's technique to get 

$$R(k) \le 4^{k-c(\log k)^2}.$$

(According to the paper I will point to soon that has the major result, Sah's result is the best one can do with the Thomason-Conlon technique. In Complexity theory we may have formalized that and called it a barrier result.)

These results were interesting and used interesting techniques. However, the best known lower bound is \(R(k) \ge 2^{k/2}\) (I've left out constants), so the REAL questions are

a) Can the 4 be replaced with a smaller number in the upper bound?

b) Is there a number a so that R(k) is essentially \(2^{ak}\)?  

We note in passing that the lower bound was proven by the probabilistic method. That last statement obscures the history: the probabilistic method was invented by Erdős TO PROVE the lower bound. I've seen the probabilistic method called an application of Ramsey Theory, which is not quite right. It's more like Ramsey INSPIRED a technique that is used A LOT, including for things that are actually practical.
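(For completeness, here is the first-moment calculation behind that lower bound. Color the edges of \(K_n\) with 2 colors uniformly at random; the expected number of monochromatic \(K_k\)'s is

$$\binom{n}{k} 2^{1-\binom{k}{2}} \le \frac{n^k}{k!}\, 2^{1-k(k-1)/2},$$

and plugging in \(n = 2^{k/2}\) makes the right-hand side \(2^{1+k/2}/k!\), which is less than 1 for \(k \ge 3\). So some coloring has no monochromatic \(K_k\), hence \(R(k) > 2^{k/2}\).)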

But  back to our story. SO, while all of the above results were interesting, were they making progress towards the REAL question? We can now say YES as progress HAS been made and DID use some of the techniques.

On March 16, 2023, Campos, Griffiths, Morris, and Sahasrabudhe posted a paper on arXiv, here, that showed

$$R(k) \le (4-\epsilon)^k$$

for some (quite small) \(\epsilon\).

Here are some thoughts, which are influenced by my email correspondence with Jacob Fox (an eminent combinatorialist) on this topic.

1) I am NOT surprised that the result is true. 

2) I AM surprised that it's been proven. Lots of brilliant people have worked on this problem for many years, so... why now? Since it's been open for around 70 years, I thought it would take another 70 years to crack.

3) Will this result lead to better results? I hope so!

4) Does R(k) have a nice upper bound? Not clear. The following is unlikely, though something like it may be possible:

R(k) roughly \((3.5)^k\) for k a power of a Fibonacci prime

R(k) roughly \((3.8)^k\) otherwise

5) (Jacob pointed me to this) There is a paper on Book-Ramsey Numbers that also (on page 2) notes a connection to Ramsey Numbers. The paper is here. A conjecture on Book-Ramsey will lead to a big improvement on Ramsey. Those kinds of results are odd since I can't tell if the upshot is

Lets work hard on Book-Ramsey so we can get better bounds on Ramsey!

or

Book-Ramsey is HARD so let's give up. (Who first said "If at first you don't succeed, quit. Why make a damn fool of yourself?"?)

6) The paper with the new result is 57 pages. It looks dense. I will wait for the movie.  I may have to wait a long time. 

Thursday, March 16, 2023

Identities in Computational Complexity

Guest post by Josh Grochow

On the birdsite, Jay Cummings tweeted:


And it got me thinking about identities in computational complexity. At first I thought: how many could there be? 10? After some thought and with some help, there turn out to be quite a few more than that, and I think they make a pretty good list!

Rules/comments on the list:

  • Some are relatively simple alternative characterizations, such as the various definitions of PH; most are pretty nontrivial theorems.
  • I didn't include existentially quantified oracle results (such as "there exists A such that P^A=NP^A"), despite them being some of my favorite results, because there'd be waaaay too many of those. Probably thousands? Ask Lance. In some circles he's known as the "oracle oracle" - as in, he's who you go ask if you want to know if a certain oracle exists. I did include identities involving oracles where one side of the identity didn't have an oracle, such as EXP=NP^{R_{Kt}} or AlmostP=BPP (AlmostP is the class of languages L such that {A : L is in P^A} has measure 1).
  • I also didn't include some important conditional equalities, such as "EXP in P/poly iff EXP=MA" or "NP in BPP iff NP=RP". I guess it's not really an "identity" if it's conditional. Games are only fun if the rules are at least a little constraining!
  • There are some surprising containments that either aren't equalities, or aren't known to be equalities, and I didn't include those either, despite some of them being very important. For example, BPP in Σ2P, other depth reduction results (such as the chasms at depth 4 and 3), and Beigel-Tarui/Yao.

One could teach a class based on a list like this and cover a lot of good ground, but I think it'd be sad to leave out any lower bounds. Actually, by my estimate, if you were teaching from "scratch" (say, starting after an undergrad algorithms class), this list, and all its implied prerequisite material, is already probably the equivalent of about 3-4 semester-long classes!

What other complexity identities did I miss?

Finally, without further ado, a list of (some) identities in computational complexity:

RP∩coRP=ZPP

CLS=PPAD∩PLS [from Paul Goldberg @paulwgoldberg]

quasi-poly Frege = quasi-poly noncommutative formula IPS

Space classes

PSPACE=NPSPACE

NL=coNL

L=SL

DET=L^{GapL} (see Update 1)

Interactive Proofs

NP=PCP(O(log n), 3)

IP=PSPACE

AM = MAM = AMAM = ... [From Albert Atserias @atserias]

MIP=NEXP=PCP(poly,poly)

Alternating Turing machines

Σ_k P = ∃·Π_{k-1} P = NP^{Σ_{k-1} P} = Σ_k TIME(poly(n)) [from Lance]

AP=PSPACE

APSPACE=EXP

Circuit classes within P

NC^1=5-PBP

ACC^0=branching programs over solvable monoids

P=AuxPDA

SAC^1 = LogCFL [from Michael Levet @Michael_Levet]

REG = DSPACE(O(1)) = NSPACE(O(1)) [from Levet]

REG=DTIME_{1-tape}(o(n log n))

AC^k = O(log^k n) time on a CRCW PRAM

Quantum

QIP=PSPACE

MIP*=CE

QMA_{log}(1)=BQP [from Martin Schwarz @martin_schwarz]

QMA_{log}(poly)=QCMA [ditto]

QAC^0_f = QNC^0_f [from Ale `Scinawa' Luongo @scinawa]

QMA(k) = QMA(2) for any k ≥ 2 [from Sarvagya Upadhyay @sarvagya82]

Algebraic complexity

VP=VNC^2

VDET=VABP=VP_s=VP_{ws}

VP_e=VABP_3

border-VP_e=border-VABP_2

For bilinear maps, tensor rank=Theta(algebraic circuit size)

Logical characterizations

FO=AC^0

AC=NC=FO[poly(log)]

∃SO=NP

P = FO + LFP on ordered structures [thanks to Lance, and Michael Levet]

Kolmogorov-random strings as oracles

EXP=NP^{R_{Kt}}

PSPACE=ZPP^{R_{KS}}

P=COMP∩{dtt reducible to R_{KU}}

Almost-classes

AlmostP=BPP

AlmostNP=AM

AlmostPSPACE=BP_{exp}·PSPACE


Updates
  1. March 17th update, h/t Eric Allender: Using Cook's original definition of DET as problems NC1-reducible to the integer determinant, apparently this equality is not known! See Eric's recent guest column in SIGACT News for details and a $1000-prize related to this question, along with many other interesting open questions.

Monday, March 13, 2023

Problems we assume are hard. Are they? ALSO- revised version of Demaine-Gasarch-Hajiaghayi is posted!

(The latest version of

Computational Intractability: A Guide to Algorithmic Lower Bounds

by Demaine-Gasarch-Hajiaghayi is posted here. It's new and improved: (1) we have made all or most of the corrections sent to us by proofreaders, (2) there is a chapter on quantum computing, (3) there is an index. Feel free to read it and send us corrections!

This post is related to it in that most of the problems-assumed-hard mentioned in this post were the basis for chapters of the book.)


We think SAT is hard because (1) it's NP-complete, and (2) many years of effort have failed to get it into P.

Imagine a world where we didn't have the Cook-Levin Theorem. We might still think SAT is hard. We might even take it as a hardness assumption to prove other things hard by showing SAT \le A. We might also be curious whether various problems are equivalent to SAT and have lots of reductions A \le SAT. The reductions might be Karp or Cook.

You do not have to imagine this world! We already have it, in different contexts. In other areas of complexity theory there are problems that are assumed hard but for which there is no analog of Cook-Levin. Are they hard? They seem to be, but the evidence is empirical. Not that there's anything wrong with that.

I will list out problems that 

a) we assume are hard, for some definition of "we" and "hard", and

b) we have NO proof of that and NO completeness or hardness result. (ADDED LATER: a commenter wanted me to clarify this.)

For SAT there is a well-defined set of problems, NP, defined independently of any particular problem (that is, NP was NOT defined as all sets Karp-reducible to TSP or anything of the sort), and by Cook-Levin we have

if SAT is in P then NP is contained in P.

For FPT (the class the commenter was interested in) there IS the result

if weighted k-SAT is in FPT then W[1] is contained in FPT.

but W[1] is DEFINED as the set of problems FPT-reducible to Weft-1 circuits (some books use a different basic problem), which IN MY OPINION is not a natural class. One may disagree with this of course.



COUNTERARGUMENT: but W[1] was defined as all problems FPT-reducible to Weft-1 circuits. And this is not the same.

(I do not include hardness assumptions for crypto because that's not quite the same thing. Also there are just so many of them!) 

I am sure I missed some- which is where YOU come in! Please leave comments with additional problems that I forgot (or perhaps did not know) to include. 

1) 1vs2 cycle: Given a graph that you are guaranteed is either a single cycle or the union of 2 cycles, determine which is the case. This is assumed to be hard to parallelize (we omit the details of defining that formally). This has been used for lower bounds in parallelism. See here.

2) 3SUM: Given an array of integers x_1,...,x_n, are there 3 that add to 0? There is an O(n^2) algorithm. The hardness assumption is that, for all epsilon, there is no O(n^{2-\epsilon}) algorithm. This assumption has been used to get lower bounds in computational geometry. See the Wikipedia entry here or the introduction of this paper here. (ADDED LATER: a 2-pager on some algorithms for 3SUM, including some randomized ones: here. A commenter asked about randomized algorithms. Perhaps the conjecture should be that no randomized algorithm is subquadratic.)
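(If you want to see the quadratic algorithm, here is the standard sort-and-two-pointers approach, sketched in Python; the names are mine.)

```python
def has_three_sum(nums):
    """Return True if some three entries of nums (at distinct
    indices) sum to 0.  Classic O(n^2) algorithm: sort, then for
    each fixed element scan the rest with two pointers."""
    a = sorted(nums)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return True
            if s < 0:
                lo += 1   # total too small: move left pointer up
            else:
                hi -= 1   # total too big: move right pointer down
    return False
```

The hardness assumption says you cannot do substantially better than this quadratic scan.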

4) APSP (All Pairs Shortest Paths): Given a graph G, find for each pair of vertices the length of the shortest path between them. There is an O(n^3) algorithm. The hardness assumption is that, for all epsilon, there is no O(n^{3-epsilon}) algorithm. This assumption has been used to get lower bounds on graph problems. For more details see the introduction of this paper: here
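(The O(n^3) algorithm is Floyd-Warshall; here is a minimal Python sketch, assuming the graph is given as an adjacency matrix with float('inf') for missing edges and 0 on the diagonal.)

```python
def all_pairs_shortest_paths(w):
    """Floyd-Warshall.  w is an n x n matrix of edge weights,
    with float('inf') where there is no edge and 0 on the diagonal.
    Returns the matrix of shortest-path lengths, in O(n^3) time."""
    n = len(w)
    d = [row[:] for row in w]  # copy so we don't clobber the input
    for k in range(n):         # allow intermediate vertices 0..k
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```

The hardness assumption says that the three nested loops are essentially unavoidable.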

5) Weighted-SAT-k: Given a Boolean formula (it can be taken to be in 2CNF form), is there a satisfying assignment that has exactly k of the variables set to TRUE? This is assumed to not be fixed-parameter tractable (that is, there is no function f such that the problem can be solved in O(f(k)n^{O(1)}) time). Problems that are FPT-equivalent to it are called W[1]-complete and are all thought to not be in FPT. W[2], W[t] are also defined, but we omit this. W[1]-complete has also been defined in other ways, but I can't seem to find a Cook-Levin type theorem for them.
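(To see where the f(k)·poly(n) vs n^k issue comes from, here is the obvious brute force in Python: it tries all C(n,k) ways to pick the TRUE variables, so its running time is n^{Theta(k)} rather than f(k)·poly(n). The clause encoding and function name are mine, just for illustration.)

```python
from itertools import combinations

def weighted_sat_k(clauses, n, k):
    """Is there a satisfying assignment with exactly k of the n
    variables set to TRUE?  A clause is a list of nonzero ints:
    literal i means variable i is true, -i means variable i is false.
    Brute force over all C(n, k) candidate TRUE-sets, so the running
    time is n^{Theta(k)} -- not of the form f(k) * poly(n)."""
    for true_set in combinations(range(1, n + 1), k):
        ts = set(true_set)
        # a clause is satisfied if at least one literal holds
        if all(any((lit > 0) == (abs(lit) in ts) for lit in clause)
               for clause in clauses):
            return True
    return False
```

The W[1]-hardness belief is precisely that the exponent's dependence on k cannot be moved out into a separate factor f(k).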

6) Graph Isomorphism: One of the few problems that is in NP but thought to not be NP-complete and, at least for now, is not known to be in P. Babai has shown it's in quasi-polynomial time (n^{(log n)^{O(1)}}). There is a notion of GI-hard: problems such that, if they are in P, then GI is in P. See the Wikipedia entry here. Most of the GI-hard problems are variants of GI, e.g., graph isomorphism for directed graphs. GI could be in P without unbelievable consequences for complexity and without terrifying consequences for cryptography.
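(For concreteness, the naive algorithm tries all n! vertex bijections; here is a tiny Python sketch, with my own encoding of graphs as edge lists on vertices 0..n-1. Babai's algorithm is, of course, vastly more clever than this.)

```python
from itertools import permutations

def isomorphic(edges1, edges2, n):
    """Naive graph isomorphism for two undirected graphs on
    n vertices (labeled 0..n-1), given as lists of edges.
    Tries all n! bijections: fine for tiny n, hopeless in general."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    if len(e1) != len(e2):
        return False   # different edge counts: can't be isomorphic
    for perm in permutations(range(n)):
        if {frozenset((perm[u], perm[v])) for u, v in e1} == e2:
            return True
    return False
```

The open question is whether anything polynomial-time (or even better than quasi-polynomial) exists in general.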

7) Unique Games Conjecture: I won't define it formally here; see the Wikipedia entry here. From UGC it follows that several approximation results are optimal. Does that argue for UGC being true, having consequences we believe? I would say YES, since in some cases the algorithm that gives a good approximation achieves an alpha-approximation for some weird number alpha, and assuming UGC you get the SAME alpha as a lower bound.

8) Existential Theory of the Reals: Easier for you to read the Wikipedia entry here. It is used for problems that are in between NP and PSPACE. (ADDED LATER: One of the commenters says that Blum-Shub-Smale showed ETR is complete for the real version of NP, so this item should not be on my list.)

9) Orthogonal Vectors Conjecture: See this paper: here. This is used to show problems are not in subquadratic time.
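(The problem itself is easy to state: given n Boolean vectors, is some pair orthogonal? Here is the trivial quadratic algorithm in Python, just for illustration; the conjecture is that for vectors of polylog dimension you cannot beat it by a polynomial factor.)

```python
def has_orthogonal_pair(vectors):
    """Orthogonal Vectors: given a list of 0/1 vectors (tuples of
    equal length), is there a pair with dot product 0?
    The trivial algorithm checks all pairs: O(n^2 * d) time."""
    for i, u in enumerate(vectors):
        for v in vectors[i + 1:]:
            if all(a * b == 0 for a, b in zip(u, v)):
                return True
    return False
```

Many subquadratic-time lower bounds are proved by reducing OV to the problem at hand.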

Possible research directions and thoughts

a) Try to prove a Cook-Levin type theorem for one of these problems.

b) Build classes analogous to the polynomial hierarchy on one of these problems.

c) Ask bounded-query questions. For example: Are k queries to 3SUM more powerful than k-1? (This is a VERY Gasarchian question.)

d) Try to prove that one of these problems is actually hard. That seems hard. Perhaps in a weaker model (that's probably already been done for at least one of them).






Monday, March 06, 2023

Peer Review

I ran into a partner of a computer scientist at a social event who asked me "Is the publication system in CS screwed up or really screwed up?" If you don't know my response you haven't been reading this blog long.

Today let's talk about peer review. Kevin McCurley and Adam Mastroianni have recent, not so positive, takes on this topic. 

Peer review came out of a system where we had limited slots in journals and, in computer science, conferences and we had to make tough decisions. Journals and conferences would gain a reputation based somewhat on how difficult it was to get papers published there.

Now we have basically unlimited space to publish your results. And you definitely should do so, posting your papers on your own webpage and on a paper archive site like arXiv or ECCC. The research community would flourish in a world where everyone posts their papers online for public comment, people can promote their favorite papers on social media and we have a TikTok-style system for recommending papers to you.

So why do we still need peer review? Mostly because we have to evaluate researchers for jobs and grants, and with the value of recommendation letters being questioned, publication quality and quantity have become a stronger proxy for measuring people, for better or for worse.

First of all, peer review is a cornerstone of science. Would you rather have papers reviewed by faceless bureaucrats who know little about the subject area? Or papers ranked only by manipulable statistics like citations?

But the way we apply peer review, to decide acceptances at conferences, just adds too much randomness to the system. CS conferences have multiplied and continue to see increased submissions as the field grows. It's just impossible to maintain any sort of uniformity in the quality of acceptances. Too often, we find conference committees and funding panels playing it safe rather than taking risks with new research. With so much randomness, it's best to try many papers instead of focusing on a stronger result, leading to too much incremental research, especially in academia.

For hiring, promotion, tenure and funding decisions, we too often rely on shortcuts, such as the number of papers accepted to major conferences. Those who don't win the conference lottery get disillusioned and often leave academia for industry, and no one wins.

Thursday, March 02, 2023

Goodbye Dilbert

Scott Adams, creator of Dilbert, had a racist rant in a video he posted last week. As a result most newspapers that carried the comic strip are dropping Dilbert, including our local Chicago Tribune. I fully support these moves. Much as I believe in separating the art from the artist, it's different when the artist is living and profiting from their art.

So we need to say goodbye to Dilbert, marking the end of an era. Dilbert started in 1989 as a strip that captured the absurdities of the workplace in an anonymous tech company, predating movies like Office Space and shows like Better Off Ted and Silicon Valley. I used Dilbert strips (with permission) in my book, namely this strip to introduce Kolmogorov complexity and this strip to describe my research area. Just call me Dan.

Farewell to Dilbert, Dogbert, Wally, Alice, Asok, the pointy-haired boss and the rest. I won't miss Scott Adams, but I will miss his creations.