Sunday, November 29, 2015

One more sign that a Journal is bogus. Or is it?

I went to an AMS conference with my High School Student Naveen (who presented  this paper)  and an ugrad from my REU program Nichole (who presented this paper). Note that this is a math conference- low prestige, mostly unrefereed, parallel sessions, but still a good place to pick up some new problems to work on and learn some things, and meet people.

Both Naveen and Nichole later got email from a journal urging them to submit their work! They were also invited  to be on the Editorial Board! I can understand inviting a HS student who is a SENIOR to be on an editorial board, but Naveen is a SOPHMORE!

Naveen and Nichole  both emailed me asking what this was about and I looked around and found, not to my surprise, that the journal is an author-pays journal of  no standards. On the other hand, it was open access, on the other other hand, they had an article claiming that R(4,6)=36 (might be true, but if this was known, I would know it, see this blog post for more on that proof technique).

The pay-to-publish model is not necc. bad, and is standard in some fields, but is unusual for math. Of perhaps more importance is that the journal had no standards. And they prey on the young and naive.

Or do they?

Consider the following scenario:  Naveen  publishes in this journal and this publication is  on his college application, hence he gets into a good college with scholarship. He knows exactly what he is buying. Or Nicole does this to help her Grad school Application.  Or I do this for my resume and get a salary increase. Or an untenured prof does this to get tenure ( Deans can count but they can't read). And it gets worse--- the school gives the prof tenure, knowing the papers are bogus, but now they can say they have a prof who publishes X papers a year! At some point I don't know who is scamming who.

This blog post (not mine) gives several criteria for when a journal is bogus. I'll add one: When they ask a 15 years old to be on their editorial board.

Monday, November 23, 2015

Star Trek Computing

In the wake of Leonard Nimoy's death last February, I decided to rewatch the entire original Star Trek series, all 79 episodes. I had watched them each many times over in high school in the 70's, though the local station removed a scene or two from each episode to add commercial time and I often missed the opening segment because I didn't get home from school in time. Back in those stone ages we had no DVR or other method to record shows. I hadn't seen many episodes of the original series since high school.

Now I can watch the entire episodes whenever I want in full and in order through the magic of Netflix. I finished this quest a few days ago. Some spoilers below.

I could talk about the heavy sexism, the ability to predict future technologies (the flat screen TV in episode 74), the social issues in the 23rd century as viewed from the 60's, or just the lessons in leadership you can get from Kirk. Given the topic of this blog, let's talk about computing in Star Trek which they often just get so wrong, such as when Spock asks the computer to compute the last digit of π to force Jack-the-Ripper to remove his consciousness from the ship's computers.

Too many episodes end with Kirk convincing a computer or robot to destroy itself. I'd like to see him try that with Siri. In one such episode "The Ultimate Computer", a new computer is installed in the Enterprise that replaces most of the crew. A conversation between Kirk and McCoy sounds familiar to many we have today (source).

MCCOY: Did you see the love light in Spock's eyes? The right computer finally came along. What's the matter, Jim?
KIRK: I think that thing is wrong, and I don't know why.
MCCOY: I think it's wrong, too, replacing men with mindless machines.
KIRK: I don't mean that. I'm getting a Red Alert right here. (the back of his head) That thing is dangerous. I feel. (hesitates) Only a fool would stand in the way of progress, if this is progress. You have my psychological profiles. Am I afraid of losing my job to that computer?
MCCOY: Jim, we've all seen the advances of mechanisation. After all, Daystrom did design the computers that run this ship.
KIRK: Under human control.
MCCOY: We're all sorry for the other guy when he loses his job to a machine. When it comes to your job, that's different. And it always will be different.
KIRK: Am I afraid of losing command to a computer? Daystrom's right. I can do a lot of other things. Am I afraid of losing the prestige and the power that goes with being a starship captain? Is that why I'm fighting it? Am I that petty?
MCCOY: Jim, if you have the awareness to ask yourself that question, you don't need me to answer it for you. Why don't you ask James T. Kirk? He's a pretty honest guy.

Later in the episode the computer starts behaving badly and Kirk has to convince it to shut itself down. But what if the computer just did its job? Is that our real future: Ships that travel to stars controlled only by machine. Or are we already there?

Thursday, November 19, 2015

A Silly String Theorem

First a note on a serious theorem: Babai has posted a video (mp4, 1h 40 m, 653MB) of his first talk on his Graph Isomorphism algorithm.

I was giving a talk on the Kleene star operator (don't ask) and came across this cute little problem. Say a language L commutes if for all u,v in L, uv=vu.

Show that L commutes if and only if L is a subset of w* for some fixed string w.

Here w* is the set of strings consisting of zero or more concatenations of w with itself. The if case is easy, but I found the other direction pretty tricky and came up with an ugly proof. I found a cleaner proof in Seymour Ginsburg's 1966 textbook The Mathematical Theory of Context Free Languages which I present here.

 If u=wi and v=wj then uv=vu=wi+j.

 Trivial if L contains at most one string. Assume L contains at least two strings.

Proof by induction on the length of the shortest non-empty string v in L.

Base case: |v|=1
Suppose there is an x in L with x not in v*. Then x = vibz for b in Σ-{v}. Then the i+1st character of xv is b and the i+1st character of vx is v, contradicting xv=vx. So we can let w=v.

Inductive case: |v|=k>1.

Lemma 1: Let y=vru be in L (u might be empty). Then uv=vu.
Proof: Since yv=vy we have vruv=vvru=vrvu. So vu=uv.

Let x in L be a string not in v* (if no such x exists we can let w=v). Pick j maximum such that x=vjz with z not the empty string. By Lemma 1 we have zv=vz. Note |z| < |v| otherwise v is a prefix of z, contradicting the maximality of j.

Lemma 2: For all y in L we have yz=zy.
Proof: As before let y = vru. By Lemma 1, uv=vu
Since yx=xy we have vruvjz = vjzvru
vruvjz = vruzvj by swapping v's and z's.
vjzvru = zvruvj by swapping v's and z's, and v's and u's.
So we have  vruzvj = zvruvj
or yzvj = zyuj and thus yz=zy.

By Lemma 2, the set L∪{z} commutes. Since 0 < |z| < |v| by induction L∪{z} is a subset of w* for some w so L is a subset of w*.

Monday, November 16, 2015

Is the word Quantum being used properly by civilians? Understood by them? You know the answer.

I've seen the word `civilians' expanded in use from non-military to non-X for some X. Not sure I've ever seen `civilians' mean `people who don't do math stuff' until the title of todays post. Well, there is a first time for everything.

 I've blogged in the past about the use of the word Quantum (here) . The phrase Quantum Leap  means a BIG leap, where as Quantum stuff is small. Though, to be fair, the discovery (invention?) of Quantum Mechanics was a big leap. So maybe that IS proper use. The James Bond movie Quantum of Solace uses Quantum to mean small, the ONLY time I've seen Quantum used to mean small, so Kudos to the title of an absolutely awful movie. Commenters on my prior blog on the subject pointed out that the original meaning of
quantum was quantity or amount without regard to size of discreteness. I think using it that way now would be very rare.

I came across a theatre called Quantum Theatre. What is Quantum Theatre?  The following are actual quotes from their website.

Quantum artists mine all  kinds of non-traditional spaces for the sensory possibilities they offer when combined with creative design.

 We find it meaningful to place the audience and performer together, the moving parts inside the works.

 We want to move people with our experiements.

The shows run the gamut from those you thought you knew but now experience like never before, to shows that didn’t exist until their elements mixed in our laboratory.

 I came across this article in a place it didn't belong-- in the middle of an article about Google and NASA trying to build a quantum computer (see here.) This news might be exciting but the article was full of mistakes and bad-signs so I'm not to excited about it. Plus the reference to Quantum Theatre is just odd.

The BEST use of the word quantum that I've heard recently was in the episode The Ricks must be crazy of the excellent TV show Ricky and Morty:

The car is broken

Morty (a 14 year old): W-Whats wrong Rick? Is it the Quantum carburetor or something?

Rick (his grandfather, a brilliant scientist): Quantum carburetor? You can't just add a Sci-fi word to a car word and hope it means something.
W-what's wrong, Rick? Is it the quantum carburetor or something?

Read more at:
W-what's wrong, Rick? Is it the quantum carburetor or something?

Read more at:

Thursday, November 12, 2015

A Primer on Graph Isomorphism

I spent 14 years on the faculty at the University of Chicago. I know László Babai well, we collaborated on some of my best known work. I also know Ryerson 251, a room where I've seen hundreds of talks and given more than a few myself. So I could imagine the excitement in that room on Tuesday as Babai gave the most anticipated talk in the history of theoretical computer science, the first of several talks Babai is giving on his new algorithm for graph isomorphism [Video]. Gabriel Gaster extensively live tweeted the event. Jeremy Kun has some details.

For this post instead of doing a bad job trying to overview Babai's proof, I'll explain the graph isomorphism problem and why it is important in the complexity world.

Suppose we have two high schools, say HS North and HS South, each with 1000 students. Consider a diagram (or graph) containing a point for each student at HS North with lines between students who are facebook friends, and a similar diagram for HS South. Is there a 1-1 map from the students at HS North to the students at HS South so that these diagrams look identical? That's the graph isomorphism problem.

To determine whether these two graphs were isomorphic you could look at all the possible mappings between students, but that's 1000! or more than 4x102567 possible maps. There has long been known how to search a smaller number of possibilities, especially if we put some restrictions on the diagrams, but always exponential in n (the number of students) in the worst case. Ideally we'd like a polynomial-time algorithm and Babai gets very close, an algorithm that runs in time nlogkn time for some fixed k.

Graph Isomorphism is one of those few problems, like factoring, known to be in NP but not known to be in P or NP-complete. Graph non-isomorphism is the poster child for the class AM, the problems solved by a randomized verifier asking a single question to a powerful prover. Graph non-isomorphism in AM implies that Graph Isomorphism is not likely NP-complete and that under reasonable derandomization assumptions that Graph non-isomorphism is in NP. Kobler, Schöning and Toran wrote a whole book on the computational complexity issues of graph isomorphism.

Even small progress in graph isomorphism creates waves. At a theory conference in the late 80's a speaker caused a major stir when he casually mentioned he had a proof (he didn't) that Graph non-isomorphism was in NP. Babai's announced caused a huge tsunami and those of us who know him realize he wouldn't make such an announcement without being sure he has a proof. The talk put together a large number of ideas from combinatorics and graph theory. My impression is that those who saw the talk didn't leave convinced of the proof, but did feel Babai had found the pieces to make it work.

With Babai's breakthrough algorithm, the smart money says now that graph isomorphism sits in P. It took decades to get from quasipolynomial time to polynomial time for primality testing and the same time frame may be needed to get polynomial time for graph isomorphism. But it will likely happen and the complexity of graph isomorphism then gets a whole lot simpler.

A couple of thoughts: All the breakthrough results that I can remember were released as papers, ready to devour. This is the first result of this caliber I remember being announced as a talk.

Also we think of theory as a young person's game, most of the big breakthroughs coming from researchers early in their careers. Babai is 65, having just won the Knuth Prize for his lifetime work on interactive proofs, group algorithms and communication complexity. Babai uses his extensive knowledge of combinatorics and group theory to get his algorithm. No young researcher could have had the knowledge base or maturity to be able to put the pieces together the way that Babai did.

More on Babai and graph isomorphism from Scott, Luca, Bill, TimDick and Ken, RedditScience News, New Scientist and Science.

Sunday, November 08, 2015

Looking forward to the GI result

As you all know Laszlo Babai will give a talk Tuesday Nov 10 on a result: GI in  quasipolynomial time (2 to a polylog). Other theory blogs have already commented on this   (GLL,In Theory,Shetl-Opt)

When I was in Graduate School (early 1980's) it looked like GI would get into P. Three key results: graphs of bounded degree, graphs of bounded genus, graphs of bounded eigenvalue multiplicity, were all shown to be in P. These results used group theory and linear algebra in serious ways so the thought was that more advanced group theory, perhaps the classification  of all finite simple groups (CFSG)  would be used to show GI in P.  If CFSG was used in an algorithm for GI in P then the algorithm might have an enormous constant, but that's okay since the quest for GI in P is not for practical reasons (I think that all the  people who want to solve it already have fast enough algorithms) . I was looking forward to the debates on if an algorithm with that big a constant would count as poly time (I would say yes). But no algorithm for GI in P, using CFSG or not, was found.  Also note that it was possible (and is still possible) that GI is NOT in P. In my 2012 survey of P vs NP I also asked about GI. Of the 20 people who responded 14 thought GI is in P, 6 thought not. Note that GI is not known to be in co-NP or BPP. It IS in IP[2], and if GI was NPC
then PH collapses, so it is unlikely that GI is NP-complete. This tells us NOTHING about whether it is in P.

An analogous thing happened with PRIMES.  Lots of results that got it almost in P, lots of hard number theory used, but then the progress stopped. Two differences: Most people thought PRIMES IS in P (likely because it's in coNP and BPP),and this was finally proven in 2002.

Is Babai's proof  likely to be correct? There are three parameters to consider when looking at that question: (1) Who is making the claim?, (2)  How likely the claim is to be true?, (3) How likely the claim is to be provable with what is known today?.You can give different weights to each one in different combinations..

Laszlo Babai is an expert in GI with a track record in the area. (reading that over again it undersells the case- Babai is, as the kids say, awesome!) That GI is in quasipolynomial time is quite believable. Even those 6 who think that Gi is not in P might believe that. Its harder to know if today's techniques are up to the task; however, there are no results like `group theory won't suffice' or any kind of obstacles, so certainly the claim seems provable with todays technology. 

Here's hoping...

Thursday, November 05, 2015

Computation and Journalism Part 2

The theory blogosphere is atwitter with an announcement of a talk next Tuesday at Chicago by László Babai Graph Isomorphism in Quasipolynomial Time. Luca will blog from the scene and we will have a full report as details emerge. 

More from the Comp and Journalism conference

0) (ADDED WRITE BEFORE POSTING) The conference was about how journalists do or should use technology. Part of this is how news gets to people. Twitter and Blogs are one method, as the above note about Babai's talk shows.

1) There was a panel discussion on context When one reads an article it depends on where one is coming from. Here is a strange thing that technology has allowed journalists to do:  An article can have slight differences depending on where its being read. For example if someone from New York is reading it may refer to Broadway, whereas if someone in LA is reading it it might refer to Hollywood. The tech also exists to have the reader (online) have a choice- so you can read it as if you are in place X. This scares me a bit- what if an article is can be tinkered with to appeal to lefthanded latino lesbians who work on the local
Lovasz Lemma.  Isn't our country fragmented enough?  STILL, the point that context matters came through nicely.

2) There was a panel of three ACTUAL JOURNALISTS (I think that all of the other speakers were academics) who used BIG DATA or MACHINE LEARNING or FILL IN BUZZWORD to write articles.  Propublica, a nonprofit newspaper (aren't they all?) did an article about rating doctors (here ) that seems to be very well checked (example- if a patient has to come back again because of a complication, is it the surgeons fault? hard to answer).  However, some doctors (those that got low ratings?) complained that there article was not peer reviewed. This brings up a question--- journalists run things by fact checkers and proofreaders, but do not have peer-review. Academics do. Which is the better system?  Another journalist  did a study of the Supreme court and found that there is a small set of lawyers  that are well connected (one used to play darts with Alito) who are involved in  over half of the cases the court sees. What interested me is--- what if you have  an idea for an article and you do ALL of the number crunching, bring in ML people  to help and find out in the end OH, the Supreme court actually is doing a fair job or OH, Doctors are pretty good. No story! They said this happens about 1/3 of the time where either you don't have enough data for a story or the story is not really a story.

3) There was a paper about YELP ratings of doctors. It seems that the actual
doctor-stuff is rarely mentioned, usually it was how long the wait time was,
how boring the magazines in the office were, etc.  Also, most doctors got
either 1 star or 5 stars. One thing they didn't mention but I will- Walter Palmer, the Dentist who killed  Lion in Africa, got lots of very negative YELP reviews from people who were never patients of his. So YELP ratings can be skewed by things independent of what is supposed to be measured. That is, of course, and extreme case.

4) Key Note by Chris Wiggins who is an academic who, on his Sabbatical, worked for the NYT on issues of How many copies of the paper should we print? and other  Operations Research type questions.   Print  subscriptions still out number online subscriptions, the online is doing pretty well and bringing in money, but they still NEED to find a better business model.I am reminded that when Jeff Bezos bought the Wash Post Stephen Colbert said there are more people buying newspapers than there are people buying newspapers.
(My spellchecker things online is not a word. Everyone else does.)

5) There was a panel on Education. From there point of view very pessimistic---
most schools don't offer courses in how to use Technology in journalism, no real
books, no agreement on what's  important.  And I suspect they will have a hard time updating courses once they have them. I wonder if the early days of Comp Science were like that.

6) The issue of jobs in journalism going away was not quite ignored but also not
quite confronted either. Some people told me that Journalism schools are not being honest with their students about the dismal job market. Then again, they are journalism students- they should investigate!

7) A journalism student told me of a case going to the supreme court that may be
very important for privacy: Spokeo vs Robins: here

8) UPSHOT- I am glad I went, I learned some things outside of what I usually think about.  Unlike the RATLOCC (Ramsey Theory and Logic) I was able to tell my mom what I learned.  I didn't bring my laptop nor had TV access I was able to work on some math and had a minor breakthrough. But that,s for a later blog
(My spellcheck thinks blog is not a word. It also thinks spellcheck is not a word.)

Tuesday, November 03, 2015

Conference on Computation and Journalism (Part I)

 (In honor of Boole's Birthday yesterday see this.)

I recently (Oct 2 and 3 2015) went to a conference on Computation and Journalism (see here for their schedule). I went since I co-blog and have a Wikipedia page (here)  hence I"ve gotten interested in issues of technology and information (not information-theory, but real world information).

I describe some of what I saw in two posts, today and thursday. Many of the articles I mention can be gotten from the schedule page above.

1) I understood all of the talks. This is very different from STOC, FOCS, CCC, etc; however, old timers tell me that back in STOC 1980 or so one could still understand all of the talks.

2) Lada Adamic gave a keynote on Facebook about whether or not Facebook causes an echo chamber (e.g., liberals only having liberal friends) Her conclusion was NO- about 25% of the friends of an L are a C and vice versa. This makes sense since people friend people who are workers and family, not based on political affiliation (Some of my in-laws are farther to the right than... most of my readers.) She also noted that 10% of the articles passed on by L's are Conservative- though this might not be quite indicative since they may often be `look at this crap that FOX news is saying now' variety. Note that Lada works for Facebook. Talking to people over lunch about that the consensus was that (1) the study was valid, but (2) if the conclusion had been that Facebook is tearing apart our country THEN would they have been allowed to publish it? I leave that as a question.

3) There was a Panel on Comments. The NYT has 14 people whose sole job is the soul-crushing job of reading peoples comments on articles and deciding what to let through. Gee, complexity blog just needs Lance and me. I found out that the best way to ensure intelligent comments and to get rid of trolls is (1) people must register, give their real name, and even have a picture of themselves, and (2) engage with the commenters (like Scott A does, though they did not use him as an example). Lance and I do neither of these things. Oh well.(3)COMPUTERS- can we have a program that can tell if an comment is worth posting? Interesting research question. One basic technique is to not allow comments with curse words in them- NOT out of a prudish impulse, but because curse words are an excellent indicator of articles that will not add to the discourse.

The most interesting question raised was Do Trolls know the are trolls? YES- they just like destroying things for no reason. Picture Bart Simpson using a hammer on Mustard packets just cause.

4) There was two sessions of papers on Bias in the news.

a) Ranking in the Age of Algorithms and Curated News By Suman Deb Rob. How should news articles be ranked? If articles are ranked by popularity then   you end up covering Donald Trump too much.  If articles are ranked by what the editors think thats too much of THE MAN telling me whats important.  Also, there are articles that are important for a longer period of time but never really top-ten important. Nate Silver has written that the most important story of the 2016 election is that the number of serious (defined: Prior Senators of Govs) who are running for the Rep nomination is far larger than ever before. But thats not quite a story.

b) The Gamma: Programming tools for transparent data journalism by Tomas Petricek. So there are data journalists that you can see right through! Wow! Actually he had a tool so that you could, using World Bank Data,  get maps that show stuff via colors. His example was the story that China produces more carbon emission than the USA, and a nice color graphic for it, but then the user or the journalist can easily get a nice color graphic of carbon-emms-per-person in which case the USA is still NUMBER ONE! (Yeah!) Later he had  a Demo session. I tried to look at % of people who go to college by country, but it didn't have info on some obscure countries like Canada. Canada? The project was Limited by the data we have available.

c) The quest to automate Fact-Checking by Hassan, Adair, Hamilton, Li, Tremayne, Yang, Yu. We are pretty far from being able to do this, however, they had a program that could identify whether or not something uttered IS checkable . When I am president I will lower taxes is not checkable, whereas Bill Mahr's claim that  Donald Trumps father is an Orangutang is checkable.

d) Consumer and supplies: Attention Asymmetries by Abbar, An, Kwak, Messaoui, Borge-Holthofer. They had 2 million comments posted by 90,000 unique readers and analyzed this data to see if there is an asymtery between what readers want and what they are getting. Answer: YES