Sunday, February 18, 2018

The web bad/People using it bad.

I was going to write a post about how hard it was to find what grades mean at different schools (e.g., at UMCP W (for withdraw) means the student dropped the course, but at UT Austin its Q (for Quit?)) but then I found that my Googling

Name of school grade key

I could find it.  Okay, so grade keys are not that hard to find.

WEB BAD:

However, course descriptions are. Both questions (what grades mean and course desc) came up when I do admissions for my REU program.  If a student has a course in Discrete Mathematics that could mean a course where you learn how to prove simple things OR it could mean a course where you proof  the graph minor theorem (actually I doubt that) or something in between. Similar for Principles of Programming languages and Software Engineering in that they could mean a variety of things. I think math and physics are more standardized, but even there you have the problem that Advanced calc could be either multivar or a course with proofs or something else. There is a great related XKCD here.

PEOPLE USING THE WEB BADLY:

Fellow blogger Scott Aaronson recently posted about Vivek Wadhwa's Washington post column on quantum computing (the post is here) The article by Vivek told the tired and utterly false story that QC can be used to solve TSP by looking at all possibilities. Should Vivik have known better by looking at the web? At first I thought YES he should know better. But then I thought -- I should try to see what he might have done. Maybe the web has this one wrong. In fact, some of my students Iwho, I should add, are not writing articles on QC for the Washington Post)  tell me that they don't need to learn the proof of the Cook-Levin theorem since Quantum Computers can solve SAT anyway.  So I decided to pretend I didn't know anything about QC and went to the web to see what I could find.  First stop: Wikipedia. I found the following quote in the QC article:

BQP is suspected  to be disjoint from NP-complete and  a strict superset of P though this is not known.

So, the first place I look I find that BQP is suspected to NOT solve SAT or TSP or... Realize that Wikipedia is not some obscure source of  hidden knowledge. It is literally the first place one would look.  Hence, no excuse, even if other experts told him otherwise (which seems to be the case, though `experts'  should be in quotes) the Wikipedia quote should at least give pause.

So even when Wikipedia gives the right information, not everyone looks there.


Thursday, February 15, 2018

Corporate Education

First of all read the #metoo testimonial going around the TCS blogosphere. Our field is not immune.

Last Sunday Frank Bruni wrote an op-ed column Corporations will Inherit the Earth, an article on how corporations have taken on what governments used to do. Quite a bit is focused on education.
The nimbleness of corporations gives them an edge over hoary, complacent institutions, including those in higher education...In an effort to make sure that employees have up-to-the-minute technical skills--or are simply adept at critical thinking and creative problem solving -- more companies have developed academies of their own. That's likely to accelerate. "I think enterprises like Amazon and Google are going to build universities that teach coding and things the nation needs" said Margaret Spellings, former education secretary and current president of the University of North Carolina system.
Already in the universities we've seen a move towards more majors and minors in STEM and computer science in particular. The changing corporate workplace has already changed academia, putting more an emphasis on technical areas and less on the liberal arts. Will companies though take the next step, running their own schools that focus on the needs of industry? If we see continued decreased government funding for universities and academic research, we may end up with a handful of rich universities and the rest of the population getting education on the job, a future that would be a loss for us all.


Saturday, February 10, 2018

A ``new'' ``application'' of Ramsey Theory/A Truly ``bad'' math song

Ian Parberry once told me (though I doubt he originated it- The first link I found says it was Mark Twain)


to a man with a hammer, everything looks like a nail

Indeed. Since I am teaching a grad course Ramsey theory and its ``Applications'' (I got 24 students, which is more than I thought I would- including around 10 ugrads who are taking it because `all the cool kids are taking it' ) I have been taking the hammer of Ramsey Theory and looking for nails to apply it to. (I'm also using a webiste I found of applications of Ramsey Theory here and an survey article on applications of Ramsey here.)

Thinking about the infinite Ramsey Theorem I came up with this ``application'' :

If p1, p2, p3, ... is an infinite sequence of points in Rn then there exists a subsequence q1, q2,q3,... such that for each coordinate 1 ≤ i ≤ n the projection onto that coordinate is either (a) strictly incresing, (b) strictly decreasing, or (c) constant.

Proof:  For i< j color (i,j) with one of 3^n colors - indicating for each coordinate i if the ith projection is increaing, decreasing, or constant. The infinite homog set gives you the sequence.

End of Proof

One can also proof this from the Bolzano-Weierstrass  theorem (an infinite bounded sequence of points in Rn has a convergent subsequence). We leave that proof to the reader; however, the proof of BW looks like the proof of the infinite Ramsey Theorem, so I'm not sure if my proof is new or not.

I wanted to look into the BW theorem so I googled "Bolzano-Weierstrass"  I think Google knows me better than I know myself since the second hit was https://www.youtube.com/watch?v=dfO18klwKHg which is a Rap Song about the BW theorem (I am a fan of novelty songs, and of math, so does it follow I am a fan of Math Novelty songs. Not sure if its follows, but I AM!)

One of the problems on the HW was BW-rap- good, bad, or really bad?

Answers were:

1) Made my ears bleed

2) Lyrics good, singing really bad

3) So bad its good

4) No, just bad.

I'm in the Lyrics good/singing is `so bad its good' camp. The class was okay with the lyrics, but mostly thought it was so bad its... bad. One person thought it was awesome!

I would like to see  real rapper perform  it on you tube. I doubt I will. Oh well.

Thursday, February 08, 2018

For the Love of Algorithms

Wired magazine labelled 2017 as The Year We Fell Out of Love with Algorithms. The article goes on to talk about how algorithms give us filter bubbles, affect elections, propagate biases and eliminate privacy. The article at the end argues that we shouldn't blame the algorithm but the people and companies behind them.

Every day algorithms decide what music we listen to, what posts we see on Facebook and Twitter, how we should drive from point A to point B, what products we should buy. Algorithms feed my nostalgia in Facebook and Google Photos showing me what once was. Algorithms recognize my voice and my face. We've even created new currencies from algorithms. Algorithms form the backbone of the top six world's largest public companies (Apple, Alphabet, Microsoft, Amazon, Facebook and Tencent). It's been a long time since I only trusted algorithms to format my papers.

Algorithms have taken over nearly every aspect of our lives and our economy. With new machine learning techniques, no longer can the creator of an algorithm fully understand how algorithms work. But we aren't in a Skynet world, algorithms are not sentient and not even inherently bad, any more than we label a group of people inherently bad because of a few examples.

This societal transition from human-oriented decision to algorithmic decision making will continue and will have great benefits such as much safer cars and medical care. We must be vigilant, of course, to how algorithms will change society, but (in a serious sense) I welcome our new machine overlords.

Monday, February 05, 2018

"How can you use your research for ugrad projects?'- the wrong question

I was helping a math PhD who worked in computable ramsey theory prepare his teaching and research statements for his job application. One of the questions various schools wanted to know was

How can you use your research as the basis for undergraduate projects?

His answer was kind of a cheat- Ramsey Theory was combinatorics which ugrads can work on without too much prior knowledge (true), computability theory could be picked up (not true), and the chance to combine the two should thrill ugrads (it doesn't even thrill most Ramsey Theorists!).

I don't fault the PhD student, I fault the school. What they really want to know is:

Can you come up with ugrad projects?

They should realize that some math requires too much bacground and is not appropriate for ugrads AND that a math PhD will have picked up OTHER knowledge for projects. They may relate some to the research (e.g., in the case above just Ramsey Theory) or they may not even be close.

In Comp Sci it is more common for an ugrad to work on research related to a profs research since Comp Sci is a younger field and needs less background knowledge. And students can often code something up of interest related to the profs research.

SO- do you have ugrads work on your research program OR on something else?




Thursday, February 01, 2018

Flying Blind

Many computer science conferences have made a number of innovations such as a rebuttal phase, multi-tiered program committees, a hybrid journal/conference model with submission deadlines spread through the year. Not theoretical computer science which hasn't significantly changed their review process in the major conferences since allowing electronic submissions in the early 90's and an ever growing program committee now at 30 for FOCS.

Suresh Venkatasubramanian learned this lesson when he ran a double blind experiment for ALENEX (Algorithmic Engineering and Experiments) and laid out an argument for double blind at broader theory conferences to limit the biases that go along with knowing the authors of a paper. The theory blogosphere responded with posts by Michael Mitzenmacher, Boaz Barak and Omer Reingold and a response by Suresh. I can't stay out of a good conference discussion so here goes.

Today major orchestra auditions happen behind a screen with artists even told to remove footwear so sounds won't give away the gender of the musician. On the other extreme, the value of a piece of art can increase dramatically in price if it is discovered to be the work of a famous artist, even though it is the same piece of art. Where do research papers lie? It's more a work of art than a violinist in a symphony.

Knowing the authors gives useful information, even beyond trusting them to have properly checked their proofs. Academics establish themselves as a brand in their research and you learn to trust that when certain people submit to a conference you know what you get, much the way you would more likely buy a new book from an author you love.

Suresh rightly points out that having authors names can and do produce biases, often against women and researchers at lesser known universities. But we should educate the program committee members and trust that they can mitigate their biases. We can completely eliminate biases by choosing papers at random but nobody would expect that to produce a good program.

Having said all that, we should experiment with double blind submissions. Because everything I said above could be wrong and we won't know unless we try.

Monday, January 29, 2018

Publishing: Why are we doing it right and other sciences aren't?

(NOTE- this is NOT a `we hate Elsevier and the others' post- though I suspect the comments will be about that.)

Alexandra Elbakyan has created a repository of science papers for scientists to share. She did this because too many papers were behind paywalls. An article about her (and the inspiration for this post) is here. Note that she has had some legal problems.

But it got me thinking: I have rarely had a problem getting an article. Between authors websites and arXiv most of what I want is out there.  Lance and others I've asked agree--- while there are problems with editors and publishes etc, access is NOT one of them.

Why don't other scientists just routinely do this?

some speculation but if someone actually knows please comment

1) Some scientists purposely don't post on line to avoid being scooped. Or the fear of that

2) Some scientists don't post for fear of legal issues with publishing. I do not know of a single computer scientist or mathematician that has gotten in any kind of trouble for posting a paper on their website or arXiv.

3) Some scientists are lazy.

4) (This contrasts with Comp Sci) other fields do things the way they do since they ALWAYS did it that way. Comp Sci is a younger field and hence is less tied to tradition.

5) Other scientists ARE posting their papers (gee, then why did Alexandra have to create the reporistory).

6) Alexandar is where arxiv was X years ago. They'll catch up. But they are having legal problems-- did arXiv ever have legal problems?

Looking over these I'm not sure I believe any of them. Any other ideas?

Again, the question is: Why don't other fields routinely post to something like arXiv or their own websites?

Friday, January 26, 2018

From Art to Science

Q: Why did the neural net cross the road?
A: Who cares along as it got to the other side.

Whit Diffie and Martin Hellman in their classic 1976 paper talk about the transition of cryptography:
Theoretical developments in information theory and computer science show promise of providing provably secure cryptosystems, changing this ancient art into a science.
Indeed we have some very strong theoretical foundations for cryptography, though they still need to rely on unproven hardness assumptions. Often we trade off formal proven techniques for efficiency, in for example AES and Bitcoin. Here we use "science" in another sense: extensive testing of the hardness of these protocols. Creating and breaking cryptographic protocols used to be an art, having the right intuitions to make things work (for now). Today, breaking security rarely involves breaking a cryptographic protocol but rather due to bad implementations and easy-to-guess or phished passwords.

For neural networks we still live in an art regime. We have great methods to minimize the error for a given network (though no formal proofs as such), one still needs to design the right network with the appropriate topology and various tricks like regularization to prevent overfitting. Several members of the theoretical CS community have taken on the task to turn the art into a science but getting us to where we sit in cryptography will take some time. Crytography took thousands of years to get from an art to science, hopefully we can get there faster for machine learning.

This debate came to a head at the recent NIPS conference. Ali Rahimi in his test-of-time talk described the current state of ML research as "alchemy". Yann LeCun gave a strong response
Sticking to a set of methods just because you can do theory about it, while ignoring a set of methods that empirically work better just because you don't (yet) understand them theoretically is akin to looking for your lost car keys under the street light knowing you lost them someplace else.
Imagine if we never did any cryptography until we had the science in place.

Proving theorems in mathematics is an art more than a science. Because of incompleteness in some sense it has to be. If mathematics ever became automated it would be a whole lot less fun.