Tuesday, November 03, 2015

Conference on Computation and Journalism (Part I)

(In honor of Boole's birthday yesterday, see this.)

I recently (Oct 2 and 3, 2015) went to a conference on Computation and Journalism (see here for their schedule). I went since I co-blog and have a Wikipedia page (here), hence I've gotten interested in issues of technology and information (not information theory, but real-world information).

I describe some of what I saw in two posts, today and Thursday. Many of the articles I mention can be found on the schedule page above.

1) I understood all of the talks. This is very different from STOC, FOCS, CCC, etc; however, old timers tell me that back in STOC 1980 or so one could still understand all of the talks.

2) Lada Adamic gave a keynote on whether or not Facebook causes an echo chamber (e.g., liberals only having liberal friends). Her conclusion was NO: about 25% of the friends of an L are a C, and vice versa. This makes sense since people friend people who are co-workers and family, not based on political affiliation. (Some of my in-laws are farther to the right than... most of my readers.) She also noted that 10% of the articles passed on by L's are Conservative, though this might not be quite indicative since they may often be of the `look at this crap that FOX news is saying now' variety. Note that Lada works for Facebook. Talking to people over lunch about that, the consensus was that (1) the study was valid, but (2) if the conclusion had been that Facebook is tearing apart our country, THEN would they have been allowed to publish it? I leave that as a question.

3) There was a Panel on Comments. The NYT has 14 people whose sole job is the soul-crushing job of reading people's comments on articles and deciding what to let through. Gee, complexity blog just needs Lance and me. I found out that the best way to ensure intelligent comments and to get rid of trolls is (1) people must register, give their real name, and even have a picture of themselves, and (2) engage with the commenters (like Scott A does, though they did not use him as an example). Lance and I do neither of these things. Oh well. (3) COMPUTERS: can we have a program that can tell if a comment is worth posting? Interesting research question. One basic technique is to not allow comments with curse words in them. This is NOT out of a prudish impulse, but because curse words are an excellent indicator of comments that will not add to the discourse.

The most interesting question raised was: Do Trolls know they are trolls? YES, they just like destroying things for no reason. Picture Bart Simpson using a hammer on Mustard packets just 'cause.
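The curse-word technique from item 3 is simple enough to sketch in a few lines of Python. This is only my guess at the general idea, not what the NYT or anyone on the panel actually runs. The name `needs_review` and the (very tame) word list are made up for illustration; a real list would be longer and less polite.

```python
import re

# Hypothetical placeholder list (an assumption, not from the panel).
BLOCKED_WORDS = {"darn", "heck"}

def needs_review(comment, blocked=BLOCKED_WORDS):
    """Return True if the comment contains a blocked word as a whole word."""
    # Lowercase and split into word-like tokens so that "HECK!" matches
    # but a longer word such as "heckler's" does not.
    words = re.findall(r"[a-z']+", comment.lower())
    return any(word in blocked for word in words)

comments = [
    "Nice summary of the panel.",
    "What the HECK is this post about?",
    "Heckler's veto is a real problem.",
]

# Comments that a human moderator would have to look at before posting.
held = [c for c in comments if needs_review(c)]
print(held)
```

Matching on whole words rather than substrings avoids holding back innocent comments that merely contain a blocked word inside a longer one. It says nothing about whether a curse-free comment is actually worth posting, which is the harder research question.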

4) There were two sessions of papers on Bias in the news.

a) Ranking in the Age of Algorithms and Curated News by Suman Deb Roy. How should news articles be ranked? If articles are ranked by popularity then you end up covering Donald Trump too much. If articles are ranked by what the editors think, that's too much of THE MAN telling me what's important. Also, there are articles that are important for a longer period of time but never really top-ten important. Nate Silver has written that the most important story of the 2016 election is that the number of serious candidates (defined: prior Senators or Govs) who are running for the Rep nomination is far larger than ever before. But that's not quite a story.

b) The Gamma: Programming tools for transparent data journalism by Tomas Petricek. So there are data journalists that you can see right through! Wow! Actually he had a tool so that you could, using World Bank data, get maps that show stuff via colors. His example was the story that China produces more carbon emissions than the USA, with a nice color graphic for it. But then the user or the journalist can easily get a nice color graphic of carbon emissions per person, in which case the USA is still NUMBER ONE! (Yeah!) Later he had a Demo session. I tried to look at % of people who go to college by country, but it didn't have info on some obscure countries like Canada. Canada? The project was limited by the data they had available.

c) The Quest to Automate Fact-Checking by Hassan, Adair, Hamilton, Li, Tremayne, Yang, Yu. We are pretty far from being able to do this; however, they had a program that could identify whether or not something uttered IS checkable. `When I am president I will lower taxes' is not checkable, whereas Bill Maher's claim that Donald Trump's father is an orangutan is checkable.

d) Consumers and Suppliers: Attention Asymmetries by Abbar, An, Kwak, Messaoui, Borge-Holthofer. They had 2 million comments posted by 90,000 unique readers and analyzed this data to see if there is an asymmetry between what readers want and what they are getting. Answer: YES.

