Thursday, January 31, 2019

Phish Before Turkey

The Chronicle of Higher Education recently published a story Phishing Scheme Targets Professors’ Desire to Please Their Deans — All for $500 in Gift Cards. The same thing happened to me last fall.

Twas the day before Thanksgiving and an email went out to most of the faculty in my department.
From: Lance Fortnow <lancefortnow@yahoo.com>
Sent: Wednesday, November 21, 2018 1:45 PM
To: [name deleted]
Subject: 
Hello,are you available?
At the time I was in New Jersey visiting family. lancefortnow@yahoo.com is not my email. I do own fortnow@yahoo.com but don't email there, I rarely check it.

Some faculty checked with me to see if this is real. One faculty called me to see what I wanted. Once I found out what was happening I sent a message to my entire faculty to ignore those emails.

Some faculty did reply to see what I want. The response:
i need you to help me get an Amazon gifts card from the store,i will reimburse you back when i get to the office.
One of our security faculty decided to follow up and replied "Sure! Let me get them for you. Could you provide more more information? e.g., amount and #cards. I can bring them on Monday." The reply:
The amount i want is $100 each in two (2) piece so that will make it a total of $200 l'll be reimbursing back to you.i need physical cards which you are going to get from the store. When you get them,just scratch it and take a picture of them and attach it to the email then send it to me here ok
He went a few more rounds before the phisher just stopped responding.

A week later, a different faculty member came to my office and said I wanted to see him but he's been out of town. I said it was nice to see him but I didn't ask to talk to him and we figured out the confusion was the phishing email.

Someone went through the trouble of creating a fake email address in my name, looking up the email addresses of the faculty in the department and individually emailing each of them, without realizing computer science professors won't fall for a gift card phishing attack. Or at least none of them admitted falling for it.

Friday, January 25, 2019

The Paradigm Shift in FinTech Computation and the need for a Computational Toolkit (Guest Post by Evangelos Georgiadis)

The Paradigm Shift in FinTech Computation and the need for a Computational Toolkit

(Guest Post by Evangelos Georgiadis)

We are experiencing a paradigm shift in finance as we are entering the era of algorithmic FinTech computation. (**And another yet to come. See **Future** below.)  This era is marked by a shift in the role played by the theoretical computer scientist. In the not so distant past, the (financial) economist had the ultimate stamp of approval  for how to study financial models, pricing models, mechanism design, etc. The economist was the ultimate gatekeeper of ideas and models, whereas the main role of the computer scientist was to turn these ideas or models into working code; in a sense, an obedient beaver/engineer. (In finance, the theoretical computer scientist more often than not wears the hat of the quant.)

In today's era, the role of the theoretical computer scientist has been elevated from the obedient engineer to the creative architect not only of models and mechanism designs but also of entire ecosystems. One example is blockchain based ecosystems. In the light of this promotion from obedient engineer to architect, we might need to re-hash the notion of 'sharing blame', as originally and elegantly voiced in On Quants by Professor Daniel W. Stroock, when things go wrong.)

The role change is also coupled by a shift in emphasis of computation that in turn necessitates a deeper understanding of (what this author would refer to as) distributed yet pragmatic complexity based crypto systems' that attempt to redefine 'trust' in terms of distributed computation.

This change necessitates an ability to think in terms of approximation (and lower/upper bounds)  or other good-enough solutions that work on all inputs,  rather than merely easy instances of  problem types that usually lead to clean, exact formulas or solutions.  Additionally, looking through the lens of approximation algorithms enables a different and often more insightful metric for dealing with intrinsically hard problems (for which often no exact or clean solutions exist.) Computer Scientists are trained in this way; however, financial economists are not.   Might the economists actually get in the way?

Our tentative response: The economists are valuable and the solution to the dilemma is to equip them with the right 'computational toolkit'. Ideally, such a toolkit comprises computational tools and algorithms that enable automation of certain computational tasks which otherwise would necessitate more granular understanding at the level of a theoretical computer scientist (or mathematician)
OR be too cumbersome to perform by hand even for the expert.

Essentially, a toolkit even for the theoretical computer scientist that frees her from clerical work and enables computation to scale from clean cases, such as n=1, to pathological (yet far more realistic) cases, such as n=100000, all the way to the advanced and rather important (agnostic case or) symbolic case when n=k -- without much pain or agony.

The existence of such a toolkit would in turn do justice to the definition of FinTech Computation, which entails applying advanced computational techniques not necessarily information techniques) to financial computation. in fact, this author is part of building such an infrastructure solution which
necessitates the underlying programming language [R-E-CAS-T] to have intrinsic hybrid capabilities -- symbolic as well as numeric.

One step towards this  "automation" conquest is shown in A combinatorial-probabilistic analysis of bitcoin attacks with Doron Zeilberger.  The work illustrates an algorithmic risk analysis of the bitcoin protocol via symbolic computation, as opposed to the meticulous, yet more laborious by hand conquest shown by the European duo in Double spend races Heavy usage of the "Wilf-Zeilberger algorithmic proof theory" one of the cornerstones in applied symbolic computation, enabled automated recurrence discovery and algorithmic derivation of higher-order asymptotics. For example, in terms of asymptotics tools: the ability to internalize a very dense body of mathematics, such as the G.D. Birkhoff and W.J. Trjitzinsky method, symbolically, automates the process of computing asymptotics of solutions of recurrence equations; a swiss army knife for any user.

<**Future**>

What does the future entail for FinTech Computation ?

[My two satoshis on this]

Where are we headed in terms of type of computation ?

Blockchain based systems, even though some of us (including this author) have noticed fundamental flaws, seem to still have momentum, at least, judging from recent news articles about companies becoming blockchain technology friendly.  Ranging from (of course) exchanges such as our friends at Binance and BitMEX, we have major smartphone makers such as SamsungHuawei, and HTC. The favorable sentiment towards blockchain technology is shared even amongst top tier U.S. banks.
 Can one deduce success or failure momentum from the citation count distribution of the paper that laid grounds to this technology ? Bitcoin: A Peer-to-Peer Electronic Cash System)

If we look at crypto(currencies), one of many challenges for these blockchain based systems is the high maintenance cost.  Certainly in terms of energy consumption when it comes to the process of mining -- whether Proof-of-Work (PoW) is replaced by Proof-of-Stake (PoS) or some other more energy efficient consensus variant. (This author is aware of various types of optimizations that have been used.)
A few questions that have bugged this author every since ...

a) Is there a natural way to formalize the notion of energy consumption for consensus mechanisms?

b) What about formalizing an energy-efficient mechanism design ?)

(The idea of savings when PoW is replaced by PoS as intended by our friends at the Ethereum Foundation has been around for some time but the point of this author is, the value of 0.99*X (where X is a supernatural number  [a la Don E. Knuth style]), is still a big quantity; too big for an environmentalist ?)

So, what comes next ?

[... the satoshis are still on the table.]

Daniel Kane has brought to my attention that quantum computation -- the seemingly next paradigm shift in which again the role of TCS seem  inextricably interwoven --  may lead to blockchain based systems being replaced by less expensive (at least in terms of energy consumption) quantum based systems. (Crypto might get replaced by Quantum (money). :-)) One such pioneering approach is masterfully articulated by Daniel Kane in "Quantum Money from Modular Forms.

Thursday, January 24, 2019

Machine Learning and Wind Turbines


My daughter Molly spent six weeks on an environmental program in China last summer. When she got back she had to do a report on machine learning and wind turbines used for clean energy generation. What does machine learning have to do with wind turbines? Plenty it turns out and it tell us a lot about the future of programming.

Sudden changes in wind can cause damage to the blades of the turbine. Maintenance is very expensive especially for turbines in the sea and a broken turbine generates no electricity. To catch these changes ahead of time you can mount a Lidar on top of the turbine.


The Lidar can detect wind gusts from about 100 meters ahead, giving about 10 seconds to react. In that time you can rotate the blades, or the whole turbine itself to minimize any damage. Here's a video describing the situation.


How do you do the computations to convert the Lidar data into accurate representations of wind gusts and then how to best adjust for them? You could imagine some complex fluid dynamics computation, which gets even more complex when you several wind turbines in front of each other. Instead you can use the massive amount of data you have collected by sensors on the turbine and the Lidar information and train a neural network. Training takes a long time but a trained network can quickly determine a good course of action. Now neural networks can always make mistakes but unlike self-driving cars, a mistake won't kill anyone, just possibly cause more damage. Since on average you can save considerable maintenance costs, using ML here is a big win.

I've obviously over simplified the above but I really like this example. This is not an ML solution to a standard AI question like image recognition or playing chess. Rather we are using ML to make a difficult computation tractable mostly by using ML on available data and that changes how we think about programming complex tasks.

Sunday, January 20, 2019

ACM prize and some thoughts on the Godel Prize

(ADDED LATER: deadlines for some of these awards have passed but here are ones
coming up soon:

Godel Prize: Feb 15

Knuth Prize: Feb 15

SIGACT News Dist Service: March 1

)

As Lance Tweeted, and I will re-iterate, nominations for the following prizes
are due soon and you can nominate people here

Godel Prize for outstanding paper in TCS. (Godel mentioned P vs NP in a letter to Von Neumann. I've heard it said that its too bad they didn't work on it-- either it would be solved or we'd know its hard. Frankly, I think enough smart people have worked on it that we already know its hard.)

Knuth Prize for outstanding contributions to foundations of Computer Science. (its a greater honor to have a prize  named after you in your lifetime then to win a prize!)

Dijkstra Prize (I wonder if having `ijk' in his name inspired him to work in algorithms)

Kanellakis Theory and Practice Award.

Lawler Award for Humanitarian Contributions within CS and Informatics.

ACM Distinguished Service Award

Danny Lewin Best Student Paper Award (best student paper at STOC)

The Best Paper award (Best paper at STOC, Best paper at FOCS)

(The last two I doubt you can nominate someone for.)

A few thoughts on the Godel Prize:


1) You can win the Godel Prize twice and some people have: Goldwasser, Hastad, Arora, Szegedy, Spielman, Teng. Spielman-Teng have won it as a team twice.

2) GLAD there is no limit to how many can win. If a paper has a lot of people on it (and this has happened) then FINE, they're all winners! According to The Big Bang Theory (the TV show, not the theory) in Physics at most 3 can win a Nobel Prize in Physics for the same breakthrough in a given year. The show itself shows how stupid the policy is.

3) I either knew and forgot or never knew that DPDA Equiv is decidable! Glad to now it just in time for teaching Automata theory this spring.

4) Looking over the list reminded me that there are some papers in the intersection of those I want to read and those I am able to read! Though not many. Most I want to read but they seem hard.

5) The Kanellakis award is for theory that is PRACTICAL. Could someone win a Godel AND a Kannellakis award for the same paper (or set of papers). I found one sort-of case ( (a) below) and one definite case ( (b) below).

a) Moshe and Wolper won the 2000 Godel Prize for Temporal Logic and Finite Automata (I should also read that before my class starts)

Holtzmann, Kurshan, Vardi, and Wolpert won the 2005 Kanellakis prize for Formal Verification Tools.

I assume the two works are related.

b) Freund and Schapire won the 2003 Godel Prize and the 2004 Kanellakis Award, both for their work on boosting in Machine Learning.

6) Why is it the  Godel Prize and the Kanellakis Award? What is the diff between a prize and an award? A quick Google Search says that an Award is a token of effort and merit, while a Prize is something you win in a competition. I doubt that applies. I suspect they are called Prize and Award from historical accident. Does anyone know?

Thursday, January 17, 2019

The Cost of Privacy

Billboard at 2019 CES

Computer scientists tend to obsess about privacy and we've had a privacy/security debate for decades now. But now machine learning has really given us a whole new spin on what privacy protects and takes away.

I take an open approach and basically allow Google to know everything about my life. Google knows where I've been--sometimes my Pixel asks me which store in a shopping center I visited and I give up that info. Google knows who I communicate with, what websites I visit, what music and movies I listen to and watch, all my photos, what temperature makes me comfortable and so on.

What do I get? A Google ecosystem that knows me sometimes better than I know myself. Google works best when it learns and integrates. I get asked to download maps for trips Google knows I'm about to take. I have Google assistant throughout my house, in my phone, in my car and it tailor answers and sometimes even the questions that I need answers to. If anything I wish there was further integration, like Google Voice should ring my office phone only when I'm in the office.

Georgia Tech now forces us to use Microsoft Exchange for email. Outlook is not a bad email program but its capabilities, especially for search, does not work as well and think of all that unused knowledge.

I trust Google to keep my information safe, with a random password and 2-factor encryption and even if someone would manage to break in they would find I'm a pretty boring person with an unhealthy obsession of opera (the musical form not the browser).

Doesn't work for everyone and companies should make it easy to keep your info secure. But I say go use your machine learning on me and find ways to make my life easier and more fun, and sure send me some targeted ads as payment. The Internets will find a way to discover you anyway, might as well take advantage. 

Tuesday, January 15, 2019

do we ever only care about the decision problem? I know of only one case of that

(I had been thinking of this for a post then Lance's post on search versus decision inspired me to write up these thoughts.)

When teaching NP-completeness we often say

The problem we really care about is, for example, given a weighted graph and two vertices s and t, find the optimal way to go from s to t while hitting every node. But its cleaner mathematically to look at the decision problem:

{ (G,s,t,C) : there is a Ham Path from s to t that costs \le C }

The search and decision are poly time equivalent, so its fine to just look at the decision. Indeed- if our interest in in lower bounds then clearly if Decision is hard then Find is Hard.

But here are some questions about search vs decision in general, not just with regard to P vs NP.

1) Is there ever a case where the real world actually cares about the decision version? I can think of just one- given a number is it PRIME is used in Crypto. The real world does not need the witness that its prime (or similar).  They  just want a prime.  Any other cases?

2) How far apart can search and decision be? NP-decision and NP-search they are poly equivalent. In other domains can they be very far apart? For example, is FINDING a k-clique or k-ind set in a graph on 2^{2k} vertices require roughly n^k steps (go through all k-sets) or can we do much better? I suspect this is unknown but would be delighted if a commenter tells me otherwise.


Wednesday, January 09, 2019

Search versus Decision

Shockingly I've never done a post on search versus decision, one of the more interesting dualities in complexity. In short: Decision: Is there a needle in the haystack? Search: Find the needle.

In Satisfiability, or any other NP-complete problem, the two problems are essentially equivalent. If you can decided SAT you can find a solution (good homework problem) or even the best solution. Often people mix up the two, where people say finding the shortest Traveling Salesman Tour is NP-complete, usually without getting into too much trouble.

Decision is always at least as easy as search: If you have a solution you know there is one. What about the other direction? We can't actually prove search is hard without separating P and NP, but we have our conjectures.

Sometimes both are easy. We can easily find the maximum weighted matching.

Sometimes decision is easy and search is supposedly hard: Composite Numbers. The search version is factoring.

Sometimes decision is trivial (i.e. they always exist) and search is still hard. Nash Equilibria. Ramsey Graphs.

Often we ask whether search reduces to decision? If you have some oracle (magic black box) that answered decision questions, can you solve the search problem efficiently? SAT has this property, as does Matching (for trivial reasons). Nash Equilibrium and Composite Numbers likely don't.

Graph Isomorphism does, i.e., given an oracle for graph isomorphism you can find the isomorphism (another good homework problem).

There's also an interesting non-adaptive version. Given a SAT formula can you find an assignment with questions to a SAT oracle that all have to be asked at the same time?

Here we get a probable yes. If the formula has one solution you can find it by asking for each bit of the solution. Randomly you can reduce SAT to several formulas, one of which is likely to have a single assignment that is also an assignment of the original formula. With standard hardness assumptions you can eliminate the randomness.

Is the same true for graph isomorphism? I think that's still open.

Sunday, January 06, 2019

When is a kilogram not a kilogram?

A long long time ago the standards for meter's, kilograms, etc was an actual physical object.

Those days are long gone of course. For example, the meter is defined is the length of the path traveled by light in 1/299,792,458 th of a second. Why such an odd number (can fractions be odd?)? Because they retrofitted it to what that the meter is.  Rather than go to France and compare my stick to the one under a glass case I can just measure the speed of light. Oh. That sounds hard!

It matters a bit since the weight of what was the standard kilogram did increase over time, though of course not by much. When did the measurements for stuff STOP being based on physical objects and was all done based on constants of the universe?

The answer surprised me:

On Nov 16, 2018 (yes, you read that light) they decided that by May 20, 2019, the Kilogram will be defined in terms of Plank's constant. I have not been able to find out how they will use Plank, maybe they don't know yet (they do and its known -- see the first comment) .With that, there are no more standards based on physical objects. Read about it here.

Why did it take so long? I honestly don't know and I am tossing that question out to my readers. You can leave serious or funny answers, and best if I can't tell which is which!




Wednesday, January 02, 2019

Today is Thirdsday! Enjoy it while you can!

Fellow Blogger James Propp has come up with a new Math holiday:

Thirsdsday!

The day is Jan 3 (1-3 in America, though 3-1 in ... Everywhere else?) but only when Jan 3 is a Thursday.

It is a day where we celebrate the magic of the number 1/3.

0) For other math days to celebrate see here

1/3) James Propp's blog about Thirdsday on Monday Dec 31. Really ???   : here

2/3) Evelyn Lamb blogged about Thirdsday on Tuesday Jan 1. Really ??? : here

3/3) Ben Orlin blogged about Thirsdsday on Wedensday Jan 2. Really??? here

(Added ON Thirdsday: Matt Foreman has a video about Thirdsday: here and a blog post here)

 How come I'm the only one blogged  about Thirdsday on Thursday Jan 3 ??? (Added later- not quite true anymore, Matt Foreman also waited until Thirdsday to post on Thirdsday).
I asked Jim Propp about this. He said that he want to help prepare teachers and other eduators for the excitment of Thirdsday! If they already know the wonders of 1/3 they can prepare and lecture on it! Kudos to him! I assume that Evelyn and Ben are similar! Kudos to them! And Ben blogged ON Thirdsday so Kudos to him!

2) Darling asked me `is it a real day like Pi-Day?'  Is Pi-Day real? Is any Holiday real? All holidays are made up until they are accepted and become real. The distinction between real holidays and  made up holidays  is ... nonexistent.  One can talk of accepted and not-accepted holidays.  How long did Pi-day take to be accepted? This is prob not a well defined question.

3) James Propp's and Evelyn Lamb's  blog has many math properties of 1/3.  One educational property: I think it is the first number that students see that is an infinite decimal. My favorite unbiased use of 1/3: The Cantor Set: Uncountable subset of [0,1] that has measure 0. Really!!! My favorite biased use: its important in Muffin Math. If m>s and you want to divide and distribute m muffins to s students, there is always a way to do this with smallest piece at least 1/3. (Usually you can do better but this is sometimes the best you can do.)

4) When will the next Thirdsday come?

2019: Jan 3 is a Thursday, so YES

2020: Jan 3 is a Friday, so NO

2021: Jan 3 is a Sunday (why no Saturday? Leap year. Great- it will come sooner!)  so NO

2022: Jan 3 is a Monday, so NO

2023: Jan 3 is a Tuesday  so NO

2024: Jan 3 is a Wednesday  so NO

2025: Jan 3 is a Friday. WHAT! Why no Thirdsday?  Darn leap year! So NO.

2026: Jan 3 is a Saturday, so NO

2027: Jan 3 is a Sunday so NO

2028: Jan 3 is a Monday so NO

2029: Jan 3 is a Wedensday (Why no Tuesday? Leap year), so NO

2030: Jan 3 is a Thursday (Leap Year helped!), so YES FINALLY!

(Exercise: find a formula: if 2019 was the first Thirdsday, find the year for TD(i), the ith Thirdsday.)

So enjoy Thirdsday in 2019 when spellcheck still flags it.

In 2030 it will be an accepted holiday and spellcheck will think it's fine.