Monday, April 17, 2017

Understanding Machine Learning

Today Georgia Tech had the launch event for our new Machine Learning Center. A panel discussion talked about different challenges in machine learning across the whole university but one common theme emerged: Many machine learning algorithms seem to work very well but we don't know why. If you look at a neural net (basically a weighted circuit of threshold gates) trained for say voice recognition, it's very hard to understand why it makes the choices it makes. Obfuscation at its finest.

Why should we care? A few reasons:

  • Trust: How do we know that the neural net is acting correctly? Beyond checking input/output pairs we can't do any other analysis. Different applications have a different level of trust. It's okay if Netflix makes a bad movie recommendation, but if a self-driving car makes a mistake...
  • Fairness: Many examples abound of algorithms trained on data will learn intended or unintended biases in that data. If you don't understand the program how do figure out the biases?
  • Security: If you use machine learning to monitor systems for security, you won't know what exploits still might exist, especially if your adversary is being adaptive. If you can understand the code you could spot and fix security leaks. Of course if the adversary had the code, they might find exploits. 
  • Cause and Effect: Right now at best you can check that a machine learning algorithm only correlates with the kind of output you desire. Understanding the code might help us understan the causality in the data, leading to better science and medicine. 
What if P = NP? Would that help. Actually it would makes things worse. If you had a quick algorithm for NP-complete problems, you could use it to find the smallest possible circuit for say matching or traveling salesman but you would have no clue why that circuit works. 

Sometimes I feel we put to much pressure on the machines. When we deal with humans, for example when we hire people, we have to trust them, assume they are fair, play by the rules without at all understanding their internal thinking mechanisms. And we're a long way from figuring out cause and effect in people.


  1. If P=NP you could also find the shortest proof in your favorite formal system that the smallest possible circuit does what you wanted it to do, as well as any other claim you are wondering that may be true about the circuit. That proof might not be comprehensible to you, but it could be written in a format where proof assistant software such as HOL or Coq could parse it and convince you it is correct. So if P=NP (with feasible low constants) I think that would definitely help.

    1. no, correctness proofs can still be unfeasibily long.

    2. Obviously correctness proofs can be unfeasibly long. Those would also be among those explanations of program behavior that we cannot understand. I was addressing Lance's claim that:

      "What if P = NP? Would that help. Actually it would makes things worse. If you had a quick algorithm for NP-complete problems, you could use it to find the smallest possible circuit for say matching or traveling salesman but you would have no clue why that circuit works."

      My point was simple: if short understandable explanations of a circuit's behavior exist at all, then we could find those explanations under P=NP (with the usual caveats of low-order constants). So P=NP would arguably help, not make things worse.

  2. Lance concludes "… we're a long way from figuring out cause and effect in people."
    The NIH web site "" provides a concrete overview of our present understanding of "cause and effect in people".

    For example, a search of clinical trials concerning "personality" and "therapy", restricted to clinical trials presently recruiting volunteers, yields a peer-reviewed listing — a listing that even can be conveniently alphabetized — of conditions in respect to which an understanding of "cause and effect in people" is objectively lacking.

    Q  Are we to infer that cognitive scientists, complexity theorists, and psychiatric physicians soon will be reading each other's literature?

    A  They already are! Not often in STEAM-history, does cross-pollination of trans-disciplinary ideas occur as vigorously, fertilely, dangerously, even distressingly, and none-the-less excitingly as at present.

    Q  Does this mean that programmers increasingly are consciously acting as (informatic) therapists to their neural networks, and concomitantly, psychiatrists increasingly are consciously acting as (epigenetic/connectomic) programmers to their human clients?

    A  For more-and-more STEAM-workers — young ones especially — the plain-and-simple answer is "yes (obviously)".

    Whether welcome or not (opinions differ, obviously), isn't this transformative unification empirically evident (nowadays) to pretty much everyone?

  3. To your list of reasons to care, I would add another: that the problem of understanding which learning tasks are feasible for these methods is a fascinating and important one from a purely intellectual point of view.

  4. Given a number x and a set S of n positive integers, MINIMUM is the problem of deciding whether x is the minimum of S. We can easily obtain an upper bound of n comparisons: find the minimum in the set and check whether the result is equal to x. Is this the best we can do? Yes, since we can obtain a lower bound of (n - 1) comparisons for the problem of determining the minimum and another obligatory comparison for checking whether that minimum is equal to x. A representation of a set S with n positive integers is a Boolean circuit C, such that C accepts the binary representation of a bit integer i if and only if i is in S. Given a positive integer x and a Boolean circuit C, we define SUCCINCT-MINIMUM as the problem of deciding whether x is the minimum bit integer which accepts C as input. For certain kind of SUCCINCT-MINIMUM instances, the input (x, C) is exponentially more succinct than the cardinality of the set S that represents C. Since we prove that SUCCINCT-MINIMUM is at least as hard as MINIMUM in order to the cardinality of S, then we could not decide every instance of SUCCINCT-MINIMUM in polynomial time. If some instance (x, C) is not in SUCCINCT-MINIMUM, then it would exist a positive integer y such that y < x and C accepts the bit integer y. Since we can evaluate whether C accepts the bit integer y in polynomial time and we have that y is polynomially bounded by x, then we can confirm SUCCINCT-MINIMUM is in coNP. If any single coNP problem cannot be solved in polynomial time, then P is not equal to coNP. Certainly, P = NP implies P = coNP because P is closed under complement, and therefore, we can conclude P is not equal to NP.

    You could read the details in:

  5. I will remove the link and use instead:

  6. A video in youtube with the explanation of my P versus NP solution!!!