Considering the topic of yesterday's post, if there were a list of favorite theorems in computational learning theory, a field which makes only brief appearances here on the weblog despite its many connections to computational complexity, Jeff Jackson's algorithm for learning DNF formulas would certainly be on it.

A DNF formula is a Boolean formula written as an OR of ANDs (e.g. x_1 and x_2 OR x_3 and x_5). The size of a DNF formula is equal to the number of terms or ANDs. The problem of PAC learning an unknown polynomial-size (in n, the number of variables) DNF formula with respect to an arbitrary distribution remains one of the most notorious open problems in the field (for background on PAC learning see this post or Kearns and Vazirani's excellent book An Introduction to Computational Learning Theory).

In fact, even if we restrict the underlying distribution on examples to be the uniform distribution, the fastest algorithm for learning DNF formulas runs in quasi-polynomial time (a result due to K. Verbeurgt-- the main idea being that only terms of logarithimic length have a chance at being satisfied, so longer terms can be ignored).

If, however, we allow the learner to make *queries* to the unknown DNF formula, i.e. if the learner can choose any input x and ask for the value of the DNF formula evaluated on x, then the learner can succeed in polynomial-time.

The solution, due to Jeff Jackson in 1994, shows how to learn polynomial-size DNF formulas with respect to the uniform distribution in polynomial-time (again assuming the learner has query access to the unknown DNF). His algorithm, which he has called the Harmonic Sieve due to its use of Fourier analysis, builds on work due to Blum, Furst, Jackson, Kearns, Mansour, and Rudich (``Weakly Learning DNF and Characterizing Statistical Query Learning Using Fourier Analysis'') which showed that for any DNF formula with s terms, there exists a parity function which agrees with the DNF formula on roughly a 1/2 +1/s fraction of inputs.

The next step of the algorithm involves a novel application of Boosting algorithms (see this post for more on Boosting) for combining these parity functions to obtain an accurate hypothesis. The output of the Harmonic Sieve is not a DNF formula but a threshold of parity functions.

The Harmonic Sieve is one of the rare examples in computational learning theory of a polynomial-time algorithm for an expressive concept class. It is natural to ask whether the queries are essential for the algorithm. Unfortunately it seems like the answer is yes-- we do not know how to learn decision trees or even juntas (both strictly weaker concept classes than DNF formulas) in polynomial-time with respect to the uniform distribution unless the learner has query access to the unknown function. Removing the dependence on queries would be a real breakthrough.

By the way, the interface on blogger.com is worse than I ever could have imagined. Apologies in advance for formatting errors.

## No comments:

## Post a Comment