COLT Course, Master in Computing, Fall 2010: November 2010

Tuesday, November 30, 2010

2010-11-30 Learning using the Fourier spectrum

Today we presented the basics of Fourier analysis of boolean functions, and showed an algorithm that learns functions with (epsilon,d)-degree (= whose Fourier coefficients for parities of more than d variables have low squared sum).

This technique (as well as the application to constant-depth circuits that
we will see next day) is from:
- Constant-depth circuits, Fourier transform and learnability, N. Linial Y. Mansour and N. Nisan, Jour. Assoc. Comput. Mach., 40 (1993) 607 - 620.

The exposition in this and next lecture largely follows:

- Learning Boolean Functions via the Fourier Transform, by Y. Mansour (1994).

Exercise 21: Prove Parseval's identity with the hint given in class.

Exercise 22: Prove that the Fourier transform of the Fourier transform of f is f.

Exercise 23: Prove Claim 1 in the proof of correctness of the (epsilon,d)-degree learning algorithm.

NO CLASS ON FRIDAY DEC. 3RD AND TUESDAY DEC. 7TH

We will resume our classes on Friday, Dec. 10th.

Friday, November 26, 2010

2010-11-26 Reductions

We covered reductions in the PAC prediction model. The lecture was based on chapter 7 of Kearns and Vazirani's book, but you can find the original definitions in Pitt and Warmuth's paper Prediction-preserving reducibility.

Then, we generalized the notion of reduction for algorithms that are allowed to ask MQs. This can be found in Angluin and Kharitonov's paper When won't membership queries help?

Exercise 19: Show that CNF pwm-reduces to 2-Horn CNF, where a 2-Horn CNF formula is a CNF formula for which every clause contains at most 2 positive literals. Hint: use a reduction similar to the reduction from general to monotone DNF seen in class, and add trailing clauses to guarantee the answers to the membership query oracle.

Exercise 20: Define a suitable notion of reducibility for the model of learning from membership and equivalence queries, and prove the corresponding theorem: if C eq+mq reduces to C' and C' is eq+mq learnable, then so is C.

Tuesday, November 23, 2010

2010-11-23 Representation-independent hardness, PAC prediction reductions

In the first half of the class, we showed that, assuming that the RSA public-key cryptosystem is hard to break, boolean formulas and log-depth circuits are not PAC predictable. The lecture was based on Kearns and Valiant's paper Cryptographic Limitations on Learning Boolean Formulae and Finite Automata.

In the second half of the class, we saw an example of a PAC reduction, namely, we showed how DNF formulas PAC-reduce to monotone DNF formulas. Formal definitions and proofs are left for the next lecture.

Friday, November 19, 2010

2010-11-19 End of learning WA, Hardness of PAC learning

In the first half of the lecture, the algorithm and proof for learning Weighted Automata was presented. We also showed that this algorithm implies learnabilty of u unambiguous automata, probabilistic automata (where a MQ returns the probability that the target accepts x), decision trees, and polynomials over finite fields.

Exercise 16 (extra credit): Show how to use the WA algorithm to learn polynomials over the real numbers; should be polynomial in the number of variables, number of terms, and the degree of the polynomial.

Exercise 17: Prove that satisfy-k DNF is learnable from EQ+MQ for every fixed k. Hint: Construct a WA equivalent to any such formula. You can describe a direct construction or, alternatively, you can show that if f and g have WA of size s, then f $\lor$ g, f $\land$ g have WA of size s^2 approx, and deduce the result from here.

In the second half of the lecture, we started with negative results for PAC learning. In particular, we proved that proper PAC-learning 3-term-DNF implies RP=NP. This result can be found in chapter 1 of the book by Kearns and Vazirani An Introduction to Computational Learning Theory.

Exercise 18: Show that a graph G is 3-colorable iff there is a consistent 3-term DNF for the sample S_G as defined in class.

Wednesday, November 17, 2010

2010-11-16 End of learning DFA & learning WA

We finished the presentation of the DFA learning algorithm, showing how to process a counterexample and discussing the query and time complexity of the algorithm.

We showed how learnability of DFA implies learnability of O(log n)-term DNF.
The key is the following exercise:

Exercise 14: Prove that if a boolean function on n variables is computed
by a T-term DNF, then it is computed by a DFA of size polynomial in 2^T and n.
Deduce that O(log n)-term DNF are learnable in polynomial time in n from MQ+EQ.

We then presented the model of Weighted Automata (WA) over the semiring and started
explaining how the algorithm for DFA learning can be extended to learning WA over finite fields. This result was first proved by Bergadano&Varricchio (1994) and shown to have many appications by Beimel et al. (1996-conference and 2000-journal), who used the name "multiplicity automata" instead of "weighted automata". We left the following lemma as an exercise:

Exercise 15: Show that the the rank of the Hankel matrix of f is at most the number of distinct states in the smallest WA for f (w.r.t. some field).

2010-11-09 Learning DFA

We presented the problem of learning Deterministic Finite Automata from Queries. Learnability of DFA in the query model was first proved by Angluin (1987), and more efficient variations of Angluin's algorithm were proposed by Rivest&Schapire (1993) and Kearns&Vazirani (1994). Our presentation is slightly different: hopefully conceptually simpler, and easier to extend to the problem of learning Weighted Automata that we will discuss in the next session.

No exercises proposed today.

Friday, November 5, 2010

Next two sessions & cancelling Nov/12th

There will be no class on Nov/12 as both Marta and Ricard will be at a workshop.

On Nov/9th and Nov/16th we will discuss learning of finite automata. We will
roughly follow this writeup for last year's course.

2010-11-05 Learning Horn Formulas, continued

In this class we reviewed some important aspects of Horn formulas, introduced the algorithm by Angluin, Frazier and Pitt that learns Horn formulas, and proved its correctness. The original article can be found here.

Exercise 13: Simulate the AFP algorithm that learns Horn formulas,
assuming n=5, and the counterexamples you receive are:
01011,11100,00001,00011,01011,00011,00110,00011,01110,10010,00011,
01110,01010,00011,01110,01111,00011,01110,01111,00011,01110,11111.
For answering membership queries, assume that 00000, 01000, 00100,
00010, 10000 are positive, and 00001, 01010, 01100 are negative.

Tuesday, November 2, 2010

2010-11-02 Query Learning

We have introduced the model of exact learning from equivalence and membership queries, and the following algorithms:

learning monotone conjunctions with Equivalence Queries
learning monotone conjunctions with Membership Queries
learning monotone DNFs with EQ + MQs

Finally, we have introduced the class of Horn CNFs, together with some observations on its behavior. The algorithm and its proof are left for next class.

The algorithm for monotone DNFs has been explained in several classical papers, for example, Valiant's A theory of the learnable or Angluin's Queries and concept learning. The algorithm for Horn CNF is a classic result by Angluin et al. Learning conjunctions of Horn clauses.

Exercise 11: Show that learnability in the query learning model using EQs implies learnability in the PAC learning model. Hint: simulate the EQ oracle with a sample of size $\frac{1}{\epsilon} \ln \frac{1}{\delta}$ and use the union bound appropriately.

Exercise 12: Complete the proof of learnability of monotone DNF formulas from Membership and Equivalence Queries.