COLT Course, Master in Computing, Fall 2010: December 2010

Tuesday, December 21, 2010

2010-12-21 Multi-Armed Bandits

In this last lecture of the course, we have covered the algorithm Exp3 for the multi-armed bandit problem. The result can be found in the original paper The Nonstochastic Multiarmed Bandit Problem.

Wednesday, December 15, 2010

Paper list for presentations -- send us your choice by Monday Dec. 20th!

Here is our suggestion of papers for presentation. Chose your own if you prefer, we will look at it and accept it if it is up to standards.

Tuesday, December 14, 2010

Today we saw how to obtain strong PAC learning algorithms from weak ones. We showed how to boost the accuracy using the Adaboost algorithm.

The first boosting algorithm is due to Rob Schapire and is described in his paper The Strength of Weak Learnability. Freund followed this work with a more practical boosting algorithm described in Boosting a Weak Learning Algorithm by Majority. Later on they combined their ideas into Adaboost, which was first described in the article A decision-theoretic generalization of on-line learning and an application to boosting (Freund and Schapire received the Gödel Prize for this work in 2003). There are many many *many* papers on boosting from the machine learning and statistics communities; more literature can be found in the page http://www.boosting.org.

Exercise 26: Re-do the proof for AdaBoost seen in class, completing all the details and the holes that were left out.

Friday, December 10, 2010

2010-12-10 Learning using the Fourier Spectrum (continued) and Boosting

In the first part of the class, we have stated without proof that (AND,OR,NOT)-circuits of size s and depth d have "low" Fourier degree, from which it follows that they can be learned in time n^((log s)^d); this is due to Linial-Mansour-Nisan. We have then shown how to use the low-degree algorithm to learn decision lists and decision trees, and mentioned the Kushilevitz-Mansour algorithm that uses Membership Queries and can learn decision trees in polynomial time.

In the second part of the class we saw two variations on the PAC learning definition, one that fixes the confidence parameter $\delta = 1/2$, and one that fixes the accuracy parameter $\epsilon = 1/2$. We showed that the first variation is equivalent to the original one.

Exercise 24: Give depth-d circuits of size about 2^{n^{O(1/d)}} for parity of n variables. Hint: use divide-and-conquer.

Exercise 25: Use the [LMN] result on the Fourier degree of AC^0 to show that
depth-d circuits for parity of n variables must have size about 2^{n^\Omega(1/(d-1))}.
Of course, the more precise you can be in your bounds, the better solution.

COLT Course, Master in Computing, Fall 2010