TTIC 31120: Computational and Statistical Learning Theory
This is a webpage for the Winter 2011 course "Computational and Statistical Learning Theory", taught at TTIC, and also open to all University of Chicago students.
Lectures: Monday and Wednesday 3-4:30pm at TTIC 530 (located at 6045
S. Kenwood Ave, fifth floor).
Instructor: Nati Srebro.
TA: Karthik Sridharan.
Course Description
We will discuss classic results and recent advances in statistical
learning theory (mostly under the agnostic PAC model), touch on
computational learning theory, and also explore the relationship with
stochastic optimization, the oracle model of optimization, and online
regret analysis. Our emphasis will be on developing a rigorous
quantitative understanding of machine learning and acquiring techniques
for analyzing and proving performance guarantees for learning methods.
Prerequisites
 Mathematical maturity, as obtained, e.g., in a rigorous analysis course.
 Discrete Math (specifically combinatorics and asymptotic notation)
 Probability Theory
 Introduction to Machine Learning
 Algorithms; Basic Complexity Theory (NP-Hardness)
Familiarity with Convex Optimization, Computational Complexity and
background in Statistics can be helpful, but is not required.
Specific Topics:
We will try to touch on the following topics:
 The Statistical Model (Learning Based on an IID Sample):
 The PAC (Probably Approximately Correct) and Agnostic PAC models.
 Stochastic Optimization
 Hoeffding and Bernstein Inequalities
 Cardinality Bounds
 Description Length Bounds
 PAC-Bayes
 Compression Bounds
 The Growth Function and VC Dimension
 VC Subgraph Dimension and Fat Shattering Dimension
 Tight Characterization of Learning in terms of the VC and Fat Shattering Dimensions
 Covering Numbers
 Rademacher Averages, including Local Rademacher Analysis
 Uniform Learning and No-Free-Lunch Theorems
 Online Learning, Online Optimization and Online Regret
 The Perceptron Rule and Online Gradient Descent
 Experts and the Winnow Rule
 Bregman Divergence and Online Mirror Descent
 Online to Batch Conversion
 Computational Lower Bounds:
 Computational Hardness of Proper Learning
 Cryptographic Hardness of Learning
 Lower Bounds in the Oracle Model of Optimization
 Additional Topics
 Stability Based Analysis
 Boosting: Weak Learning and the Margin Interpretation of Boosting.
 Comparison of assumptions and results with typical statistical
study of regression.
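As a concrete illustration of the online-learning topics above, here is a minimal sketch of the classic Perceptron rule on a toy stream (the function name and data are ours, not part of the course materials):

```python
import numpy as np

def perceptron(stream):
    """Run the online Perceptron rule over a stream of (x, y) pairs,
    with labels y in {-1, +1}. Returns the final weights and mistake count."""
    w, mistakes = None, 0
    for x, y in stream:
        if w is None:
            w = np.zeros_like(x, dtype=float)
        # Predict with the current linear separator; sign(0) counts as -1.
        y_hat = 1 if w @ x > 0 else -1
        if y_hat != y:
            # Mistake-driven update: move w toward the misclassified example.
            w += y * x
            mistakes += 1
    return w, mistakes

# Toy linearly separable stream: the label is the sign of coordinate 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
X[:, 0] += np.sign(X[:, 0])  # enforce a margin on coordinate 0
ys = np.sign(X[:, 0]).astype(int)
w, m = perceptron(zip(X, ys))
```

On margin-separated data like this, the classical Perceptron analysis bounds the number of mistakes independently of the stream length, which is the kind of regret-style guarantee studied in the online-learning lectures.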
Requirements and Grading:
 Required Reading: Some material will be covered through assigned
reading, rather than in class.
 Scribing: students are required to scribe (prepare lecture notes)
one lecture. This is required in order to receive a
grade in the class, and the quality of the notes will be
taken into consideration in determining the grade.
 Exercises and Problems: every 1-2 weeks, three types of problems will be
posted:
 Exercises: These are relatively easy review exercises, often
covering required reading not covered in class. A random subset of
these will be graded.
 Challenge Problems: These are harder problems that might be a challenge to
solve. Successfully completing at least 1-2 of
these is required for receiving an "A" in the
class. Students will also be asked to present
and discuss (with the instructor) the solutions
to 1-2 of these.
 Research Problems: I don't know how to solve these (though I might
have an idea). Solving one of these will
likely earn an "A+" in the class and might
lead to a publication.
 There will be a short final inclass quiz on basic concepts
covered in the class.
In order to receive a "B", a student is expected to complete
the exercises, do a good job scribing a lecture, and
demonstrate a reasonable understanding on the quiz.
In order to receive an "A", a student is expected to complete the
exercises, do a very good job scribing a lecture,
successfully answer at least 1-2 Challenge Problems, and
present them to the instructor satisfactorily.
Lectures and Required Reading:
 Lecture 1: Monday January 3rd

 Course Overview
 Basic Setup: Generalization Error, Learning Rule, Learning Algorithm
 Analysis of the "Minimum Consistent Rectangle" learning rule in
the realizable case.
 No Free Lunch Theorem.
Required Reading: Section 1 of: S. Boucheron, O. Bousquet, and G. Lugosi, Concentration Inequalities, Advanced Lectures in Machine Learning, Springer, pp. 208-240, 2004. Alternative reading: Lecture notes
Homework 1 out (due 5th Jan, 2011 (preferable), ultimate deadline 10th Jan, 2011)
 Lecture 2: Wednesday January 5th

 Empirical Risk Minimization
 Finite cardinality generalization bound and learning guarantee.
 The growth function.
 Lecture 4: Monday January 10th
 VCdimension and the Shatter Lemma.
 Lecture 5: Wednesday January 12th
 PAC Learnability and the VC dimension.
 Description Length based learning rules and guarantees; Structural Risk Minimization.
 No Lecture on Monday January 17th (MLK Day)
 Lecture 6: Wednesday January 19th, given by Karthik
 PACBayes bound.
 Compression scheme bounds.
Homework 2 out (due January 31st)
 Lecture 7: Monday January 24th
 Universal learning.
 Computational considerations: efficient PAC learning and proper PAC learning.
 Lecture 8: Wednesday January 26th
 Efficient proper PAC learning implies that the consistency problem is in RP.
 Hardness of efficiently PAC learning 3-term DNF formulas, even though they can be learned improperly.
 Lecture 9: Monday January 31st
 Inherent hardness of efficient PAC learning.
 Existence of hypothesis classes that are easy to learn statistically but hard to learn computationally, based on counting arguments.
 Overview of cryptographic hardness of efficient PAC learning.
 Hardness of efficient PAC learning vs efficient agnostic PAC learning.
Homework 3 out (due 16th Feb, 2011)
Homework 4 out (due 2nd March, 2011)
Homework 5 out (due 16th March, 2011)
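Lecture 2's finite-cardinality generalization bound rests on Hoeffding's inequality. The following numerical sketch (ours, not course material) checks the two-sided inequality P(|mean - p| >= eps) <= 2 exp(-2 n eps^2) for Bernoulli losses:

```python
import numpy as np

# Empirical check of Hoeffding's inequality for [0,1]-valued variables:
# P(|empirical mean - p| >= eps) <= 2 * exp(-2 * n * eps**2).
rng = np.random.default_rng(0)
n, eps, p, trials = 100, 0.1, 0.5, 20000
samples = rng.random((trials, n)) < p          # Bernoulli(p) draws
deviations = np.abs(samples.mean(axis=1) - p)  # |empirical mean - p| per trial
observed = (deviations >= eps).mean()          # empirical deviation probability
bound = 2 * np.exp(-2 * n * eps**2)            # Hoeffding bound, 2e^{-2} here
```

Applying this deviation bound to each of |H| hypotheses and taking a union bound gives the finite-cardinality guarantee from Lecture 2, with the familiar log|H| dependence on the class size.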
Assignments:
 Homework 1 (due 5th Jan, 2011 (preferable), ultimate deadline 10th Jan, 2011) Note: updated January 4th afternoon (solutions to the earlier posted version will also be accepted)
Last modified: Thu Jan 27 15:09:06 Central Standard Time 2011