David McAllester
Winter 2018
- Introduction and Historical Notes
- Multi-Layer Perceptrons (MLPs) and Stochastic Gradient Descent (SGD)
- Feed-Forward Computation Graphs, Backpropagation, and the Educational Framework (EDF)
- Convolutional Neural Networks (CNNs)
- Invariant Theory
- Controlling Gradients: Initialization, Batch Normalization, ResNets, and Gated RNNs
- Language Modeling and Machine Translation
- First-Order Stochastic Gradient Descent (SGD)
- Gradients as Dual Vectors, Hessian-Vector Products, and Information Geometry
- Regularization
- Interpretation
- Information Theory
- Fully Observed Graphical Models I: Exponential Softmax, Sufficient Statistics, and Belief Propagation
- Fully Observed Graphical Models II: Approximate SGD Algorithms
- Partially Observed Graphical Models: Expectation Maximization (EM), Expected Gradient (EG), and CTC
- Variational Autoencoders (VAEs)
- Rate-Distortion Autoencoders
- Generative Adversarial Networks (GANs)
- Reinforcement Learning (RL)
- AlphaZero
- The Quest for Artificial General Intelligence (AGI)