Date Lecture Readings Logistics
W 01/18 Lecture #1 :
Introduction to Reinforcement and Representation Learning
[ slides ]

F 01/20 Recitation #1:
Introduction to PyTorch and OpenAI Gym
[ slides ]

M 01/23 Lecture #2 :
Multi-armed Bandits
[ slides ]

W 01/25 Lecture #3 :
Gaussian Processes, Experiment Design
[ slides ]

HW1 out (tentative)

F 01/27 Recitation #2:
Deep Learning Architectures, MDPs, Bandits
[ slides ]

M 01/30 Lecture #4 :
Markov Decision Processes, Value Iteration, Policy Iteration
[ slides ]

W 02/01 Lecture #5 :
Markov Decision Processes, Value Iteration, Policy Iteration
[ slides ]

F 02/03 Recitation #3:
Iterative Learning, Monte Carlo
[ slides ]

M 02/06 Lecture #6 :
Monte Carlo Learning and Temporal Difference Learning, Planning, Monte Carlo Tree Search
[ slides | slides 2 ]

W 02/08 Lecture #7 :
Monte Carlo Learning and Temporal Difference Learning, Planning, Monte Carlo Tree Search
[ slides ]

F 02/10 Recitation #4:
Monte Carlo Learning (cont.), HW1 Q&A
[ slides ]

M 02/13 Lecture #8 :
Function approximation in prediction and control, Deep Q-learning
[ slides ]

HW1 due 11:59PM

W 02/15 Lecture #9 :
Policy gradients, REINFORCE, Actor-Critic methods
[ slides ]

HW2 out

F 02/17 Recitation #5:
Quiz 1 Review, MCTS & DQN Review
[ slides | slides 2 ]

M 02/20 Lecture #10 :
Policy gradients, REINFORCE, Actor-Critic methods (cont.)
[ slides ]

W 02/22 Lecture #11 :
Natural PG, PPO, TRPO
[ slides ]

F 02/24 Quiz 1

M 02/27 Lecture #12 :
No Class due to Illness
[ slides ]

W 03/01 Lecture #13 :
Re-parametrized policy gradients, deep deterministic policy gradients
[ slides ]

F 03/03 Lecture #14 :
Evolutionary Methods
[ slides ]

M 03/06 Spring Break - No Classes

W 03/08 Spring Break - No Classes

HW2 due 11:59PM

F 03/10 Spring Break - No Classes

M 03/13 Lecture #15 :
Imitation learning, behavior cloning
[ slides ]

W 03/15 Lecture #16 :
Adversarial imitation learning, Imitation learning for vision-based manipulation
[ slides | slides 2 ]

F 03/17 Recitation #6:
Recitation 7
[ slides ]

M 03/20 Lecture #17 :
Transporter networks for Robot manipulation (cont.), Model-based RL in low-dim state space
[ slides ]

HW3 out (tentative)

W 03/22 Lecture #18 :
MBRL (cont.), AlphaGo, AlphaGoZero
[ slides ]

F 03/24 Recitation #7:
Imitation Learning
[ slides ]

M 03/27 Lecture #19 :
MBRL (cont.), latent dynamics models
[ slides ]

W 03/29 Lecture #20 :
Dynamics learning with graph neural networks
[ slides ]

F 03/31 Quiz 2

M 04/03 Lecture #21 :
MuZero, Intelligent Exploration
[ slides ]

HW4 out (tentative), HW3 due 11:59PM

W 04/05 Lecture #22 :
Intelligent Exploration
[ slides ]

F 04/07 Recitation #8:
Recitation 9
[ slides ]

M 04/10 Lecture #23 :
Self-Supervised Visual Learning
[ slides ]

W 04/12 Lecture #24 :
Sim2Real Transfer
[ slides ]

F 04/14 Spring Carnival - No Classes

M 04/17 Lecture #25 :
Visual Imitation Learning
[ slides ]

HW5 out (tentative), HW4 due 11:59PM

W 04/19 Lecture #26 :
Offline Reinforcement Learning
[ slides ]

F 04/21 Recitation #9:
Recitation 10
[ slides ]

M 04/24 Lecture #27 (Katerina):
Language-guided Control
[ slides ]

W 04/26 Lecture #28 :
Diffusion models for action prediction and planning
[ slides ]

F 04/28 Recitation #10:
Recitation 11
[ slides ]

M 05/01 Quiz 3

HW5 due 11:59PM