Date Lecture Readings Logistics
M 08/26 Lecture #1 :
Introduction to Reinforcement and Representation Learning
[ slides ]

W 08/28 Lecture #2 :
Multi-armed Bandits
[ slides ]

F 08/30 Recitation #1:
Neural Nets, TensorFlow & Keras, OpenAI Gym, Bandits
[ slides ]

M 09/02 No Class, Labor Day

W 09/04 Lecture #3 :
Markov Decision Processes, Value Iteration, Policy Iteration
[ slides ]

HW1 out (tentative)

F 09/06 Recitation #2:
Bandits, MDPs & HW1
[ slides ]

M 09/09 Lecture #4 :
Markov Decision Processes, Value Iteration, Policy Iteration (cont.)
[ slides ]

W 09/11 Lecture #5 :
Monte Carlo Learning and Temporal Difference Learning
[ slides ]

F 09/13 No Recitation

M 09/16 Lecture #6 :
Monte Carlo Learning and Temporal Difference Learning (cont.)
[ slides ]

W 09/18 Lecture #7 :
Monte Carlo Tree Search, Function approximation in prediction and control, Deep Q-learning
[ slides | slides 2 ]

F 09/20 Recitation #3:
MCTS, TD Learning, Deep Q Learning, HW2 (DQN)
[ slides ]

M 09/23 Lecture #8 :
Deep Q-learning (cont.)
[ slides ]

HW1 due 11:59PM, HW2 out (tentative)

W 09/25 Lecture #9 :
Policy gradients
[ slides ]

F 09/27 Recitation #4:
Quiz 1 Review
[ slides ]

M 09/30 Lecture #10 :
Natural PG, PPO, TRPO
[ slides | slides 2 ]

W 10/02 Lecture #11 :
Evolutionary methods for policy search
[ slides ]

F 10/04 Quiz 1

M 10/07 Lecture #12 :
Deterministic Policy gradient, re-parametrized PG
[ slides | slides 2 ]

W 10/09 Lecture #13 :
Generative Adversarial Imitation Learning
[ slides ]

HW2 due 11:59PM

F 10/11 Recitation #5:
Solutions to Quiz 1
[ slides ]

M 10/14 Fall Break - No Classes

W 10/16 Fall Break - No Classes

F 10/18 Fall Break - No Classes

M 10/21 Lecture #14 :
Multi-goal RL and Imitation learning, Visual Imitation Learning
[ slides | slides 2 ]

HW3 out (tentative)

W 10/23 Lecture #15 :
Imitation Learning with Diffusion Models
[ slides ]

F 10/25 Recitation #6:
Diffusion policies (cont.), HW3 Recitation
[ slides | slides 2 ]

M 10/28 Lecture #16 :
AlphaGo, AlphaGoZero, AlphaZero
[ slides ]

W 10/30 Recitation #7:
Class Cancelled
[ slides ]

F 11/01 Lecture #17 :
MBRL in explicit and observable low-dimensional state spaces
[ slides ]

M 11/04 Lecture #18 :
MBRL from Sensory Input, Planning in Sensory Space, Planning in a Latent State Space
[ slides ]

HW4 out (tentative)

W 11/06 Lecture #19 :
Quiz 2 and HW4 Review Recitation
[ slides ]

F 11/08 Quiz 2

M 11/11 Lecture #20 :
MBRL with Multimodal Dynamics
[ slides ]

W 11/13 Lecture #21 :
Intelligent Exploration
[ slides ]

F 11/15 Recitation #8:
Solutions to Quiz 2
[ slides ]

M 11/18 Lecture #22 :
Intelligent Exploration (cont.)
[ slides ]

W 11/20 Lecture #23 :
Offline RL (Aviral Kumar)
[ slides ]

F 11/22 Recitation #9:
Homework 5
[ slides ]

M 11/25 Lecture #24 :
Sim2Real Transfer
[ slides | slides 2 ]

HW5 out (tentative), HW4 due 11:59PM

W 11/27 Lecture #25 :
No Class - Thanksgiving Break
[ slides ]

F 11/29 Recitation #10:
No Recitation - Thanksgiving Break
[ slides ]

M 12/02 Lecture #26 :
Building Generalist Robots with Agility via Learning and Control: Humanoids and Beyond (Guanya Shi)
[ slides ]

W 12/04 Lecture #27 :
Language and Robot Control
[ slides ]

HW5 due 11:59PM

F 12/06 Recitation #11:
Quiz 3 Review
[ slides ]

T 12/10 Quiz 3, 1:00pm - 4:00pm, PH 100