Schedule
Date | Lecture | Readings | Logistics
---|---|---|---
M 08/26 | Lecture #1: Introduction to Reinforcement and Representation Learning [ slides ] | | |
W 08/28 | Lecture #2: Multi-armed Bandits [ slides ] | | |
F 08/30 | Recitation #1: Neural Nets, TensorFlow & Keras, OpenAI Gym, Bandits [ slides ] | | |
M 09/02 | No Class, Labor Day | | |
W 09/04 | Lecture #3: Markov Decision Processes, Value Iteration, Policy Iteration [ slides ] | | HW1 out (tentative) |
F 09/06 | Recitation #2: Bandits, MDPs & HW1 [ slides ] | | |
M 09/09 | Lecture #4: Markov Decision Processes, Value Iteration, Policy Iteration (cont.) [ slides ] | | |
W 09/11 | Lecture #5: Monte Carlo Learning and Temporal Difference Learning [ slides ] | | |
F 09/13 | No Recitation | | |
M 09/16 | Lecture #6: Monte Carlo Learning and Temporal Difference Learning (cont.) [ slides ] | | |
W 09/18 | Lecture #7: Monte Carlo Tree Search, Function approximation in prediction and control, Deep Q-learning [ slides \| slides 2 ] | | |
F 09/20 | Recitation #3: MCTS, TD Learning, Deep Q Learning, HW2 (DQN) [ slides ] | | |
M 09/23 | Lecture #8: Deep Q-learning (cont.) [ slides ] | | HW1 due 11:59pm, HW2 out (tentative) |
W 09/25 | Lecture #9: Policy gradients [ slides ] | | |
F 09/27 | Recitation #4: Quiz 1 Review [ slides ] | | |
M 09/30 | Lecture #10: Natural PG, PPO, TRPO [ slides \| slides 2 ] | | |
W 10/02 | Lecture #11: Evolutionary methods for policy search [ slides ] | | |
F 10/04 | Quiz 1 | | |
M 10/07 | Lecture #12: Deterministic Policy gradient, re-parametrized PG [ slides \| slides 2 ] | | |
W 10/09 | Lecture #13: Generative Adversarial Imitation Learning [ slides ] | | HW2 due 11:59pm |
F 10/11 | Recitation #5: Solutions to Quiz 1 [ slides ] | | |
M 10/14 | Fall Break - No Classes | | |
W 10/16 | Fall Break - No Classes | | |
F 10/18 | Fall Break - No Classes | | |
M 10/21 | Lecture #14: Multi-goal RL and Imitation learning, Visual Imitation Learning [ slides \| slides 2 ] | | HW3 out (tentative) |
W 10/23 | Lecture #15: Imitation Learning with Diffusion Models [ slides ] | | |
F 10/25 | Recitation #6: Diffusion policies (cont.), HW3 Recitation [ slides \| slides 2 ] | | |
M 10/28 | Lecture #16: AlphaGo, AlphaGoZero, AlphaZero [ slides ] | | |
W 10/30 | Recitation #7: Class Cancelled [ slides ] | | |
F 11/01 | Lecture #17: MBRL in explicit and observable low-dimensional state spaces [ slides ] | | |
M 11/04 | Lecture #18: MBRL from Sensory Input, Planning in Sensory Space, Planning in a Latent State Space [ slides ] | | HW4 out (tentative) |
W 11/06 | Lecture #19: Quiz 2 and HW4 Review Recitation [ slides ] | | |
F 11/08 | Quiz 2 | | |
M 11/11 | Lecture #20: MBRL with Multimodal Dynamics [ slides ] | | |
W 11/13 | Lecture #21: Intelligent Exploration [ slides ] | | |
F 11/15 | Recitation #8: Solutions to Quiz 2 [ slides ] | | |
M 11/18 | Lecture #22: Intelligent Exploration (cont.) [ slides ] | | |
W 11/20 | Lecture #23: Offline RL (Aviral Kumar) [ slides ] | | |
F 11/22 | Recitation #9: Homework 5 [ slides ] | | |
M 11/25 | Lecture #24: Sim2Real Transfer [ slides \| slides 2 ] | | HW5 out (tentative), HW4 due 11:59pm |
W 11/27 | Lecture #25: No Class - Thanksgiving Break [ slides ] | | |
F 11/29 | Recitation #10: No Recitation - Thanksgiving Break [ slides ] | | |
M 12/02 | Lecture #26: Building Generalist Robots with Agility via Learning and Control: Humanoids and Beyond (Guanya Shi) [ slides ] | | |
W 12/04 | Lecture #27: Language and Robot Control [ slides ] | | HW5 due 11:59pm |
F 12/06 | Recitation #11: Quiz 3 Review [ slides ] | | |
T 12/10 | Quiz 3, 1:00pm - 4:00pm, PH 100 | | |