Schedule
| Date | Lecture | Readings | Logistics |
|---|---|---|---|
| M 01/12 | Lecture #1: Welcome and Introduction to the Class [slides] | | |
| W 01/14 | Lecture #2: Introduction to Reinforcement Learning [slides] | | |
| F 01/16 | Lecture #3: Value-based Methods [slides] | | HW1 Out |
| M 01/19 | No Class, MLK Day | | |
| W 01/21 | Recitation #1: OpenAI Gym, PyTorch and DQN Details, & HW1 [slides] | | |
| F 01/23 | Recitation #2: Setup and Debugging OH (in class) [slides] | | |
| M 01/26 | Lecture #4: Value-based Methods (cont.), Evolutionary Methods for Policy Search [slides] | | |
| W 01/28 | Lecture #5: Policy Gradient Methods [slides] | | HW1 Due, HW2 Out |
| F 01/30 | Recitation #3: Policy Gradients & HW2 [slides] | | |
| M 02/02 | Lecture #6: How Far Can We Update a Policy? Step Size and Stability in On-Policy and Off-Policy Policy Gradients [slides] [slides 2] | | |
| W 02/04 | Lecture #7: Step Size and Stability in Policy Gradients (cont.) [slides] [slides 2] | | HW2 Due |
| F 02/06 | Recitation #4: Policy-based Methods & HW3 [slides] | | |
| M 02/09 | Lecture #8: Actor Critic with Pathwise Derivatives [slides] | | |
| T 02/10 | | | HW4 Out |
| W 02/11 | Lecture #9: Imitation Learning, Behavior Cloning [slides] | | |
| F 02/13 | Recitation #5: Midterm Review and HW4 [slides] | | |
| M 02/16 | Lecture #10: GAIL, Multi-goal RL and IL [slides] | | HW3 Due |
| W 02/18 | Lecture #11: Diffusion Models for Imitation Learning [slides] | | |
| F 02/20 | Midterm | | |
| M 02/23 | Lecture #12: Diffusion Models for Imitation Learning (cont.) [slides] | | |
| T 02/24 | | | HW4 Due |
| W 02/25 | Lecture #13: Learning and Search: MCTS, AlphaGo, AlphaZero [slides] | | |
| F 02/27 | Recitation #6: IL Diffusion Policies and HW5 [slides] | | |
| M 03/02 | Spring Break - No Classes | | |
| W 03/04 | Spring Break - No Classes | | |
| F 03/06 | Spring Break - No Classes | | |
| M 03/09 | Lecture #14: MBRL (cont.) [slides] | | |
| W 03/11 | Lecture #15: MBRL (cont.) [slides] | | HW5 Due, HW6 Out |
| F 03/13 | Recitation #7: TD-MPC / PETS & HW6 [slides] | | |
| M 03/16 | Lecture #16: Model-based Methods for Offline RL [slides] | | |
| W 03/18 | Lecture #17: Guided Diffusion [slides] | | |
| F 03/20 | Recitation #8: Midterm Solutions [slides] | | |
| M 03/23 | Lecture #18: Offline RL [slides] | | |
| W 03/25 | Lecture #19: Offline RL (cont.) [slides] | | HW6 Due, HW7 Out |
| F 03/27 | Recitation #9: Offline RL & HW7 [slides] | | |
| M 03/30 | Lecture #20: Intelligent Exploration [slides] | | |
| W 04/01 | Lecture #21: Intelligent Exploration [slides] | | |
| F 04/03 | Recitation #10: Exploration [slides] | | |
| M 04/06 | Lecture #22: Sim2Real Learning [slides] | | |
| W 04/08 | Lecture #23: Foundation Models for RL [slides] | | HW7 Due |
| F 04/10 | Recitation #11: Sim2Real & HW8 (Video Recitation) [slides] [video] | | |
| M 04/13 | Recitation #12: RL for Foundation Models [slides] | | |
| W 04/15 | Lecture #24: RL for Foundation Models (cont.) [slides] | | |
| F 04/17 | Recitation #13: RL with Foundation Models [slides] | | |
| M 04/20 | Lecture #25 (Aviral Kumar): Exploration, Extrapolation, and Chains of Thought [slides] | | |
| W 04/22 | Lecture #26: Training Diffusion Models with RL [slides] | | |
| Th 04/23 | | | HW8 Due |
| F 04/24 | Recitation #14: Generative Models for RL & Final Review [slides] | | |
| F 05/01 | Final Exam (5:30pm - 8:30pm) | | |