Schedule
Date | Lecture | Readings | Logistics | |
---|---|---|---|---|
M 08/25 | Lecture #1: Introduction to Reinforcement and Representation Learning [ slides ] | | ||
W 08/27 | Lecture #2: Multi-armed Bandits [ slides ] | | ||
F 08/29 | Recitation #1: Neural Nets, PyTorch, OpenAI Gym, Bandits [ slides ] | | ||
M 09/01 | No Class, Labor Day | |||
W 09/03 | Lecture #3: Value-based Methods [ slides ] | | ||
F 09/05 | Recitation #2: Bandits, MDPs [ slides ] | |||
M 09/08 | Lecture #4: Value-based Methods (cont.) [ slides ] | | ||
W 09/10 | Lecture #5: Value-based Methods (cont.): DQN, MCTS [ slides | slides 2 ] | |||
F 09/12 | No Recitation | |||
M 09/15 | Lecture #6: Actor-Critic Methods [ slides ] | HW1 out (tentative) | ||
W 09/17 | Recitation #3: HW1 [ slides ] | |||
F 09/19 | Lecture #7: Actor-Critic Methods (cont.) [ slides ] | |||
M 09/22 | Lecture #8: Trust Region Methods [ slides ] | | ||
W 09/24 | Lecture #9: Trust Region Methods [ slides ] | | ||
F 09/26 | Recitation #4: Quiz 1 Review [ slides ] | |||
M 09/29 | Lecture #10: Trust Region Methods [ slides ] | | HW1 due 11:59PM | |
W 10/01 | Lecture #11: Behavior Cloning, Generative Adversarial Imitation Learning [ slides ] | | ||
F 10/03 | Quiz 1 | |||
M 10/06 | Lecture #12: Multimodal Policies, Diffusion Policies [ slides ] | |||
W 10/08 | Lecture #13: Diffusion Policies (cont.); Evolutionary Methods for Policy Search [ slides ] | | HW2 out (tentative) | |
F 10/10 | Recitation #5: Solutions to Quiz 1 [ slides ] | |||
M 10/13 | Fall Break - No Classes | |||
W 10/15 | Fall Break - No Classes | |||
F 10/17 | Fall Break - No Classes | |||
M 10/20 | Lecture #14: Maximum Entropy RL, SAC, DDPG [ slides ] | | ||
W 10/22 | Lecture #15: Maximum Entropy RL, SAC, DDPG [ slides ] | | HW2 due Thursday 10/23 11:59PM | |
F 10/24 | Recitation #6: Diffusion Policies (cont.) [ slides ] | |||
M 10/27 | Lecture #16: Introduction to Model-Based Reinforcement Learning [ slides ] | |||
W 10/29 | Lecture #17: AlphaGo, AlphaGo Zero, AlphaZero [ slides ] | | HW3 out (tentative) | |
F 10/31 | Recitation #7: HW3 [ slides ] | |||
M 11/03 | Lecture #18: MBRL from sensory input [ slides ] | | ||
W 11/05 | Lecture #19: MBRL (cont.) [ slides ] | | ||
F 11/07 | Lecture #20: Visual Imitation / Quiz 2 Review [ slides ] | |||
M 11/10 | Lecture #21: Multigoal Reinforcement Learning, MBRL with multimodal dynamics [ slides | slides 2 ] | | ||
W 11/12 | Quiz 2 | |||
F 11/14 | Lecture #22: Offline RL 1: going beyond imitation, problem statement, challenges in doing offline RL, policy gradient methods / policy constraints [ slides ] | | ||
M 11/17 | Lecture #23: Offline RL 2: conservative methods, model-based approaches, modern model-free algorithms [ slides ] | | HW4 out (tentative); HW3 due 11:59PM | |
W 11/19 | Lecture #24: Intelligent Exploration [ slides ] | | ||
F 11/21 | Recitation #8: HW4 [ slides ] | |||
M 11/24 | Recitation #9: Sim2Real Policy Learning [ slides ] | |||
W 11/26 | Thanksgiving Break - No Classes | |||
F 11/28 | Thanksgiving Break - No Classes | |||
M 12/01 | Lecture #25: Foundation Models for RL [ slides ] | |||
W 12/03 | Lecture #26: Foundation Models for RL [ slides ] | HW4 due 11:59PM | ||
F 12/05 | Recitation #10: Quiz 3 Review [ slides ] | |||
F 12/12 | Quiz 3 Placeholder |