Schedule
Date | Lecture | Readings | Logistics |
---|---|---|---|
M 08/31 | Lecture #1: Introduction to Reinforcement Learning and Representation Learning [slides, video] | | |
W 09/02 | Lecture #2: Exploration-exploitation in multi-armed bandits [slides, video] | | |
F 09/04 | Recitation #1: CNNs, RNNs, TensorFlow [slides] | | |
M 09/07 | Labor Day - No classes | | |
W 09/09 | Lecture #3: Evolutionary methods for policy search [slides, video] | | |
F 09/11 | Recitation #2: Gaussian Processes [slides, video] | | |
M 09/14 | Lecture #4: Exploration-exploitation in experiment design, Bayesian optimization with Gaussian Processes [slides, video] | | HW1 out |
W 09/16 | Lecture #5: Imitation learning with behavior cloning [slides, video] | | |
F 09/18 | Recitation #3: HW1 [slides, video] | | |
M 09/21 | Lecture #6: Generative adversarial / goal-conditioned imitation learning [slides, video] | | |
W 09/23 | Lecture #7: What if states are few, we know which state we are in, and the world model is known? Dynamic programming for policy search [slides, video] | | |
F 09/25 | Recitation #4: Behavior Cloning and Quiz 1 Review [slides, video] | | |
M 09/28 | Lecture #8: Dynamic Programming, Policy Iteration, Value Iteration [slides, video] | | HW1 due 09/28 11:59pm |
W 09/30 | Lecture #9: Monte Carlo Learning and Temporal Difference Learning [slides, video] | | |
F 10/02 | Quiz 1 (online) | | HW2 out |
M 10/05 | Lecture #10: What if states are infinite and we do not know or wish to estimate the world model? Deep Q-learning, Deep SARSA [slides, video] | | |
W 10/07 | Lecture #11: Monte Carlo Tree Search / Quiz 1 recap [slides, video] | | |
F 10/09 | Recitation #5: Policy and Value Iterations [slides, video] | | |
M 10/12 | Lecture #12: Monte Carlo Tree Search [slides, video] | | |
W 10/14 | Lecture #13: Policy gradients, REINFORCE, Actor-Critic methods [slides, video] | | HW3 out |
F 10/16 | Community Engagement Day - No classes | | HW2 due 10/16 11:59pm |
M 10/19 | Lecture #14: Actor-Critic methods (cont.), Deterministic PG, Re-parametrized PG [slides, slides 2, video] | | |
W 10/21 | Lecture #15: Natural PG [slides, video] | | |
F 10/23 | Mid-Semester Break - No classes | | |
M 10/26 | Lecture #16: Natural PG (cont.), off-policy RL [slides, slides 2, video] | | |
W 10/28 | Lecture #17: SAC, TD3, Soft Q-learning [slides, slides 2, video] | | |
F 10/30 | Recitation #6 [slides, video] | | HW3 due 10/30 11:59pm |
M 11/02 | Lecture #18: Model-based RL [slides, video] | | |
W 11/04 | Lecture #19: Model-based RL (cont.) [slides, video] | | |
F 11/06 | Quiz 2 (online) | | HW4 out 11/07 |
M 11/09 | Lecture #20: Model-based RL (cont.) [slides, video] | | |
W 11/11 | Lecture #21: Curiosity-driven exploration (guest lecture: Deepak Pathak) [slides, video] | | |
F 11/13 | Recitation #7: HW4 [slides, video] | | |
M 11/16 | Lecture #22: Model-based RL (cont.), imitating planners [slides, video] | | |
W 11/18 | Lecture #23: Intelligent Exploration [slides, video] | | |
F 11/20 | Recitation #8: HW5 [slides, video] | | |
M 11/23 | Lecture #24: Learning and Planning [slides, video] | | HW5 out 11/23; HW4 due 11/23 11:59pm |
W 11/25 | Thanksgiving Holiday - No classes | | |
F 11/27 | Thanksgiving Holiday - No classes | | |
M 11/30 | Lecture #25: Learning from RL and demonstrations [slides, video] | | |
W 12/02 | Lecture #26: Sim2Real transfer [slides, video] | | |
F 12/04 | Recitation #9 [slides] | | |
M 12/07 | Lecture #27: Meta-Learning, learning to learn [slides, video] | | HW5 due 12/07 11:59pm |
W 12/09 | Lecture #28: RL and generalization: a closer look at state representations for generalization in model-free and model-based RL [slides, video] | | |
F 12/11 | Recitation #10: Quiz 3 Review [slides, video] | | |
M 12/14 | Quiz 3 (online), 1:00 pm - 2:20 pm EST | | |