Schedule
Date | Lecture | Readings | Logistics | |
---|---|---|---|---|
W 01/19 | Lecture #1 (Katerina): Introduction to Reinforcement and Representation Learning [ slides ] | |||
F 01/21 | Recitation #1: Neural Nets, TensorFlow & Keras, OpenAI Gym, PyTorch [ slides | slides 2 | notes ] | |||
M 01/24 | Lecture #2 (Katerina): Multi-armed Bandits [ slides ] | |||
W 01/26 | Lecture #3 (Katerina): Markov Decision Processes, Value Iteration, Policy Iteration [ slides ] | |||
F 01/28 | Recitation #2: Bandits, MDPs & HW1 [ slides | slides 2 ] | |||
M 01/31 | Lecture #4 (Katerina): Monte Carlo Learning and Temporal Difference Learning [ slides ] | | HW1 out | |
W 02/02 | Lecture #5 (Katerina): Monte Carlo Learning and Temporal Difference Learning (cont.) [ slides ] | |||
F 02/04 | Recitation #3: HW1 [ slides ] | |||
M 02/07 | Lecture #6 (Katerina): Monte Carlo and TD (cont.), Function approximation in prediction and control, Deep Q-learning, Deep SARSA [ slides | slides 2 ] | |||
W 02/09 | Lecture #7 (Katerina): Planning, Monte Carlo Tree Search [ slides ] | |||
F 02/11 | Recitation #4: MCTS, TD Learning, Deep Q-learning [ slides | slides 2 ] | |||
M 02/14 | Lecture #8 (Katerina): MCTS (cont.), Policy gradients, REINFORCE, Actor-Critic methods [ slides ] | | HW1 due 02/14 11:59pm | |
W 02/16 | Lecture #9 (Katerina): Policy gradients, REINFORCE, Actor-Critic methods (cont.) [ slides ] | | HW2 out (tentative) | |
F 02/18 | Recitation #5: Quiz 1 Review [ slides ] | |||
M 02/21 | Lecture #10 (Katerina): Actor-Critic methods (cont.), Natural PG, PPO, TRPO [ slides | slides 2 ] | |||
W 02/23 | Lecture #11 (Katerina): Deterministic policy gradient, reparameterized PG [ slides | slides 2 ] | |||
F 02/25 | Quiz 1 [covering everything through Lecture 10, Monday, February 21] | |||
M 02/28 | Lecture #12 (Katerina): Evolutionary methods for policy search [ slides | slides 2 ] | |||
W 03/02 | Lecture #13 (Katerina): Imitation learning, behavior cloning [ slides ] | |||
F 03/04 | Recitation #6: No Recitation (Mid-semester break) [ slides ] | |||
M 03/07 | Spring break - No classes | |||
W 03/09 | Spring break - No classes | |||
F 03/11 | Recitation #7: No Recitation (Spring break) [ slides ] | |||
M 03/14 | Lecture #14 (Katerina): Imitation learning (cont.), Adversarial imitation learning [ slides ] | | HW3 out (tentative), HW2 due 03/14 11:59pm | |
W 03/16 | Lecture #15 (Katerina): Model-based RL, low-dimensional models, explicit models [ slides ] | |||
F 03/18 | Recitation #8: Homework 3 [ slides ] | |||
M 03/21 | Lecture #16 (Katerina): MBRL (cont.), holistic and graph-based world models [ slides ] | |||
W 03/23 | Lecture #17 (Katerina): MBRL (cont.), AlphaGo, AlphaGo Zero, MuZero [ slides ] | |||
F 03/25 | Recitation #9: Quiz 2 Review [ slides ] | |||
M 03/28 | Lecture #18 (Katerina): MBRL (cont.), AlphaGo, AlphaGo Zero, MuZero (cont.) [ slides ] | | HW4 out (tentative), HW3 due 03/28 11:59pm | |
W 03/30 | Lecture #19 (Katerina): MBRL (cont.), time-dependent linear models, iLQR [ slides ] | |||
F 04/01 | Quiz 2 [covering everything from Lecture 10 through Lecture 17 (Wednesday, March 23), excluding MuZero] | | Pass/Fail Grade Option Deadline | |
M 04/04 | Lecture #20 (Katerina): MBRL (cont.), stochastic world models [ slides ] | |||
W 04/06 | Lecture #21 (Katerina): Intelligent Exploration [ slides ] | |||
F 04/08 | No Classes - Spring Carnival | |||
M 04/11 | Lecture #22 (Katerina): Offline RL [ slides ] | | HW4 due 04/11 11:59pm, HW5 out (tentative) | |
W 04/13 | Lecture #23 (Katerina): Sim2Real transfer [ slides ] | |||
F 04/15 | Recitation #10: Homework 5 [ slides ] | |||
M 04/18 | Lecture #24 (Katerina): Visual Imitation Learning [ slides ] | |||
W 04/20 | Lecture #25: Self-Supervised Visual Learning [ slides ] | |||
F 04/22 | Recitation #11: Quiz 3 Review [ slides ] | |||
M 04/25 | Lecture #26 (Katerina): Control with 3D visual representations [ slides ] | |||
W 04/27 | Lecture #27 (Katerina): Transporter Networks for Vision-Based Manipulation [ slides ] | |||
F 04/29 | Recitation #12: Quiz 3 Review [ slides ] | | Pass/Fail Grade Option Deadline | |
T 05/03 | Quiz 3 | |||