Schedule
Date | Lecture | Readings | Logistics |
---|---|---|---|
W 01/18 | Lecture #1: Introduction to Reinforcement and Representation Learning [ slides ] | | |
F 01/20 | Recitation #1: Introduction to PyTorch and OpenAI Gym [ slides ] | | |
M 01/23 | Lecture #2: Multi-armed Bandits [ slides ] | | |
W 01/25 | Lecture #3: Gaussian Processes, Experiment Design [ slides ] | | HW1 out (tentative) |
F 01/27 | Recitation #2: Deep Learning Architectures, MDPs, Bandits [ slides ] | | |
M 01/30 | Lecture #4: Markov Decision Processes, Value Iteration, Policy Iteration [ slides ] | | |
W 02/01 | Lecture #5: Markov Decision Processes, Value Iteration, Policy Iteration [ slides ] | | |
F 02/03 | Recitation #3: Iterative Learning, Monte Carlo [ slides ] | | |
M 02/06 | Lecture #6: Monte Carlo Learning and Temporal Difference Learning, Planning, Monte Carlo Tree Search [ slides | slides 2 ] | | |
W 02/08 | Lecture #7: Monte Carlo Learning and Temporal Difference Learning, Planning, Monte Carlo Tree Search [ slides ] | | |
F 02/10 | Recitation #4: Monte Carlo Learning (cont.), HW1 Q&A [ slides ] | | |
M 02/13 | Lecture #8: Function approximation in prediction and control, Deep Q-learning [ slides ] | | HW1 due 11:59PM |
W 02/15 | Lecture #9: Policy gradients, REINFORCE, Actor-Critic methods [ slides ] | | HW2 out |
F 02/17 | Recitation #5: Quiz 1 Review, MCTS & DQN Review [ slides | slides 2 ] | | |
M 02/20 | Lecture #10: Policy gradients, REINFORCE, Actor-Critic methods (cont.) [ slides ] | | |
W 02/22 | Lecture #11: Natural PG, PPO, TRPO [ slides ] | | |
F 02/24 | Quiz 1 | | |
M 02/27 | Lecture #12: No Class due to Illness [ slides ] | | |
W 03/01 | Lecture #13: Re-parametrized policy gradients, deep deterministic policy gradients [ slides ] | | |
F 03/03 | Lecture #14: Evolutionary Methods [ slides ] | | |
M 03/06 | Spring Break - No Classes | | |
W 03/08 | Spring Break - No Classes | | HW2 due 11:59PM |
F 03/10 | Spring Break - No Classes | | |
M 03/13 | Lecture #15: Imitation learning, behavior cloning [ slides ] | | |
W 03/15 | Lecture #16: Adversarial imitation learning, Imitation learning for vision-based manipulation [ slides | slides 2 ] | | |
F 03/17 | Recitation #6: Recitation 7 [ slides ] | | |
M 03/20 | Lecture #17: Transporter networks for Robot manipulation (cont.), Model-based RL in low-dim state space [ slides ] | | HW3 out (tentative) |
W 03/22 | Lecture #18: MBRL (cont.), AlphaGo, AlphaGoZero [ slides ] | | |
F 03/24 | Recitation #7: Imitation Learning [ slides ] | | |
M 03/27 | Lecture #19: MBRL (cont.), latent dynamics models [ slides ] | | |
W 03/29 | Lecture #20: Dynamics learning with graph neural networks [ slides ] | | |
F 03/31 | Quiz 2 | | |
M 04/03 | Lecture #21: MuZero, Intelligent Exploration [ slides ] | | HW4 out (tentative), HW3 due 11:59PM |
W 04/05 | Lecture #22: Intelligent Exploration [ slides ] | | |
F 04/07 | Recitation #8: Recitation 9 [ slides ] | | |
M 04/10 | Lecture #23: Self-Supervised Visual Learning [ slides ] | | |
W 04/12 | Lecture #24: Sim2Real Transfer [ slides ] | | |
F 04/14 | Spring Carnival - No Classes | | |
M 04/17 | Lecture #25: Visual Imitation Learning [ slides ] | | HW5 out (tentative), HW4 due 11:59PM |
W 04/19 | Lecture #26: Offline Reinforcement Learning [ slides ] | | |
F 04/21 | Recitation #9: Recitation 10 [ slides ] | | |
M 04/24 | Lecture #27 (Katerina): Language-guided Control [ slides ] | | |
W 04/26 | Lecture #28: Diffusion models for action prediction and planning [ slides ] | | |
F 04/28 | Recitation #10: Recitation 11 [ slides ] | | |
M 05/01 | Quiz 3 | | HW5 due 11:59PM |