Schedule
Date | Lecture | Readings | Logistics | |
---|---|---|---|---|
M 08/30 |
Lecture #1
(Katerina & Ruslan):
Introduction to Reinforcement and Representation Learning [ slides | video ] |
|
||
W 09/01 |
Lecture #2
(Ruslan):
Multi-armed Bandits [ slides | video ] |
|
||
F 09/03 |
Recitation #1:
Neural Nets, TensorFlow & Keras, OpenAI Gym, Bandits [ slides | slides 2 | video | notes | notes 2 ] |
|
||
M 09/06 |
Lecture #3
:
Labor Day - No Classes [ slides ] |
|||
W 09/08 |
Lecture #4
(Ruslan):
Markov Decision Processes, Value Iteration, Policy Iteration [ slides | video ] |
|
HW1 out (tentative) |
|
F 09/10 |
Recitation #2:
PyTorch, Training Resources & HW1 [ slides | video | notes ] |
|||
M 09/13 |
Lecture #5
(Ruslan):
Monte Carlo Learning and Temporal Difference Learning [ slides | video ] |
|
||
W 09/15 |
Lecture #6
(Ruslan):
Monte Carlo Learning and Temporal Difference Learning (cont.) [ slides | video ] |
|
||
F 09/17 |
Recitation #3:
HW1 [ slides | video ] |
|||
M 09/20 |
Lecture #7
(Katerina):
Function approximation in prediction and control, Deep Q-learning, Deep SARSA [ slides | video ] |
|
||
W 09/22 |
Lecture #8
(Katerina):
Planning, Monte Carlo Tree search [ slides | video ] |
|||
F 09/24 |
Recitation #4:
MCTS, TD Learning, Deep Q Learning [ slides | video ] |
|||
M 09/27 |
Lecture #9
(Ruslan):
Policy gradients, REINFORCE, Actor-Critic methods [ slides | video ] |
|
HW1 due 09/27 11:59pm |
|
W 09/29 |
Lecture #10
(Ruslan):
Policy gradients, REINFORCE, Actor-Critic methods (cont.) [ slides | video ] |
|
HW2 out (tentative) |
|
F 10/01 |
Recitation #5:
Quiz 1 Review [ slides | video ] |
|||
M 10/04 |
Lecture #11
(Katerina):
Natural PG, PPO, TRPO [ slides | video ] |
|
||
W 10/06 |
Lecture #12
(Katerina and Ben):
Maximum Entropy RL, Soft Actor-Critic, Deterministic Policy Gradient, reparameterized PG [ slides | slides 2 | video ] |
|
||
F 10/08 | Quiz 1 [covering everything through Lecture 10, Wednesday, September 29] | |||
M 10/11 |
Lecture #13
(Katerina):
Evolutionary methods for policy search [ slides | video ] |
|
||
W 10/13 |
Lecture #14
(Ruslan):
Imitation learning, behavior cloning [ slides | video ] |
|
||
F 10/15 | Mid-semester break - No classes | |||
M 10/18 |
Lecture #14
(Katerina):
Imitation learning (cont.), Adversarial imitation learning [ slides | video ] |
|
HW3 out (tentative), HW2 due 10/18 11:59PM |
|
W 10/20 |
Lecture #15
(Katerina):
Model-based RL, low-dimensional models, explicit models [ slides | video ] |
|
||
F 10/22 |
Recitation #7:
Homework 3 [ slides | video ] |
|||
M 10/25 |
Lecture #16
(Katerina):
MBRL (cont.), AlphaGo, AlphaGo Zero, MuZero [ slides | video ] |
|
||
W 10/27 |
Lecture #17
(Katerina):
MBRL (cont.) Holistic and graph-based world models [ slides | video ] |
|
||
F 10/29 |
Recitation #8:
Quiz 2 Review [ slides | video ] |
|||
M 11/01 |
Lecture #18
(Katerina):
MBRL (cont.) Time dependent linear models, iLQR [ slides | video ] |
|
HW3 due 11/01 11:59pm |
|
W 11/03 |
Lecture #19
(Katerina):
MBRL (cont.), stochastic world models [ slides | slides 2 | video ] |
|
||
F 11/05 | No Classes - Day for Community Engagement | |||
M 11/08 | Quiz 2 [covering everything from lectures 10-19 (Wednesday, Nov 03)]
Pass/Fail Grade Option Deadline |
|||
W 11/10 |
Lecture #20
(Ruslan):
Offline RL [ slides | slides 2 | video ] |
|
||
F 11/12 |
Lecture #21
(Katerina):
Intelligent Exploration [ slides | video ] |
|
||
M 11/15 |
Lecture #22
(Katerina):
Deep exploration (cont.) and Sim2Real transfer [ slides | video ] |
|||
W 11/17 |
Lecture #23
(Katerina):
Sim2Real transfer (cont.) and Visual Imitation Learning [ slides | video ] |
|||
F 11/19 |
Recitation #9:
Homework 4 [ slides ] |
|||
M 11/22 |
Lecture #24
(Daniel Seita (https://www.cs.cmu.edu/~dseita/)):
GUEST lecture: Visual Imitation Learning (cont.) and vision-based manipulation with Transporters [ slides ] |
|||
W 11/24 | Thanksgiving Break - No classes | |||
F 11/26 | Thanksgiving Break - No classes | |||
M 11/29 |
Lecture #25
(Ruslan):
Deep RL for Navigation [ slides ] |
|
HW4 due 11/29 11:59pm |
|
W 12/01 |
Lecture #26
(Ruslan):
Efficient Distributed RL [ slides ] |
|||
F 12/03 |
Recitation #10:
Quiz 3 Review [ slides ] |
|||
F 12/07 | Quiz 3 (1-4pm)
Pass/Fail Grade Option Deadline |