Date Lecture Readings Logistics
W 01/19 Lecture #1 (Katerina):
Introduction to Reinforcement and Representation Learning
[ slides ]

F 01/21 Recitation #1:
Neural Nets, TensorFlow & Keras, OpenAI Gym, PyTorch
[ slides | slides 2 | notes ]

M 01/24 Lecture #2 (Katerina):
Multi-armed Bandits
[ slides ]

W 01/26 Lecture #3 (Katerina):
Markov Decision Processes, Value Iteration, Policy Iteration
[ slides ]

F 01/28 Recitation #2:
Bandits, MDPs & HW1
[ slides | slides 2 ]

M 01/31 Lecture #4 (Katerina):
Monte Carlo Learning and Temporal Difference Learning
[ slides ]

HW1 out

W 02/02 Lecture #5 (Katerina):
Monte Carlo Learning and Temporal Difference Learning (Cont.)
[ slides ]

F 02/04 Recitation #3:
HW1
[ slides ]

M 02/07 Lecture #6 (Katerina):
Monte Carlo and TD (cont.), Function approximation in prediction and control, Deep Q-learning, Deep SARSA
[ slides | slides 2 ]

W 02/09 Lecture #7 (Katerina):
Planning, Monte Carlo Tree Search
[ slides ]

F 02/11 Recitation #4:
MCTS, TD Learning, Deep Q-learning
[ slides | slides 2 ]

M 02/14 Lecture #8 (Katerina):
MCTS (cont.), Policy gradients, REINFORCE, Actor-Critic methods
[ slides ]

HW1 due 02/14 11:59pm

W 02/16 Lecture #9 (Katerina):
Policy gradients, REINFORCE, Actor-Critic methods (cont.)
[ slides ]

HW2 out (tentative)

F 02/18 Recitation #5:
Quiz 1 Review
[ slides ]

M 02/21 Lecture #10 (Katerina):
Actor-Critic methods (cont.), Natural PG, PPO, TRPO
[ slides | slides 2 ]

W 02/23 Lecture #11 (Katerina):
Deterministic policy gradient, reparameterized PG
[ slides | slides 2 ]

F 02/25 Quiz 1 [covering everything through Lecture 10, Monday, February 21]

M 02/28 Lecture #12 (Katerina):
Evolutionary methods for policy search
[ slides | slides 2 ]

W 03/02 Lecture #13 (Katerina):
Imitation learning, behavior cloning
[ slides ]

F 03/04 Recitation #6:
No Recitation (Mid-semester break)

M 03/07 Spring break - No classes

W 03/09 Spring break - No classes

F 03/11 Recitation #7:
No Recitation (Spring break)

M 03/14 Lecture #14 (Katerina):
Imitation learning (cont.), Adversarial imitation learning
[ slides ]

HW3 out (tentative), HW2 due 03/14 11:59pm

W 03/16 Lecture #15 (Katerina):
Model-based RL, low-dimensional models, explicit models
[ slides ]

F 03/18 Recitation #8:
Homework 3
[ slides ]

M 03/21 Lecture #16 (Katerina):
MBRL (cont.), holistic and graph-based world models
[ slides ]

W 03/23 Lecture #17 (Katerina):
MBRL (cont.), AlphaGo, AlphaGo Zero, MuZero
[ slides ]

F 03/25 Recitation #9:
Quiz 2 Review
[ slides ]

M 03/28 Lecture #18 (Katerina):
MBRL (cont.), AlphaGo, AlphaGo Zero, MuZero (cont.)
[ slides ]

HW4 out (tentative), HW3 due 03/28 11:59pm

W 03/30 Lecture #19 (Katerina):
MBRL (cont.), time-dependent linear models, iLQR
[ slides ]

F 04/01 Quiz 2 [covering Lectures 10 through 17 (Wednesday, March 23), excluding MuZero]

Pass/Fail Grade Option Deadline

M 04/04 Lecture #20 (Katerina):
MBRL (cont.), stochastic world models
[ slides ]

W 04/06 Lecture #21 (Katerina):
Intelligent Exploration
[ slides ]

F 04/08 No Classes - Spring Carnival

M 04/11 Lecture #22 (Katerina):
Offline RL
[ slides ]

HW4 due 04/11 11:59pm, HW5 out (tentative)

W 04/13 Lecture #23 (Katerina):
Sim2Real transfer
[ slides ]

F 04/15 Recitation #10:
Homework 5
[ slides ]

M 04/18 Lecture #24 (Katerina):
Visual Imitation Learning
[ slides ]

W 04/20 Lecture #25:
Self-Supervised Visual Learning
[ slides ]

F 04/22 Recitation #11:
Quiz 3 Review
[ slides ]

M 04/25 Lecture #26 (Katerina):
Control with 3D visual representations
[ slides ]

W 04/27 Lecture #27 (Katerina):
Transporter Networks for Vision-Based Manipulation
[ slides ]

F 04/29 Recitation #12:
Quiz 3 Review
[ slides ]

Pass/Fail Grade Option Deadline

T 05/03 Quiz 3