Schedule
Date | Lecture | Readings | Logistics | |
---|---|---|---|---|
M 08/28 | Lecture #1: Introduction to Reinforcement and Representation Learning [ slides ] | |||
W 08/30 | Lecture #2: Multi-armed Bandits [ slides ] | |||
F 09/01 | Recitation #1: Neural Nets, TensorFlow & Keras, OpenAI Gym, Bandits [ slides ] | |||
M 09/04 | Labor Day - No Classes | |||
W 09/06 | Lecture #3: Markov Decision Processes, Value Iteration, Policy Iteration [ slides ] | | HW1 out (tentative) | |
F 09/08 | Recitation #2: Bandits, MDPs & HW1 [ slides ] | |||
M 09/11 | Lecture #4: Monte Carlo Learning and Temporal Difference Learning [ slides ] | |||
W 09/13 | Lecture #5: Monte Carlo Learning and Temporal Difference Learning (cont.) [ slides ] | |||
F 09/15 | Lecture #6: Planning, Monte Carlo Tree Search [ slides ] | |||
M 09/18 | Lecture #7: Function approximation in prediction and control, Deep Q-learning [ slides ] | |||
W 09/20 | Lecture #8: Policy gradients, REINFORCE, Actor-Critic methods [ slides ] | |||
F 09/22 | Lecture #9: Natural PG, PPO, TRPO [ slides ] | |||
M 09/25 | Lecture #10: Deterministic Policy gradient, re-parametrized PG [ slides ] | | HW1 due 11:59PM, HW2 out (tentative) | |
W 09/27 | Lecture #11: Evolutionary methods for policy search [ slides ] | |||
F 09/29 | Recitation #3: MCTS, TD Learning, Deep Q Learning, HW2 [ slides ] | |||
M 10/02 | Recitation #4: Quiz 1 Review [ slides ] | |||
W 10/04 | Recitation #5: Large OH/Quiz 1 Recitation [ slides ] | |||
F 10/06 | Quiz 1 | |||
M 10/09 | Lecture #12: Imitation learning, behavior cloning [ slides ] | |||
W 10/11 | Lecture #13: Imitation learning with generative models [ slides | slides 2 ] | | HW2 due 11:59PM | |
F 10/13 | Recitation #6: Solutions to Quiz 1 [ slides ] | |||
M 10/16 | Fall Break - No Classes | |||
W 10/18 | Fall Break - No Classes | |||
F 10/20 | Fall Break - No Classes | |||
M 10/23 | Lecture #14: Imitation learning with generative models (cont.), multigoal imitation learning and reinforcement learning [ slides ] | | HW3 out (tentative) | |
W 10/25 | Lecture #15: AlphaGo, AlphaGoZero, AlphaZero [ slides | slides 2 ] | |||
F 10/27 | Recitation #7: HW3, Gaussian Processes, Bayes Optimization [ slides | slides 2 ] | |||
M 10/30 | Lecture #16: MBRL in explicit and observable low-dimensional state spaces [ slides ] | |||
W 11/01 | Lecture #17: MBRL from sensory input, planning in sensory space, planning in a latent state space [ slides ] | |||
F 11/03 | Lecture #18: MBRL (cont.) Stochastic latent dynamics models [ slides ] | |||
M 11/06 | Recitation #8: Quiz 2 Review & HW4 [ slides ] | | HW4 out (tentative), HW3 due 11:59PM | |
W 11/08 | Lecture #19: Intelligent Exploration [ slides ] | |||
F 11/10 | Quiz 2 | |||
M 11/13 | Lecture #20: Offline RL, Learning by Observation [ slides ] | |||
W 11/15 | Lecture #21 (Aviral Kumar): Offline RL [ slides ] | |||
F 11/17 | Recitation #9: Homework 5 [ slides ] | | HW5 out (tentative), HW4 due 11:59PM | |
M 11/20 | Lecture #22: Sim2Real Transfer [ slides ] | |||
W 11/22 | Thanksgiving Break - No Classes | |||
F 11/24 | Thanksgiving Break - No Classes | |||
M 11/27 | Lecture #23: Visual Imitation Learning [ slides ] | |||
W 11/29 | Lecture #24: Language and Robot Control [ slides ] | |||
F 12/01 | Recitation #10: Recitation [ slides ] | |||
M 12/04 | Lecture #25: Self-Supervised Visual Learning [ slides ] | | HW5 due 11:59PM | |
W 12/06 | Lecture #26: Multimodal Policies, Control and 3D Spatial Representations [ slides | slides 2 ] | |||
F 12/08 | Recitation #11: Quiz 3 Review Part II [ slides ] | |||
12/12 | Quiz 3, 5:30pm - 8:30pm, SH 105 (Scaife Hall) | |||