Schedule
Date | Lecture | Readings | Logistics
---|---|---|---
M 08/29 | Lecture #1: Introduction to Reinforcement and Representation Learning [slides] | | |
W 08/31 | Lecture #2: Multi-armed Bandits [slides] | | |
F 09/02 | Recitation #1: Neural Nets, TensorFlow & Keras, OpenAI Gym, Bandits [slides] | | |
M 09/05 | Labor Day - No Classes | | |
W 09/07 | Lecture #3: Markov Decision Processes, Value Iteration, Policy Iteration [slides] | | HW1 out (tentative) |
F 09/09 | Recitation #2: Bandits, MDPs & HW1 [slides] | | |
M 09/12 | Lecture #4: Monte Carlo Learning and Temporal Difference Learning [slides] [slides 2] | | |
W 09/14 | Lecture #5: Monte Carlo Learning and Temporal Difference Learning (cont.), Planning, Monte Carlo Tree Search [slides] [slides 2] | | |
F 09/16 | Recitation #3: HW1 [slides] | | |
M 09/19 | Lecture #6: Function approximation in prediction and control, Deep Q-learning [slides] | | |
W 09/21 | Lecture #7: Policy gradients, REINFORCE, Actor-Critic methods [slides] | | |
F 09/23 | Recitation #4: MCTS, TD Learning, Deep Q Learning [slides] | | |
M 09/26 | Lecture #8: Policy gradients, REINFORCE, Actor-Critic methods [slides] | | HW1 due 11:59PM |
W 09/28 | Lecture #9: Natural PG, PPO, TRPO [slides] | | HW2 out (tentative) |
F 09/30 | Recitation #5: Quiz 1 Review [slides] | | |
M 10/03 | Lecture #10: Maximum Entropy RL, soft actor critic, Deterministic Policy gradient, re-parametrized PG [slides] [slides 2] | | |
W 10/05 | Lecture #11: Evolutionary methods for policy search [slides] | | |
F 10/07 | Quiz 1 | | |
M 10/10 | Lecture #12: Imitation learning, behavior cloning [slides] | | |
W 10/12 | Lecture #13: Adversarial imitation learning, Imitation learning for vision-based manipulation [slides] [slides 2] | | |
F 10/14 | Recitation #6: Solutions to Quiz 1 [slides] | | |
M 10/17 | Fall Break - No Classes | | |
W 10/19 | Fall Break - No Classes | | |
F 10/21 | Fall Break - No Classes | | |
M 10/24 | Lecture #14: Transporter networks for Robot manipulation (cont.), Model-based RL in low-dim state space [slides] [slides 2] | | HW3 out (tentative), HW2 due 11:59PM |
W 10/26 | Lecture #15: MBRL (cont.), AlphaGo, AlphaGoZero [slides] | | |
F 10/28 | Tartan Community Day - No Classes | | |
M 10/31 | Lecture #16: AlphaGoZero, MBRL in sensory space [slides] [slides 2] | | |
W 11/02 | Lecture #17: MBRL (cont.), Deterministic latent dynamics models [slides] [slides 2] | | |
F 11/04 | Recitation #7: Quiz 2 Review [slides] | | |
M 11/07 | Lecture #18: MBRL (cont.), Stochastic latent dynamics models [slides] [slides 2] | | HW4 out (tentative), HW3 due 11:59PM |
W 11/09 | Lecture #19: Dynamics learning with graph neural networks [slides] [slides 2] | | |
F 11/11 | Quiz 2 | | |
M 11/14 | Lecture #20: Intelligent Exploration [slides] | | |
W 11/16 | Lecture #21: Postponed [slides] | | |
F 11/18 | Recitation #8: Recitation [slides] | | |
M 11/21 | Lecture #22: Offline RL, Learning by Observation [slides] | | HW5 out (tentative), HW4 due 11:59PM |
W 11/23 | Lecture #23 (Zoom Only): Sim2Real Transfer [slides] | | |
F 11/25 | Thanksgiving Break - No Classes | | |
M 11/28 | Lecture #24: Visual Imitation Learning [slides] [slides 2] | | |
W 11/30 | Lecture #25: Self-Supervised Visual Learning [slides] | | |
F 12/02 | Recitation #9: Homework 5 [slides] | | |
M 12/05 | Lecture #26 (Katerina): Control with 3D visual representations [slides] | | |
W 12/07 | Lecture #27: Language-guided Control [slides] | | HW5 due 11:59PM |
F 12/09 | Recitation #10: Quiz 3 Review [slides] | | |
M 12/12 | Quiz 3, 8:30 am - 11:30 am, GHC 4401 | | |