Date Lecture Readings Logistics
M 08/25 Lecture #1 :
Introduction to Reinforcement and Representation Learning
[ slides ]

W 08/27 Lecture #2 :
Multi-armed Bandits
[ slides ]

F 08/29 Recitation #1:
Neural Nets, PyTorch, OpenAI Gym, Bandits
[ slides ]

M 09/01 No Class, Labor Day

W 09/03 Lecture #3 :
Value-based Methods
[ slides ]

F 09/05 Recitation #2:
Bandits, MDPs
[ slides ]

M 09/08 Lecture #4 :
Value-based Methods (cont.)
[ slides ]

W 09/10 Lecture #5 :
Value-based Methods (cont.): DQN, MCTS
[ slides | slides 2 ]

F 09/12 No Recitation

M 09/15 Lecture #6 :
Actor-Critic Methods
[ slides ]

HW1 out (tentative)

W 09/17 Recitation #3:
HW1
[ slides ]

F 09/19 Lecture #7 :
Actor-Critic Methods (cont.)
[ slides ]

M 09/22 Lecture #8 :
Trust Region Methods
[ slides ]

W 09/24 Lecture #9 :
Trust Region Methods
[ slides ]

F 09/26 Recitation #4:
Quiz 1 Review
[ slides ]

M 09/29 Lecture #10 :
Trust Region Methods
[ slides ]

HW1 due 11:59PM

W 10/01 Lecture #11 :
Behavior Cloning, Generative Adversarial Imitation Learning
[ slides ]

F 10/03 Quiz 1

M 10/06 Lecture #12 :
Multimodal Policies, Diffusion Policies
[ slides ]

W 10/08 Lecture #13 :
Diffusion Policies (cont.), Evolutionary Methods for Policy Search
[ slides ]

HW2 out (tentative)

F 10/10 Recitation #5:
Solutions to Quiz 1
[ slides ]

M 10/13 Fall Break - No Classes

W 10/15 Fall Break - No Classes

F 10/17 Fall Break - No Classes

M 10/20 Lecture #14 :
Maximum Entropy RL, SAC, DDPG
[ slides ]

W 10/22 Lecture #15 :
Maximum Entropy RL, SAC, DDPG
[ slides ]

HW2 due Thursday 10/23 11:59PM

F 10/24 Recitation #6:
Diffusion Policies (cont.)
[ slides ]

M 10/27 Lecture #16 :
Introduction to Model-Based Reinforcement Learning
[ slides ]

W 10/29 Lecture #17 :
AlphaGo, AlphaGo Zero, AlphaZero
[ slides ]

HW3 out (tentative)

F 10/31 Recitation #7:
HW3
[ slides ]

M 11/03 Lecture #18 :
MBRL from sensory input
[ slides ]

W 11/05 Lecture #19 :
MBRL (cont.)
[ slides ]

F 11/07 Lecture #20 :
Visual Imitation / Quiz 2 Review
[ slides ]

M 11/10 Lecture #21 :
Multigoal Reinforcement Learning, MBRL with multimodal dynamics
[ slides | slides 2 ]

W 11/12 Quiz 2

F 11/14 Lecture #22 :
Offline RL 1: going beyond imitation, problem statement, challenges in doing offline RL, policy gradient methods / policy constraints
[ slides ]

M 11/17 Lecture #23 :
Offline RL 2: conservative methods, model-based approaches, modern model-free algorithms
[ slides ]

HW4 out (tentative); HW3 due 11:59PM

W 11/19 Lecture #24 :
Intelligent Exploration
[ slides ]

F 11/21 Recitation #8:
HW4
[ slides ]

M 11/24 Recitation #9:
Sim2Real Policy Learning
[ slides ]

W 11/26 Thanksgiving Break - No Classes

F 11/28 Thanksgiving Break - No Classes

M 12/01 Lecture #25 :
Foundation Models for RL
[ slides ]

W 12/03 Lecture #26 :
Foundation Models for RL
[ slides ]

HW4 due 11:59PM

F 12/05 Recitation #10:
Quiz 3 Review
[ slides ]

F 12/12 Quiz 3 Placeholder