DS 543 Introduction to Reinforcement Learning (Spring 2024)

This course aims to present a math-lite introduction to reinforcement learning. We will cover (1) the basics of Markov Decision Processes, (2) the primary algorithmic paradigms, including model-based, value-based, and policy-based learning, and (3) modern challenges and open problems in RL.


Instructor: Xuezhou Zhang

TF: Gaurav Koley

Lecture time: Tuesday/Thursday 12:30pm - 1:45pm ET

Instructor office hours: Tuesday 2:00 - 3:00pm, Friday 1:00 - 2:00pm ET at CDS 1421.

TF hours: Wednesday 11:00am-12:00pm at CDS 1311, or virtually at https://calendar.app.google/J6dDehC3KSzKYzjBA.

Schedule (tentative)

Topics Reading Slides/HW
Chapter 1 Fundamentals: Markov Decision Processes AJKS: 1.1, 1.2 Slides
Chapter 2 Planning in MDPs: Policy and Value Iterations AJKS: 1.3 Slides, HW1.pdf, HW1.tex
Chapter 3 Model-based RL: MPC, Dreamer, MuZero AJKS: 2.1, 2.3 Slides
Chapter 4 Value-based RL: FQI, Q-learning AJKS: 4.1, 4.2 Slides, Project
Chapter 4 Value-based RL: Bellman completeness, DQN AJKS: 4.1, 4.2 Slides
Chapter 5 Policy-based RL: Policy Gradient Theorem, REINFORCE AJKS: 11-14 Slides
Chapter 5 Policy-based RL: Actor-Critic, Importance Sampling, DPG AJKS: 11-14 Slides, HW2.pdf, HW2.tex
Chapter 5 Policy-based RL: NPG, TRPO, PPO AJKS: 11-14 Slides
Chapter 6 Imitation Learning: Behavior Cloning AJKS: 15 Slides, PyTorch Demo
Chapter 6 Imitation Learning: DAgger AJKS: 15 Slides
Chapter 7 Exploration: Exploration in MAB AJKS: 6.1.1 Slides
Chapter 7 Exploration: Exploration in MAB AJKS: 6.1.1 Slides
Chapter 8 Exploration: Exploration in MDPs AJKS: 7.2 Slides
Chapter 8 Exploration: Exploration in Deep RL AJKS: 7.2 Slides, HW3
Chapter 9 Offline RL: FQI and naive methods AJKS: 4.1 Slides
Chapter 9 Offline RL: Learning without full data coverage AJKS: 4.1 Slides
Chapter 9 Offline RL: LCB and Empirical Methods AJKS: 4.1 Slides
Chapter 10 Multi-agent RL: Game Theory Basics TBD Slides
Chapter 10 Multi-agent RL: Markov Games and Planning in MG TBD Slides
Chapter 10 Multi-agent RL: Online Learning in MGs TBD Slides
Chapter 10.5 Mechanism Design: Going beyond being a player in the game TBD Slides