DS 598 Introduction to Reinforcement Learning

This course aims to present a math-lite introduction to reinforcement learning. We will cover (1) the basics of Markov Decision Processes, (2) the primary algorithmic paradigms, including model-based, value-based, and policy-based learning, and (3) modern challenges and open problems in RL.

Instructor: Xuezhou Zhang

TF: Gaurav Koley

Lecture time: Tuesday/Thursday 12:30pm - 1:45pm ET

Instructor office hours: Tuesday 2:00 - 3:00pm, Friday 1:00 - 2:00pm ET at CDS 1421.

TF hours: Wednesday 11:00am-12:00pm at CDS 1311, or virtually at https://calendar.app.google/J6dDehC3KSzKYzjBA.

Schedule (tentative)

Chapter	Topics	Reading	Slides/HW
Chapter 1	Fundamentals: Markov Decision Processes	AJKS: 1.1, 1.2	Slides
Chapter 2	Planning in MDPs: Policy and Value Iterations	AJKS: 1.3	Slides, HW1.pdf, HW1.tex
Chapter 3	Model-based RL: MPC, Dreamer, MuZero	AJKS: 2.1, 2.3	Slides
Chapter 4	Value-based RL: FQI, Q-learning	AJKS: 4.1, 4.2	Slides, Project
Chapter 4	Value-based RL: Bellman completeness, DQN	AJKS: 4.1, 4.2	Slides
Chapter 5	Policy-based RL: Policy Gradient Theorem, REINFORCE	AJKS: 11-14	Slides
Chapter 5	Policy-based RL: Actor-Critic, Importance Sampling, DPG	AJKS: 11-14	Slides, HW2.pdf, HW2.tex
Chapter 5	Policy-based RL: NPG, TRPO, PPO	AJKS: 11-14	Slides
Chapter 6	Imitation Learning: Behavior Cloning	AJKS: 15	Slides, PyTorch Demo
Chapter 6	Imitation Learning: DAgger	AJKS: 15	Slides
Chapter 7	Exploration: Exploration in MAB	AJKS: 6.1.1	Slides
Chapter 7	Exploration: Exploration in MAB	AJKS: 6.1.1	Slides
Chapter 8	Exploration: Exploration in MDPs	AJKS: 7.2	Slides
Chapter 8	Exploration: Exploration in Deep RL	AJKS: 7.2	Slides, HW3
Chapter 9	Offline RL: FQI and naive methods	AJKS: 4.1	Slides
Chapter 9	Offline RL: Learning without full data coverage	AJKS: 4.1	Slides
Chapter 9	Offline RL: LCB and Empirical Methods	AJKS: 4.1	Slides
Chapter 10	Multi-agent RL: Game Theory Basics	TBD	Slides
Chapter 10	Multi-agent RL: Markov Games and Planning in MG	TBD	Slides
Chapter 10	Multi-agent RL: Online Learning in MGs	TBD	Slides
Chapter 10.5	Mechanism Design: Going beyond being a player in the game	TBD	Slides