DATS 6450 – Reinforcement Learning
Instructor Information
- Name: Tyler Wallett
- Term: Fall 2025
- Class location: MON 114
- Class hours: 06:10 PM - 08:40 PM
- Office location: Samson Hall Room 310
- Office hours: TBD
- E-mail: twallett@gwu.edu
- GitHub: twallett
Course Description
The aim of this course is to provide a comprehensive understanding of the reinforcement learning framework. The course will explore the key distinctions between reinforcement learning and other artificial intelligence learning paradigms, delve into relevant industry applications, and examine both classical and deep learning approaches. Additionally, the course will cover the taxonomy of reinforcement learning and offer hands-on experience through practical implementations using OpenAI Gymnasium and other learning environments.
The classical approach will focus on learning methods designed to find optimal solutions in tabular environments, whereas the deep learning approach will tackle the challenge of finding approximate optimal solutions in large or continuous environments through the use of deep learning architectures.
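To make the idea of a tabular method concrete, here is a minimal sketch of Q-learning, one of the classical algorithms on the schedule. The five-state chain environment, the function name `q_learning_chain`, and all hyperparameter values are illustrative assumptions rather than course material, and the sketch uses only the Python standard library so it runs without any installs.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain (illustration only).

    States are 0..n_states-1. Action 0 moves left (state 0 is a wall),
    action 1 moves right. Reaching the right-most state gives reward 1
    and ends the episode; every other transition gives reward 0.
    """
    rng = random.Random(seed)
    # The Q-table: one row per state, one column per action.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy selection, with ties broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            # Q-learning target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning_chain()
# Greedy action in every non-terminal state (1 means "move right").
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a])
                 for s in range(len(q_table) - 1)]
print(greedy_policy)  # prints [1, 1, 1, 1]
```

The whole value function fits in a small table here, which is what makes an exact optimal solution reachable. In large or continuous environments the table is replaced by a function approximator such as a neural network.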
The course will introduce the taxonomy of reinforcement learning by focusing on model-free value-based and policy-based methods. Model-based reinforcement learning will be covered briefly, as it is allocated only one lecture.
To conclude, a discussion on advanced topics, applications, and outlook of reinforcement learning will be provided.
Learning Outcomes
- Implement reinforcement learning frameworks using `numpy` and `tensorflow`.
- Design decision-making systems using classical and deep learning architectures.
- Explain the reinforcement learning taxonomy.
- Identify reinforcement learning’s challenges, current research, and future outlook.
Resources
Software Requirements
- Programming Language: Python.
pip install numpy tensorflow pygame gymnasium tqdm tensorboard
- Cloud Services: Google Colab.
- Version Control: GitHub.
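As a quick sanity check after installation, a short script along these lines can report which of the third-party packages are not yet importable. This is only a sketch: the helper name `missing_packages` is hypothetical, and `pickle` is left off the list because it ships with the Python standard library.

```python
import importlib.util

# Third-party packages named in the install command above.
REQUIRED = ["numpy", "tensorflow", "pygame", "gymnasium", "tqdm", "tensorboard"]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All course packages are importable.")
```

The check only locates each package without importing it, so it stays fast even for heavy libraries such as TensorFlow.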
Course Outline
Week | Topic | Quiz/Exams | Learning Objectives |
---|---|---|---|
Aug. 25, 2025 | Introduction to Reinforcement Learning | | • Why should I study Reinforcement Learning? • What is Reinforcement Learning? • Where is Reinforcement Learning Applied? • How is Reinforcement Learning Structured? |
Sep. 1, 2025 | Labor Day | | |
Sep. 8, 2025 | Mathematical Foundations | | • Set Theory • Axiomatic Probability • Conditioning • Independence • Random Variables • Expectation • Probability Distribution |
Sep. 15, 2025 | Multi-Armed Bandits | Quiz 1 | • Multi-Armed Bandit Framework • \(\epsilon\)-Greedy • Upper Confidence Bound (UCB) • Thompson Sampling |
Sep. 22, 2025 | Dynamic Programming | Quiz 2 | • OpenAI Gymnasium GridWorldEnv • Markov Chain • Markov Decision Processes (MDPs) • Dynamic Programming |
Sep. 29, 2025 | Monte Carlo | Quiz 3 | • OpenAI Gymnasium GridWorldEnv • Monte Carlo Prediction • Exploring Starts Monte Carlo • On-Policy Monte Carlo • Off-Policy Monte Carlo |
Oct. 6, 2025 | Temporal Difference | Quiz 4 | • OpenAI Gymnasium GridWorldEnv • Temporal Difference (TD) Prediction • SARSA • Q-Learning • Double Q-Learning • (Optional) n-step TD |
Oct. 13, 2025 | Function Approximation | Exam 1 | • OpenAI Gymnasium MountainCar-v0 • Value Function Approximation (VFA) • On-Policy Function Approximation • Semi-gradient SARSA • Limitations of Off-Policy Function Approximation |
Oct. 20, 2025 | Deep Q-Networks | Quiz 5 | • OpenAI Gymnasium ALE/Breakout-v5 • Multi-Layer Perceptrons (MLPs) • Convolutional Neural Networks (CNNs) • Experience Replay • Fixed Targets • Vanilla Deep Q-Network |
Oct. 27, 2025 | Policy Gradients | Quiz 6 | • OpenAI Gymnasium CartPole-v1 • Policy Gradient Theorem • Vanilla Policy Gradient |
Nov. 3, 2025 | Advanced Policy Gradients | Quiz 7 | • OpenAI Gymnasium HalfCheetah-v5 • Trust Region Policy Optimization (TRPO) • Proximal Policy Optimization: KL-Divergence • Proximal Policy Optimization: Clip |
Nov. 10, 2025 | Monte Carlo Tree Search | Exam 2 | • OpenAI Gymnasium CartPole-v0 • Model-based Reinforcement Learning • Monte Carlo Tree Search • AlphaGo • MuZero |
Nov. 17, 2025 | Conclusion | | • Advanced Topics in Reinforcement Learning • Identify the Reinforcement Learning Application • Outlook of Reinforcement Learning |
Nov. 24, 2025 | Thanksgiving Break | | |
Dec. 1, 2025 | Final Project Submission | | |
Prerequisites
- DATS 6101 - Introduction to Data Science
Assignments & Grading
Assignment | Points |
---|---|
Quizzes (5 best scores) | 25 |
Exam 1 | 25 |
Exam 2 | 25 |
Final Project | 25 |
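The point breakdown above can be turned into a rough grade estimate. The sketch below assumes every component is scored as a percentage from 0 to 100 and that the five best quiz scores are averaged before being scaled to 25 points. The syllabus does not state either detail, so treat the function `course_total` and its scaling as hypothetical.

```python
def course_total(quiz_scores, exam1, exam2, project):
    """Estimate the course total out of 100 points.

    Assumption (not stated in the syllabus): every input is a percentage
    in [0, 100], and each of the four components is worth 25 points.
    """
    # Keep only the five best quiz scores; fewer than five quizzes
    # effectively count the missing ones as zero.
    best_five = sorted(quiz_scores, reverse=True)[:5]
    quiz_pct = sum(best_five) / 5
    return 0.25 * (quiz_pct + exam1 + exam2 + project)


total = course_total([90, 70, 100, 80, 60, 95, 85],
                     exam1=88, exam2=92, project=95)
print(total)  # prints 91.25
```

With seven quizzes on the schedule, the two lowest scores (70 and 60 in this example) are dropped before averaging.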
Students are expected to spend a minimum of 100 minutes on out-of-class work for every 50 minutes of direct instruction, for a minimum total of 2.5 hours per week per credit. A 3-credit course should therefore include 2.5 hours of direct instruction and a minimum of 5 hours of independent learning, for a total of 7.5 hours per week.
For technical requirements and support, student services, obtaining a GWorld card, and state contact information, please check HERE.
Class recordings will be made available to registered students on an individual basis, upon request. Please let me know in advance if you have any medical issues or emergencies that will prevent you from joining the class.
Writing and research consultations are available online. See HERE. Coaching, offered through the Office of Student Success, is available in a virtual format. See HERE. Academic Commons offers several short videos addressing different virtual learning strategies for the unique circumstances of the fall 2020 semester. See HERE. They also offer a variety of live virtual workshops to equip students with the tools they need to succeed in a virtual environment. See HERE.