DATS 6450 – Reinforcement Learning
Instructor Information

- Name: Tyler Wallett
- Term: Fall 2025
- Class location: MON 114
- Class hours: 06:10 PM - 08:40 PM
- Office location: Samson Hall Room 310
- Office hours: Mondays, 2 - 4 PM
- E-mail: twallett@gwu.edu
- GitHub: twallett
Course Description
The aim of this course is to provide a comprehensive understanding of the reinforcement learning framework. The course will explore the key distinctions between reinforcement learning and other artificial intelligence learning paradigms, delve into relevant industry applications, and examine both classical and deep learning approaches. Additionally, the course will cover the taxonomy of reinforcement learning and offer hands-on experience through practical implementations using OpenAI Gymnasium (Brockman et al. 2016) and other learning environments.
The classical approach will focus on learning methods designed to find optimal solutions in tabular environments, whereas the deep learning approach will tackle the challenge of finding approximate optimal solutions in large or continuous environments through the use of deep learning architectures.
The course will introduce the taxonomy of reinforcement learning by focusing on model-free value-based and policy-based methods. Model-based reinforcement learning will be covered briefly, as it is allocated only one lecture.
To conclude, a discussion on advanced topics, applications, and outlook of reinforcement learning will be provided.
Learning Outcomes
- Implement reinforcement learning frameworks using `numpy` and `tensorflow`.
- Design decision-making systems using classical and deep learning architectures.
- Explain the reinforcement learning taxonomy.
- Identify reinforcement learning’s challenges, current research, and future outlook.
Resources
Software Requirements
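- Programming Language: Python. Install the required packages with `pip install numpy tensorflow pygame gymnasium tqdm tensorboard` (`pickle` ships with the Python standard library and does not need to be installed).
- Cloud-based GPU Environment: Google Colab.
- Version Control: GitHub.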
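As a quick sanity check after installation, the sketch below reports which of the packages listed above can be found by the current Python interpreter. It is not part of the official course materials; the helper name `check_packages` is hypothetical, and the script only tests that each package is importable, not that its version is suitable for the course.

```python
import importlib.util

# Packages named in the install command above.
# Note: pickle is part of the Python standard library.
REQUIRED = [
    "numpy",
    "tensorflow",
    "pygame",
    "gymnasium",
    "pickle",
    "tqdm",
    "tensorboard",
]


def check_packages(names):
    """Return {package_name: bool}, True when the package can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(REQUIRED).items():
        print(f"{name:12s} {'OK' if found else 'MISSING'}")
```

Any package reported as `MISSING` can be installed individually with `pip install <package>`. The same script can be pasted into a Google Colab cell.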
Course Outline
| Week | Topic | Quiz/Exams | Learning Objectives |
|---|---|---|---|
| Aug. 28, 2025 | Introduction | | • Why Should I Learn Reinforcement Learning? • What is Reinforcement Learning? • Where is Reinforcement Learning Applied? • How is Reinforcement Learning Structured? |
| Sep. 4, 2025 | Math Foundations | | • Set Theory • Axiomatic Probability • Conditioning • Independence • Random Variables • Expectation • Probability Distribution |
| Sep. 11, 2025 | Multi-Armed Bandits | Quiz 1 | • Multi-Armed Bandit Framework • \(\epsilon\)-Greedy • Upper Confidence Bound (UCB) • Thompson Sampling |
| Sep. 18, 2025 | Dynamic Programming | Quiz 2 | • OpenAI Gymnasium GridWorldEnv • Markov Chain • Markov Decision Processes (MDPs) • Iterative Policy Evaluation • Value Iteration |
| Sep. 25, 2025 | Monte Carlo | Quiz 3 | • OpenAI Gymnasium GridWorldEnv • Monte Carlo Prediction • Exploring Starts Monte Carlo • On-Policy Monte Carlo • Off-Policy Monte Carlo |
| Oct. 2, 2025 | Temporal Difference | Quiz 4 | • OpenAI Gymnasium GridWorldEnv • Temporal Difference (TD) Prediction • SARSA • Q-Learning • Double Q-Learning • (Optional) n-step Bootstrapping |
| Oct. 9, 2025 | Fall Break | | |
| Oct. 16, 2025 | Function Approximation | Exam 1 | • OpenAI Gymnasium MountainCar-v0 • Value Function Approximation (VFA) • On-Policy Function Approximation • Semi-gradient SARSA • Limitations of Off-Policy Function Approximation |
| Oct. 23, 2025 | Deep Q-Networks | Quiz 5 | • OpenAI Gymnasium ALE/Breakout-v5 • Deep Learning • Deep Q-Networks (DQN) |
| Oct. 30, 2025 | Policy Gradients I | Quiz 6 | • OpenAI Gymnasium CartPole-v1 and Pusher-v5 • Policy Gradient Theorem • Addressing Sparse Rewards • Action Selections • Vanilla Policy Gradient (VPG) |
| Nov. 6, 2025 | Policy Gradients II | Quiz 7 | • OpenAI Gymnasium HalfCheetah-v5 • Trust Regions • Monotonic Improvement • Proximal Policy Optimization (PPO) |
| Nov. 13, 2025 | Monte Carlo Tree Search | Exam 2 | • OpenAI Gymnasium CartPole-v0 • Model-based Reinforcement Learning • Monte Carlo Tree Search (MCTS) |
| Nov. 20, 2025 | Conclusion | | • Advanced Topics in Reinforcement Learning • Identify the Reinforcement Learning Application • Outlook of Reinforcement Learning |
| Nov. 27, 2025 | Thanksgiving Break | | |
| Dec. 4, 2025 | Final Project Presentation & Submission | | |
Prerequisites
- DATS 6101 - Introduction to Data Science
Assignments & Grading
| Assignment | Points |
|---|---|
| Quizzes (5 best scores) | 25 |
| Exam 1 | 25 |
| Exam 2 | 25 |
| Final Project | 25 |
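The point breakdown above can be sketched as a small calculation. The syllabus does not spell out how the five best quiz scores are combined, so the following is only an illustration under stated assumptions: every component is scored as a percentage (0 to 100), each component contributes 25 of the 100 total points, the quiz component is the mean of the five highest quiz percentages, and quizzes not taken count as zero. The function name `course_total` is hypothetical.

```python
def course_total(quiz_pcts, exam1_pct, exam2_pct, project_pct, best_n=5):
    """Combine component percentages (0-100) into a 100-point course total.

    Assumptions (not confirmed by the syllabus):
    - each of the four components is worth 25 points;
    - the quiz component is the mean of the best `best_n` quiz percentages;
    - if fewer than `best_n` quizzes were taken, the missing ones count as 0.
    """
    best = sorted(quiz_pcts, reverse=True)[:best_n]
    quiz_component = sum(best) / best_n
    return 0.25 * (quiz_component + exam1_pct + exam2_pct + project_pct)


if __name__ == "__main__":
    # Seven quizzes; the two lowest (70 and 60) are dropped.
    quizzes = [90, 80, 100, 70, 60, 85, 95]
    print(course_total(quizzes, exam1_pct=80, exam2_pct=90, project_pct=100))  # 90.0
```

In this example the five best quizzes average 90, so the total is 0.25 × (90 + 80 + 90 + 100) = 90 points.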
Students are expected to spend a minimum of 100 minutes on out-of-class work for every 50 minutes of direct instruction, for a minimum total of 2.5 hours a week per credit. A 3-credit course should therefore include 2.5 hours of direct instruction and a minimum of 5 hours of independent learning, for a total of 7.5 hours per week.
For technical requirements and support, student services, obtaining a GWorld card, and state contact information, please check HERE.
Recordings of particular class sessions will be made available to registered students on an individual basis, upon request. Please let me know in advance if you have any medical issues or emergencies that will prevent you from joining the class.
Writing and research consultations are available online. See HERE. Coaching, offered through the Office of Student Success, is available in a virtual format. See HERE. Academic Commons offers several short videos addressing different virtual learning strategies for the unique circumstances of the fall 2020 semester. See HERE. They also offer a variety of live virtual workshops to equip students with the tools they need to succeed in a virtual environment. See HERE.