DATS 6450 – Reinforcement Learning

Authors

Tyler Wallett, MS

Amir Jafari, PhD

Published

June 6, 2025

Gridworld (Lectures 4-6)

MountainCar (Lecture 7)

Breakout (Lecture 8)

Highway (Lecture 8)

CartPole (Lecture 9)

Cheetah (Lecture 10)

Instructor Information

Name: Tyler Wallett
Term: Fall 2025
Class location: MON 114
Class hours: 06:10 PM - 08:40 PM
Office location: Samson Hall Room 310
Office hours: TBD
E-mail: twallett@gwu.edu
GitHub: twallett

Course Description

The aim of this course is to provide a comprehensive understanding of the reinforcement learning framework. The course will explore the key distinctions between reinforcement learning and other artificial intelligence learning paradigms, delve into relevant industry applications, and examine both classical and deep learning approaches. Additionally, the course will cover the taxonomy of reinforcement learning and offer hands-on experience through practical implementations using OpenAI Gymnasium and other learning environments.

The classical approach will focus on learning methods designed to find optimal solutions in tabular environments, whereas the deep learning approach will tackle the challenge of finding approximate optimal solutions in large or continuous environments through the use of deep learning architectures.

The course will introduce the taxonomy of reinforcement learning by focusing on model-free value-based and policy-based methods. Model-based reinforcement learning will be covered briefly, as it is allocated only one lecture.

To conclude, a discussion on advanced topics, applications, and outlook of reinforcement learning will be provided.

Learning Outcomes

Implement reinforcement learning frameworks using NumPy and TensorFlow.
Design decision-making systems using classical and deep learning architectures.
Explain the reinforcement learning taxonomy.
Identify reinforcement learning’s challenges, current research, and future outlook.

Resources

Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (Web Link)
The Reinforcement Learning Course by Hugging Face (Web Link)
Spinning Up in Deep RL by OpenAI (Web Link)
OpenAI Gymnasium API documentation (Web Link)
TensorFlow Python API documentation (Web Link)

Software Requirements

Programming Language: Python.

pip install numpy tensorflow pygame gymnasium tqdm tensorboard

Cloud Services: Google Colab.
Version Control: GitHub.
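
After running the install command above, a quick import check can confirm the environment is ready. The following is a minimal sketch, not an official course script: the helper name `check_packages` and the `REQUIRED` list are illustrative assumptions. `pickle` is not checked because it ships with Python's standard library.

```python
from importlib.util import find_spec

# Top-level import names of the third-party packages used in the course
# (assumed to match the pip package names in the install command).
REQUIRED = ["numpy", "tensorflow", "pygame", "gymnasium", "tqdm", "tensorboard"]


def check_packages(names):
    """Return {name: bool} telling whether each top-level module can be found."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, ok in check_packages(REQUIRED).items():
        print(f"{name:12s} {'ok' if ok else 'MISSING'}")
```

Any package reported as MISSING can be installed individually with pip before the first hands-on lecture.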

Course Outline

Week	Topic	Quiz/Exams	Learning Objectives
Aug. 25, 2025	Introduction to Reinforcement Learning		• Why should I study Reinforcement Learning? • What is Reinforcement Learning? • Where is Reinforcement Learning Applied? • How is Reinforcement Learning Structured?
Sep. 1, 2025	Labor Day
Sep. 8, 2025	Mathematical Foundations		• Set Theory • Axiomatic Probability • Conditioning • Independence • Random Variables • Expectation • Probability Distribution
Sep. 15, 2025	Multi-Armed Bandits	Quiz 1	• Multi-Armed Bandit Framework • \(\epsilon\)-Greedy • Upper Confidence Bound (UCB) • Thompson Sampling
Sep. 22, 2025	Dynamic Programming	Quiz 2	• OpenAI Gymnasium `GridWorldEnv` • Markov Chain • Markov Decision Processes (MDPs) • Dynamic Programming
Sep. 29, 2025	Monte Carlo	Quiz 3	• OpenAI Gymnasium `GridWorldEnv` • Monte Carlo Prediction • Exploring Starts Monte Carlo • On-Policy Monte Carlo • Off-Policy Monte Carlo
Oct. 6, 2025	Temporal Difference	Quiz 4	• OpenAI Gymnasium `GridWorldEnv` • Temporal Difference (TD) Prediction • SARSA • Q-Learning • Double Q-Learning • (Optional) n-step TD
Oct. 13, 2025	Function Approximation	Exam 1	• OpenAI Gymnasium `MountainCar-v0` • Value Function Approximation (VFA) • On-Policy Function Approximation • Semi-gradient SARSA • Limitations of Off-Policy Function Approximation
Oct. 20, 2025	Deep Q-Networks	Quiz 5	• OpenAI Gymnasium `ALE/Breakout-v5` • Multi-Layer Perceptrons (MLPs) • Convolutional Neural Networks (CNNs) • Experience Replay • Fixed Targets • Vanilla Deep Q-Network
Oct. 27, 2025	Policy Gradients	Quiz 6	• OpenAI Gymnasium `CartPole-v1` • Policy Gradient Theorem • Vanilla Policy Gradient
Nov. 3, 2025	Advanced Policy Gradients	Quiz 7	• OpenAI Gymnasium `HalfCheetah-v5` • Trust Region Policy Optimization (TRPO) • Proximal Policy Optimization: KL-Divergence • Proximal Policy Optimization: Clip
Nov. 10, 2025	Monte Carlo Tree Search	Exam 2	• OpenAI Gymnasium `CartPole-v0` • Model-based Reinforcement Learning • Monte Carlo Tree Search • AlphaGo • MuZero
Nov. 17, 2025	Conclusion		• Advanced Topics in Reinforcement Learning • Identify the Reinforcement Learning Application • Outlook of Reinforcement Learning
Nov. 24, 2025	Thanksgiving Break
Dec. 1, 2025	Final Project Submission

Prerequisites

DATS 6101 - Introduction to Data Science

Assignments & Grading

Assignment	Points
Quizzes (5 best scores)	25
Exam 1	25
Exam 2	25
Final Project	25
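
As a worked illustration of the table above, the sketch below computes a course total out of 100. It rests on assumptions not stated in the syllabus: every score is entered as a percentage (0-100), the quiz component is the average of the five best quiz scores, and a missing quiz counts as zero. The function name `course_score` is hypothetical.

```python
def course_score(quizzes, exam1, exam2, project, best_n=5):
    """Weighted course total out of 100; every input is a percentage (0-100)."""
    # Keep only the best_n quiz scores; fewer than best_n quizzes means the
    # missing ones effectively count as zero (an assumption of this sketch).
    best = sorted(quizzes, reverse=True)[:best_n]
    quiz_avg = sum(best) / best_n
    # Each of the four components is worth 25 of the 100 points.
    return 0.25 * (quiz_avg + exam1 + exam2 + project)


# Seven quizzes taken; the two lowest (70 and 60) are dropped.
total = course_score([90, 80, 70, 100, 60, 85, 95], exam1=80, exam2=90, project=100)
print(total)  # 90.0
```

In this example the five best quizzes average 90, so the total is 0.25 × (90 + 80 + 90 + 100) = 90.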

Average Learning Per Week

Students are expected to spend a minimum of 100 minutes on out-of-class work for every 50 minutes of direct instruction, for a minimum total of 2.5 hours a week per credit. A 3-credit course should therefore include 2.5 hours of direct instruction and a minimum of 5 hours of independent learning, or 7.5 hours per week in total.

Online Resources

For technical requirements and support, student services, obtaining a GWorld card, and state contact information, please check HERE.

Classroom Recording

Recordings of particular class sessions will be made available to registered students on an individual basis, upon request. Please let me know in advance if you have any medical issues or emergencies that will prevent you from joining the class.

Virtual Academic Support

Writing and research consultations are available online. See HERE. Coaching, offered through the Office of Student Success, is available in a virtual format. See HERE. Academic Commons offers several short videos addressing different virtual learning strategies for the unique circumstances of the fall 2020 semester. See HERE. They also offer a variety of live virtual workshops to equip students with the tools they need to succeed in a virtual environment. See HERE.

Safety and Security

In an emergency: call GWPD at 202-994-6111 or 911. For situation-specific actions: review the Emergency Response Handbook HERE. In an active violence situation: Get Out, Hide Out, or Take Out. See HERE. Stay informed: safety.gwu.edu/stay-informed.