DATS 6450 – Reinforcement Learning

Authors

Tyler Wallett, MS

Amir Jafari, PhD

Published

June 6, 2025

Gridworld (Lectures 4-6)
MountainCar (Lecture 7)
Breakout (Lecture 8)
Highway (Lecture 8)
CartPole (Lecture 9)
Cheetah (Lecture 10)

Instructor Information

  • Name: Tyler Wallett
  • Term: Fall 2025
  • Class location: MON 114
  • Class hours: 06:10 PM - 08:40 PM
  • Office location: Samson Hall Room 310
  • Office hours: TBD
  • E-mail: twallett@gwu.edu
  • GitHub: twallett

Course Description

The aim of this course is to provide a comprehensive understanding of the reinforcement learning framework. The course will explore the key distinctions between reinforcement learning and other artificial intelligence learning paradigms, delve into relevant industry applications, and examine both classical and deep learning approaches. Additionally, the course will cover the taxonomy of reinforcement learning and offer hands-on experience through practical implementations using OpenAI Gymnasium and other learning environments.

The classical approach will focus on learning methods designed to find optimal solutions in tabular environments, whereas the deep learning approach will tackle the challenge of finding approximate optimal solutions in large or continuous environments through the use of deep learning architectures.
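
To make "tabular" concrete, here is a minimal sketch of a classical method (Q-learning, covered in the Temporal Difference lecture) in which every state-action pair has its own stored value. The five-state corridor, its reward, and the hyperparameters are hypothetical and chosen only for illustration; they are not taken from the course materials. The sketch uses only the Python standard library so that it runs anywhere.

```python
import random

# Hypothetical 5-state corridor: states 0..4, start at 0, goal at 4.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Deterministic transition; reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Tabular representation: one stored value per (state, action) pair.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties at random.
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action.
            future = 0.0 if done else gamma * max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            state = next_state
    return q


q_table = q_learning()
# Greedy action in each non-goal state.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(GOAL)]
```

A table like `q_table` stops being practical once the state space is large or continuous, which is where the deep learning approach replaces the table with a function approximator.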

The course will introduce the taxonomy of reinforcement learning by focusing on model-free value-based and policy-based methods. Model-based reinforcement learning will be covered briefly, as it is allocated only one lecture.

The course concludes with a discussion of advanced topics, applications, and the outlook of reinforcement learning.

Learning Outcomes

  1. Implement reinforcement learning frameworks using NumPy and TensorFlow.
  2. Design decision-making systems using classical and deep learning architectures.
  3. Explain the reinforcement learning taxonomy.
  4. Identify reinforcement learning’s challenges, current research, and future outlook.

Resources

  • Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (Web Link)
  • The Reinforcement Learning Course by Hugging Face (Web Link)
  • Spinning Up in Deep RL by OpenAI (Web Link)
  • OpenAI Gymnasium API documentation (Web Link)
  • TensorFlow Python API documentation (Web Link)

Software Requirements

  • Programming Language: Python.
pip install numpy tensorflow pygame gymnasium tqdm tensorboard
  • Cloud Services: Google Colab.

  • Version Control: GitHub.
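
After installing, a quick way to confirm the environment is ready is to check that each package can be located. The sketch below is one possible check, not an official course script. It assumes the import names match the pip package names, and it leaves out pickle because that module ships with Python's standard library. The helper name `check_packages` is hypothetical.

```python
import importlib.util

# Assumed import names for the course packages (same as the pip names).
REQUIRED = ("numpy", "tensorflow", "pygame", "gymnasium", "tqdm", "tensorboard")


def check_packages(names=REQUIRED):
    """Return {name: bool} telling whether each top-level package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

`find_spec` only locates a package without importing it, so the check stays fast even for heavy libraries such as TensorFlow.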

Course Outline

| Week | Topic | Quiz/Exams | Learning Objectives |
| Aug. 25, 2025 | Introduction to Reinforcement Learning | | • Why should I study Reinforcement Learning? • What is Reinforcement Learning? • Where is Reinforcement Learning Applied? • How is Reinforcement Learning Structured? |
| Sep. 1, 2025 | Labor Day | | |
| Sep. 8, 2025 | Mathematical Foundations | | • Set Theory • Axiomatic Probability • Conditioning • Independence • Random Variables • Expectation • Probability Distribution |
| Sep. 15, 2025 | Multi-Armed Bandits | Quiz 1 | • Multi-Armed Bandit Framework • \(\epsilon\)-Greedy • Upper Confidence Bound (UCB) • Thompson Sampling |
| Sep. 22, 2025 | Dynamic Programming | Quiz 2 | • OpenAI Gymnasium GridWorldEnv • Markov Chain • Markov Decision Processes (MDPs) • Dynamic Programming |
| Sep. 29, 2025 | Monte Carlo | Quiz 3 | • OpenAI Gymnasium GridWorldEnv • Monte Carlo Prediction • Exploring Starts Monte Carlo • On-Policy Monte Carlo • Off-Policy Monte Carlo |
| Oct. 6, 2025 | Temporal Difference | Quiz 4 | • OpenAI Gymnasium GridWorldEnv • Temporal Difference (TD) Prediction • SARSA • Q-Learning • Double Q-Learning • (Optional) n-step TD |
| Oct. 13, 2025 | Function Approximation | Exam 1 | • OpenAI Gymnasium MountainCar-v0 • Value Function Approximation (VFA) • On-Policy Function Approximation • Semi-gradient SARSA • Limitations of Off-Policy Function Approximation |
| Oct. 20, 2025 | Deep Q-Networks | Quiz 5 | • OpenAI Gymnasium ALE/Breakout-v5 • Multilayer Perceptrons (MLPs) • Convolutional Neural Networks (CNNs) • Experience Replay • Fixed Targets • Vanilla Deep Q-Network |
| Oct. 27, 2025 | Policy Gradients | Quiz 6 | • OpenAI Gymnasium CartPole-v1 • Policy Gradient Theorem • Vanilla Policy Gradient |
| Nov. 3, 2025 | Advanced Policy Gradients | Quiz 7 | • OpenAI Gymnasium HalfCheetah-v5 • Trust Region Policy Optimization (TRPO) • Proximal Policy Optimization: KL-Divergence • Proximal Policy Optimization: Clip |
| Nov. 10, 2025 | Monte Carlo Tree Search | Exam 2 | • OpenAI Gymnasium CartPole-v0 • Model-based Reinforcement Learning • Monte Carlo Tree Search • AlphaGo • MuZero |
| Nov. 17, 2025 | Conclusion | | • Advanced Topics in Reinforcement Learning • Identify the Reinforcement Learning Application • Outlook of Reinforcement Learning |
| Nov. 24, 2025 | Thanksgiving Break | | |
| Dec. 1, 2025 | Final Project Submission | | |

Prerequisites

  • DATS 6101 - Introduction to Data Science

Assignments & Grading

| Assignment | Points |
| Quizzes (5 best scores) | 25 |
| Exam 1 | 25 |
| Exam 2 | 25 |
| Final Project | 25 |
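
As a rough illustration of how the four components could combine into a 100-point total, consider the sketch below. It rests on assumptions that the syllabus does not state: every quiz, exam, and the project is scored on a 0-100 scale, the five best quiz scores are averaged, and a missing quiz counts as zero. The function name `course_total` is hypothetical.

```python
def course_total(quizzes, exam1, exam2, project, best_n=5):
    """Combine component scores (each assumed 0-100) into a 0-100 course total.

    Each component is worth 25 points, so each contributes a quarter of its
    percentage score. Only the `best_n` highest quiz scores are counted.
    """
    best = sorted(quizzes, reverse=True)[:best_n]
    # Assumption: dividing by best_n means a missing quiz counts as a zero.
    quiz_avg = sum(best) / best_n
    return 0.25 * (quiz_avg + exam1 + exam2 + project)


# Seven quizzes are scheduled; the two lowest scores (50 and 40) are dropped here.
example = course_total([100, 90, 80, 70, 60, 50, 40], exam1=80, exam2=90, project=100)
```

In this example the five best quizzes average 80, so the total is 0.25 × (80 + 80 + 90 + 100) = 87.5 points.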

Average Learning Per Week

Students are expected to spend a minimum of 100 minutes on out-of-class work for every 50 minutes of direct instruction, for a minimum total of 2.5 hours a week for each credit hour. A 3-credit course should therefore include 2.5 hours of direct instruction and a minimum of 5 hours of independent learning, for a total of 7.5 hours per week.

Online Resources

For technical requirements and support, student services, obtaining a GWorld card, and state contact information, please check HERE.

Classroom Recording

Recordings of individual class sessions will be made available to registered students on a case-by-case basis, upon request. Please let me know in advance if you have any medical issues or emergencies that will prevent you from joining the class.

Virtual Academic Support

Writing and research consultations are available online. See HERE. Coaching, offered through the Office of Student Success, is available in a virtual format. See HERE. Academic Commons offers several short videos addressing different virtual learning strategies for the unique circumstances of the fall 2020 semester. See HERE. They also offer a variety of live virtual workshops to equip students with the tools they need to succeed in a virtual environment. See HERE.

Safety and Security

In an emergency: call GWPD at 202-994-6111 or 911. For situation-specific actions: review the Emergency Response Handbook HERE. In an active violence situation: Get Out, Hide Out, or Take Out. See HERE. Stay informed: safety.gwu.edu/stay-informed.