DATS 6450 – Reinforcement Learning
Lecture 1: Introduction
1.1 Why should I study Reinforcement Learning?
1.2 What is Reinforcement Learning?
1.3 Where is Reinforcement Learning Applied?
1.4 How is Reinforcement Learning Structured?
Lecture 2: Mathematical Foundations
Learning Objectives
2.1 Set Theory
2.2 Axiomatic Probability
2.3 Conditioning
2.4 Independence
2.5 Discrete Random Variables
2.6 Continuous Random Variables
2.7 Probability Distributions
Lecture 3: Multi-Armed Bandits
Learning Objectives
3.1 Multi-Armed Bandit Framework
3.2 ε-Greedy
3.3 Upper Confidence Bound (UCB)
3.4 Thompson Sampling
Lecture 4: Dynamic Programming
Learning Objectives
4.1 Markov Chain
4.2 Markov Decision Processes (MDPs)
4.3 Dynamic Programming
Lecture 5: Monte Carlo
Learning Objectives
5.1 Monte Carlo Prediction
5.2 Exploring Starts Monte Carlo
5.3 On-Policy Monte Carlo
5.4 Off-Policy Monte Carlo
Lecture 6: Temporal Difference
Learning Objectives
6.1 Temporal Difference (TD) Prediction
6.2 SARSA
6.3 Q-Learning
6.4 Double Q-Learning
Lecture 7: Function Approximation
Learning Objectives
7.1 Value Function Approximation
7.2 On-Policy Function Approximation
7.3 Off-Policy Function Approximation
Lecture 8: Deep Q-Networks (DQN)
Learning Objectives
8.1 Deep Learning
8.2 Deep Q-Networks
Lecture 9: Policy Gradients
Learning Objectives
9.1 Policy Gradients
Lecture 10: Advanced Policy Gradients
Learning Objectives
10.1 Trust Region Policy Optimization (TRPO)
10.2 Proximal Policy Optimization (PPO)
Lecture 11: Monte Carlo Tree Search
Learning Objectives
11.1 Monte Carlo Tree Search (MCTS)
11.2 Advanced Monte Carlo Tree Search
Lecture 12: Conclusion
Learning Objectives
12.1 Advanced Topics in Reinforcement Learning
12.2 Identify the Reinforcement Learning Application
12.3 Outlook of Reinforcement Learning
Homeworks
Homework 1
Homework 2
Homework 3
Homework 4
Homework 5
Homework 6
Homework 7
Homework 8
Homework 9
Homework 10
Homework 11
Homework 12
References
Learning Objectives for Lecture 5: Monte Carlo 🎯
Monte Carlo Prediction.
On-Policy Monte Carlo.
Off-Policy Monte Carlo.
Homemade GridWorld OpenAI environment using gymnasium, pygame & numpy.
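To give a feel for what Monte Carlo prediction looks like in practice, here is a minimal sketch of first-visit Monte Carlo value estimation. It does not use the course's GridWorld environment or gymnasium; instead it assumes a hypothetical 1-D corridor of five cells with the goal at the right end, a uniformly random policy, a reward of -1 per step (0 on reaching the goal), and a discount factor of 0.9. All names (`step`, `generate_episode`, `first_visit_mc_prediction`) are illustrative choices, not taken from the lecture code.

```python
import random
from collections import defaultdict

# Hypothetical 1-D corridor: states 0..4, state 4 is the terminal goal.
N_STATES = 5
GOAL = N_STATES - 1
GAMMA = 0.9  # assumed discount factor


def step(state, action):
    """Move left (-1) or right (+1); walls clamp the position.

    Reward is -1 per step and 0 for the step that reaches the goal.
    """
    next_state = min(max(state + action, 0), GOAL)
    reward = 0.0 if next_state == GOAL else -1.0
    return next_state, reward, next_state == GOAL


def generate_episode(rng, start=0, max_steps=1000):
    """Roll out one episode under a uniformly random policy.

    Returns a list of (state, reward) pairs, one per time step.
    """
    episode, state = [], start
    for _ in range(max_steps):
        action = rng.choice((-1, 1))
        next_state, reward, done = step(state, action)
        episode.append((state, reward))
        state = next_state
        if done:
            break
    return episode


def first_visit_mc_prediction(num_episodes=2000, seed=0):
    """Estimate V(s) as the average return following the first visit to s."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(num_episodes):
        episode = generate_episode(rng)
        # Record the time step at which each state is first visited.
        first_visit = {}
        for t, (s, _) in enumerate(episode):
            first_visit.setdefault(s, t)
        # Walk the episode backwards, accumulating the discounted return.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            s, r = episode[t]
            g = r + GAMMA * g
            if first_visit[s] == t:
                returns_sum[s] += g
                returns_count[s] += 1
    return {s: returns_sum[s] / returns_count[s] for s in sorted(returns_sum)}


if __name__ == "__main__":
    for state, value in first_visit_mc_prediction().items():
        print(f"V({state}) = {value:.2f}")
```

Under these assumptions the estimates should become less negative as the state gets closer to the goal, since fewer -1 steps are expected before the episode ends.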
Taxonomy of Reinforcement Learning