Final Project Overview
Instructions
The instructions for the Final Project are summarized below.
Using the Multi-Armed Bandit algorithms learned in Lecture 3, your task is to:
- Define the Reinforcement Learning Framework: Formulate a Multi-Armed Bandit problem.
  - Define the data.
  - Specify the action space \(\mathcal{A}\).
  - Specify the reward structure \(R\).
- Define the model: Implement at least two Reinforcement Learning algorithms to solve the problem (a minimal \(\epsilon\)-Greedy sketch is given after this list).
- Define the metrics: To evaluate performance, we will use:
  - Cumulative reward: Total return over a time horizon.
  - (Optional) Regret: The difference between the reward of the best fixed arm and the reward obtained by the algorithm.
  - (Optional) Stability score: Standard deviation of the reward over time.
  - (Optional) Adaptability: Performance when the environment shifts.
- Effectively communicate your findings in CV style:
Accomplished [X] as measured by [Y], by doing [Z].
Example: Improved asset allocation strategy stability as measured by lower reward variance across trials, by tuning \(\epsilon\) in an \(\epsilon\)-Greedy policy.
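As a starting point for the bandit track, here is a minimal, self-contained sketch of an \(\epsilon\)-Greedy agent on an assumed toy problem with five Gaussian arms; the arm means, \(\epsilon\), and horizon are illustrative placeholders, not part of the assignment. It also reports cumulative reward and the (optional) regret against the best fixed arm, \(\text{Regret}(T) = T\,\mu^* - \sum_{t=1}^{T} r_t\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: five arms with fixed Gaussian reward means (illustrative only).
true_means = np.array([0.1, 0.3, 0.5, 0.2, 0.4])
n_arms, horizon, epsilon = len(true_means), 10_000, 0.1

q_estimates = np.zeros(n_arms)   # running estimate of each arm's mean reward
pull_counts = np.zeros(n_arms)   # number of times each arm has been pulled
rewards = np.zeros(horizon)

for t in range(horizon):
    # epsilon-Greedy: explore a random arm with probability epsilon, otherwise exploit.
    if rng.random() < epsilon:
        arm = int(rng.integers(n_arms))
    else:
        arm = int(np.argmax(q_estimates))

    reward = rng.normal(true_means[arm], 1.0)
    rewards[t] = reward

    # Incremental sample-mean update for the chosen arm.
    pull_counts[arm] += 1
    q_estimates[arm] += (reward - q_estimates[arm]) / pull_counts[arm]

cumulative_reward = rewards.sum()
regret = horizon * true_means.max() - cumulative_reward  # regret vs. the best fixed arm
print(f"Cumulative reward: {cumulative_reward:.1f}, regret: {regret:.1f}")
```

Running the same loop with a second policy (for example, a different \(\epsilon\) schedule) over the same horizon gives the two-algorithm comparison the task asks for.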
Using a Classical or Deep Reinforcement Learning algorithm learned in Lectures 5-11, your task is to:
- Define the Reinforcement Learning Framework: Formulate a Markov Decision Process (MDP).
  - Define the environment dynamics \(P(s', r \mid s, a)\).
  - Define the state space \(\mathcal{S}\).
  - Define the action space \(\mathcal{A}\).
  - Define the reward function \(R\).
  - Define the episode structure.
- Define the model: Implement at least one of the following Reinforcement Learning algorithms to solve the problem (a tabular Q-Learning sketch is given after this list):
  - On-Policy Monte Carlo
  - Off-Policy Monte Carlo
  - SARSA
  - Q-Learning
  - Double Q-Learning
  - n-step Bootstrapping
  - Semi-Gradient SARSA
  - Deep Q-Network (DQN)
  - Vanilla Policy Gradient (VPG)
  - Proximal Policy Optimization (PPO)
  - Monte Carlo Tree Search (MCTS)
- Define the metrics: To evaluate performance, we will use:
  - Cumulative reward: Total return over a time horizon.
  - (Optional) Stability score: Standard deviation of the reward over time.
- Effectively communicate your findings in CV style:
Accomplished [X] as measured by [Y], by doing [Z].
Example: Improved asset allocation strategy stability as measured by lower reward variance across trials, by tuning \(\epsilon\) in an \(\epsilon\)-Greedy policy.
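As a starting point for the MDP track, here is a minimal, self-contained sketch of tabular Q-Learning on an assumed toy chain environment; the environment, \(\alpha\), \(\gamma\), \(\epsilon\), and the episode count are illustrative placeholders, not part of the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy MDP: a chain of six states; the agent starts in state 0 and the
# episode ends with reward +1 when it reaches the rightmost state (illustrative only).
n_states, n_actions = 6, 2                # actions: 0 = left, 1 = right
alpha, gamma, epsilon = 0.1, 0.99, 0.1
n_episodes = 500

def step(state, action):
    """One transition of the assumed chain environment."""
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done

Q = np.zeros((n_states, n_actions))
returns = []

for _ in range(n_episodes):
    state, done, episode_return = 0, False, 0.0
    while not done:
        # epsilon-Greedy behavior policy with random tie-breaking.
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(rng.choice(np.flatnonzero(Q[state] == Q[state].max())))
        next_state, reward, done = step(state, action)
        # Q-Learning bootstraps from the greedy value of the next state.
        target = reward + (0.0 if done else gamma * np.max(Q[next_state]))
        Q[state, action] += alpha * (target - Q[state, action])
        episode_return += reward
        state = next_state
    returns.append(episode_return)

print("Mean return over the last 100 episodes:", np.mean(returns[-100:]))
```

Replacing the bootstrapped target with the value of the action actually taken in the next state turns this into SARSA, one of the other listed options.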
Presentation
Your team must clearly present all of the following points:
- Define the Reinforcement Learning Framework.
- Define the model.
- Define the metrics.
- Effectively communicate your findings.
- Conclusion.
Use clear and concise slides with visuals (e.g., charts, diagrams) to support your points. Ensure the presentation is structured logically and is easy to follow.
Here is an example from Capstone Group 11.
Submission
Each team must submit each of the following prior to the deadline:
- Code (.zip): A fully functional implementation of your Reinforcement Learning algorithms, with main.py, model.py, and env.py (a sketch of one possible layout is given at the end of this section).
- Report (.pdf): A detailed report summarizing your approach, findings, and conclusions.
- Presentation (.pdf): A concise and clear presentation of your work, including visuals and key takeaways.
Late submissions will incur a \(10\%\) penalty for the team.
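One possible (but not required) way to split responsibilities across the three code files is sketched below: env.py owns the environment dynamics, model.py the learning algorithm, and main.py the training loop and metric logging. The toy environment, agent, and interfaces here are illustrative assumptions, not a prescribed API.

```python
import random

# Illustrative sketch of one way the submitted files could be organized; the
# class names and interfaces below are assumptions, not a required structure.

# --- env.py: environment exposing reset() and step(action) -> (state, reward, done).
class CoinFlipEnv:
    """Toy single-step environment, included only to make the sketch runnable."""
    def reset(self):
        return 0                      # a single dummy state

    def step(self, action):
        reward = 1.0 if random.random() < 0.5 else 0.0
        return 0, reward, True        # episode ends after one step

# --- model.py: the learning algorithm (here a trivial placeholder agent).
class RandomAgent:
    def act(self, state):
        return random.randint(0, 1)

# --- main.py: training loop tying the environment, agent, and metrics together.
if __name__ == "__main__":
    env, agent, rewards = CoinFlipEnv(), RandomAgent(), []
    for _ in range(100):
        state, done, episode_return = env.reset(), False, 0.0
        while not done:
            state, reward, done = env.step(agent.act(state))
            episode_return += reward
        rewards.append(episode_return)
    print("Cumulative reward over 100 episodes:", sum(rewards))
```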
Grading
The grading criteria are summarized in the following table:
| Criteria | Description | Weight |
|---|---|---|
| Problem Formulation | Clear definition of the Reinforcement Learning framework. | 20% |
| Algorithm Implementation | Correct and efficient implementation of at least two Reinforcement Learning algorithms. | 30% |
| Performance Evaluation | Use of appropriate metrics to evaluate the algorithms’ performance. | 20% |
| Communication of Findings | Clear and concise presentation of results, including visuals and insights. | 20% |
| Code Quality and Documentation | Well-structured, readable, and documented code. | 10% |
(Optional) Capstone or Research
If you or your team would like to discuss this project or potential research opportunities in more detail, feel free to reach out to me at twallett@gwu.edu.