Homework 9

Homework for Lecture 9: Policy Gradients 📝

Instructions:

- Show ALL work, neatly and in order.
- No credit for answers without work.
- Submit a single PDF file including all solutions.
- DO NOT submit individual files or images.
- For coding questions, submit ONE .py file with comments.

Note

For this homework, you only need gymnasium, numpy, tensorflow, os, pickle, tqdm, and tensorboard.

Coding Exercise 1: Vanilla Policy Gradient

For the CartPole-v1 environment, implement the update function of the Vanilla Policy Gradient algorithm using the provided parameters.
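
As a rough illustration of what such an update computes, here is a minimal NumPy-only sketch of one Vanilla Policy Gradient step on a single episode. It is not the provided starter code: the linear softmax policy, the function names (discounted_returns, softmax, vpg_update), and the defaults gamma=0.99 and lr=0.01 are all assumptions made for this example. The actual exercise presumably uses a TensorFlow model and the parameters supplied with the assignment.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Reward-to-go G_t = r_t + gamma * G_{t+1} for one episode."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def vpg_update(theta, states, actions, rewards, gamma=0.99, lr=0.01):
    """One gradient-ascent step on sum_t log pi(a_t | s_t) * G_t.

    theta has shape (obs_dim, n_actions) and parameterizes a linear
    softmax policy (an assumption made for this sketch only).
    """
    states = np.asarray(states, dtype=np.float64)    # (T, obs_dim)
    actions = np.asarray(actions, dtype=np.int64)    # (T,)
    returns = discounted_returns(rewards, gamma)     # (T,)

    probs = softmax(states @ theta)                  # (T, n_actions)
    one_hot = np.eye(theta.shape[1])[actions]        # (T, n_actions)

    # For a softmax policy, d log pi(a|s) / d logits = one_hot(a) - probs.
    grad_logits = (one_hot - probs) * returns[:, None]
    grad_theta = states.T @ grad_logits              # (obs_dim, n_actions)

    # Ascent step: increase the return-weighted log-likelihood.
    return theta + lr * grad_theta
```

In a TensorFlow version, the same objective is usually written as a loss (the negative of the return-weighted log-probabilities), with the gradient obtained by automatic differentiation rather than the closed-form expression used here.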