References

Acero, Fernando, Parisa Zehtabi, Nicolas Marchesotti, Michael Cashmore, Daniele Magazzeni, and Manuela Veloso. 2024. “Deep Reinforcement Learning and Mean-Variance Strategies for Responsible Portfolio Optimization.” https://arxiv.org/abs/2403.16667.
Barto, Andrew G., Richard S. Sutton, and Charles W. Anderson. 1983. “Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems.” Technical Report, Institute for Cybernetic Studies, University of Massachusetts. https://psycnet.apa.org/record/1984-25798-001.
Bellemare, Marc G., Yavar Naddaf, Joel Veness, and Michael Bowling. 2013. “The Arcade Learning Environment: An Evaluation Platform for General Agents.” Journal of Artificial Intelligence Research 47: 253–79.
Bertsekas, Dimitri P., and John N. Tsitsiklis. 2008. Introduction to Probability. 2nd ed. Belmont, MA: Athena Scientific.
Brockman, Greg, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. “OpenAI Gym.” https://arxiv.org/abs/1606.01540.
Brunskill, Emma. 2022. “CS234: Reinforcement Learning - Lecture 1.” Course Lecture Slides, Stanford University. https://web.stanford.edu/class/cs234/slides/lecture1pre.pdf.
Fawzi, Alhussein, Matej Balog, Aja Huang, et al. 2022. “Discovering Faster Matrix Multiplication Algorithms with Reinforcement Learning.” Nature 610: 47–53. https://doi.org/10.1038/s41586-022-05172-4.
Hagan, Martin T., Howard B. Demuth, Mark H. Beale, and Orlando De Jesús. 2014. Neural Network Design. 2nd ed. Martin Hagan. https://hagan.okstate.edu/NNDesign.pdf.
Hammack, Richard H. 2013. Book of Proof. Richard Hammack.
Ie, Eugene, Chih-wei Hsu, Martin Mladenov, Vihan Jain, Sanmit Narvekar, Jing Wang, Rui Wu, and Craig Boutilier. 2019. “RecSim: A Configurable Simulation Platform for Recommender Systems.” https://arxiv.org/abs/1909.04847.
Levine, Sergey. 2019. “Introduction to Deep Reinforcement Learning.” Course Lecture Slides, Deep RL Course, UC Berkeley. https://rail.eecs.berkeley.edu/deeprlcourse/deeprlcourse/static/slides/lec-1.pdf.
Li, Lihong, Wei Chu, John Langford, and Robert E. Schapire. 2010. “A Contextual-Bandit Approach to Personalized News Article Recommendation.” In Proceedings of the 19th International Conference on World Wide Web, 661–70. ACM.
Hagan, Martin T., and Amir Jafari. 2024. “NNDesignDeepLearning.” https://github.com/NNDesignDeepLearning/NNDesignDeepLearning.
Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. “Playing Atari with Deep Reinforcement Learning.” https://arxiv.org/abs/1312.5602.
Moore, Andrew William. 1990. “Efficient Memory-Based Learning for Robot Control.” University of Cambridge.
Ouyang, Long, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, et al. 2022. “Training Language Models to Follow Instructions with Human Feedback.” https://arxiv.org/abs/2203.02155.
Sanz-Cruzado, Javier, Nikolaos Droukas, and Richard McCreadie. 2024. “FAR-Trans: An Investment Dataset for Financial Asset Recommendation.” https://arxiv.org/abs/2407.08692.
Sutton, Richard S., and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. 2nd ed. Cambridge, MA: MIT Press.
Todorov, Emanuel, Tom Erez, and Yuval Tassa. 2012. “MuJoCo: A Physics Engine for Model-Based Control.” In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). http://www.mujoco.org/.