7.3 On-Policy Function Approximation

Limits of Off-Policy Approximation

Convergence of control algorithms:

Algorithm Tabular Linear Neural Networks
Monte-Carlo Control (✅)
SARSA (✅)
Q-learning
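
To make the SARSA entry concrete for the linear case, here is a minimal sketch of one semi-gradient SARSA update with a linear action-value function q(s, a) = w · x(s, a). The function names (`q_value`, `sarsa_update`), the list-based feature vectors, and the default step size and discount are assumptions made for illustration; they do not come from the notes above.

```python
from typing import List


def q_value(weights: List[float], features: List[float]) -> float:
    """Linear action-value estimate: q(s, a) = w . x(s, a)."""
    return sum(w * x for w, x in zip(weights, features))


def sarsa_update(
    weights: List[float],
    features: List[float],       # x(s, a) for the action actually taken
    reward: float,
    next_features: List[float],  # x(s', a') for the action the policy picks next
    alpha: float = 0.1,          # step size (assumed value)
    gamma: float = 0.9,          # discount factor (assumed value)
    terminal: bool = False,
) -> List[float]:
    """Return the weights after one semi-gradient SARSA step."""
    # On-policy target: bootstrap from the next action the policy really takes.
    next_q = 0.0 if terminal else q_value(weights, next_features)
    td_error = reward + gamma * next_q - q_value(weights, features)
    # For a linear q, the gradient with respect to w is the feature vector.
    return [w + alpha * td_error * x for w, x in zip(weights, features)]


if __name__ == "__main__":
    w = sarsa_update([0.0, 0.0], [1.0, 0.0], 1.0, [0.0, 1.0], alpha=0.5)
    print(w)
```

The only difference from Q-learning in this sketch is the target: SARSA bootstraps from the action the current policy selects in the next state, whereas Q-learning would bootstrap from the maximum over next actions.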