6.5 (Optional) n-step Bootstrapping

What if you waited a little longer before updating — not too soon, not too late? ⏱️

Note: n-step TD Prediction

n-step bootstrapping is a family of learning rules that combines Monte Carlo and Temporal Difference ideas, with the number of steps n controlling how far each update looks ahead.

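  • Like Monte Carlo, n-step methods use the rewards actually observed over several consecutive steps of experience.
  • Like Temporal Difference, n-step methods bootstrap: after n steps, the rest of the return is replaced by the current value estimate of the state reached.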
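To make the two bullets concrete, here is a minimal sketch of the n-step return, G = R1 + γ·R2 + ... + γ^(n-1)·Rn + γ^n·V(S_n), and the value update built on it. The function names, the dictionary used for V, and the step size alpha are illustrative assumptions, not part of the course material.

```python
def n_step_return(rewards, bootstrap_value, gamma):
    """Discounted sum of n observed rewards plus the discounted bootstrap value.

    rewards: the n rewards observed after leaving the state being updated.
    bootstrap_value: current estimate V(S_n) of the state reached after n steps
                     (use 0.0 if the episode ended within those n steps).
    """
    g = bootstrap_value
    # Fold from the last reward backwards: g <- r + gamma * g
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def n_step_td_update(V, state, rewards, state_after_n, alpha, gamma):
    """Move V[state] toward the n-step return (V is a hypothetical dict of estimates)."""
    bootstrap = 0.0 if state_after_n is None else V[state_after_n]
    target = n_step_return(rewards, bootstrap, gamma)
    V[state] += alpha * (target - V[state])
    return target


# Example: three observed rewards of 1, then bootstrap from V["s3"] = 10
V = {"s0": 0.0, "s3": 10.0}
target = n_step_td_update(V, "s0", [1.0, 1.0, 1.0], "s3", alpha=0.5, gamma=0.5)
```

With a single reward (n = 1) this reduces to the one-step TD target; with all rewards up to the end of the episode and a bootstrap value of 0 it becomes the Monte Carlo return.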

Note: n-step SARSA

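n-step SARSA extends the standard SARSA algorithm to multi-step returns. Instead of updating from a single transition, it accumulates the rewards of n steps and then bootstraps from the action value of the state-action pair reached. Larger n relies less on the current estimates (lower bias) but sums more random rewards (higher variance), so n sets the balance between the two.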
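As a rough sketch of that update, the snippet below forms the n-step SARSA target from n observed rewards and the estimate Q(S_n, A_n), then nudges Q(S_0, A_0) toward it. The dictionary keyed by (state, action) tuples and the name n_step_sarsa_update are assumptions made for illustration.

```python
def n_step_sarsa_update(Q, start, rewards, end, alpha, gamma):
    """One n-step SARSA update on a tabular Q.

    Q: dict mapping (state, action) -> estimated action value (hypothetical layout).
    start: the (state, action) pair being updated.
    rewards: the n rewards observed after taking that action.
    end: the (state, action) pair reached after n steps, chosen by the same
         policy that generated the data; None if the episode terminated.
    """
    # Bootstrap from the action value at the end of the n-step window
    g = 0.0 if end is None else Q[end]
    for r in reversed(rewards):
        g = r + gamma * g
    Q[start] += alpha * (g - Q[start])
    return g


# Example: two steps of reward, then bootstrap from Q[("s2", "b")] = 4
Q = {("s0", "a"): 0.0, ("s2", "b"): 4.0}
target = n_step_sarsa_update(Q, ("s0", "a"), [1.0, 2.0], ("s2", "b"), alpha=0.5, gamma=0.5)
```

Setting n = 1 (a single reward) recovers the ordinary SARSA target R + γ·Q(S', A').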

Note: n-step Tree Backup

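Like Q-learning, n-step Tree Backup is an off-policy method: the actions actually taken do not have to come from the policy being learned, and no importance sampling is needed. It generalizes Expected SARSA to n steps. At each step along the sampled path, the actions that were not taken contribute their current value estimates, weighted by their probability under the target policy, while the action that was taken passes on the return from the next step with the same weighting.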
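The recursive structure can be sketched as follows. The innermost step uses the Expected SARSA target, and each earlier step mixes the estimates of the untaken actions with the return flowing back through the taken one. The nested-dictionary layout for Q and pi, and the function name, are assumptions for illustration only.

```python
def tree_backup_return(rewards, states, actions, Q, pi, gamma):
    """n-step tree-backup return for the pair visited just before rewards[0].

    rewards[k]: reward received on step k + 1 of the window (n entries).
    states[k]:  state reached on step k + 1 (n entries; last may be None if terminal).
    actions[k]: action actually taken in states[k] (n - 1 entries).
    Q[s][a]:    current action-value estimates (hypothetical nested dicts).
    pi[s][a]:   target-policy probabilities (hypothetical nested dicts).
    """
    last = states[-1]
    # Innermost step: Expected SARSA target at the end of the window
    g = rewards[-1]
    if last is not None:
        g += gamma * sum(pi[last][a] * Q[last][a] for a in Q[last])
    # Walk backwards through the earlier steps of the window
    for k in range(len(rewards) - 2, -1, -1):
        s, taken = states[k], actions[k]
        # Untaken actions contribute their estimates, weighted by the target policy
        untaken = sum(pi[s][a] * Q[s][a] for a in Q[s] if a != taken)
        # The taken action contributes the return from the next step
        g = rewards[k] + gamma * (untaken + pi[s][taken] * g)
    return g


Q = {"s1": {"l": 2.0, "r": 4.0}, "s2": {"l": 0.0, "r": 10.0}}
pi = {"s1": {"l": 0.5, "r": 0.5}, "s2": {"l": 0.5, "r": 0.5}}
one_step = tree_backup_return([1.0], ["s1"], [], Q, pi, gamma=1.0)
two_step = tree_backup_return([1.0, 2.0], ["s1", "s2"], ["r"], Q, pi, gamma=1.0)
```

With a one-step window the function returns exactly the Expected SARSA target, which is the sense in which Tree Backup generalizes it.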