In multi-step TD methods, what does the "return" G(t) represent when using n-step bootstrapping?
The sum of rewards from step t to the end of the episode
The current estimated value of the state
The maximum Q-value over all actions
The discounted sum of the next n rewards and the estimated value of the nth state
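The last option matches the usual definition of the n-step return: G = R(t+1) + gamma * R(t+2) + ... + gamma^(n-1) * R(t+n) + gamma^n * V(S(t+n)). A minimal sketch of that calculation follows. The function name, the list-based episode layout, and the indexing convention (rewards[k] holds the reward received after leaving state k) are assumptions made for illustration, not part of the question.

```python
def n_step_return(rewards, values, t, n, gamma):
    """Hypothetical helper: n-step return starting at time t.

    Assumed layout: rewards[k] is the reward received after leaving
    state k, and values[k] is the current value estimate of state k.
    The episode ends after len(rewards) steps.
    """
    T = len(rewards)        # episode length
    end = min(t + n, T)     # truncate if the episode ends within n steps

    # Discounted sum of the next (up to) n rewards.
    g = 0.0
    for k in range(t, end):
        g += gamma ** (k - t) * rewards[k]

    # Bootstrap from the value estimate of the state reached after n steps,
    # unless that state is terminal (its value is taken to be 0).
    if end < T:
        g += gamma ** (end - t) * values[end]
    return g


rewards = [1.0, 2.0, 3.0, 4.0]
values = [10.0, 20.0, 30.0, 40.0]
print(n_step_return(rewards, values, t=0, n=2, gamma=0.5))
```

With n = 1 this reduces to the one-step TD target, and once t + n reaches the end of the episode it becomes the full sum of discounted rewards with no bootstrapped term.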