What is the reward hypothesis?
Always take the action that gives you the best reward at that point.
That all of what we mean by goals and purposes can be well thought of as the maximization of the expected value of the cumulative sum of a received scalar signal (called reward).
Ignore rewards and find other signals.
That all of what we mean by goals and purposes can be well thought of as the minimization of the expected value of the cumulative sum of a received scalar signal (called reward).
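To make the phrase "expected value of the cumulative sum of a received scalar signal" concrete, here is a minimal Python sketch. The function names, the example reward sequences, and the optional discount factor `gamma` are illustrative assumptions, not part of the question. It contrasts grabbing the best immediate reward with maximizing the cumulative sum.

```python
from statistics import mean

def cumulative_reward(rewards, gamma=1.0):
    """Sum of rewards along one trajectory.

    gamma is an assumed, optional discount factor; gamma=1.0 gives the plain sum.
    """
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def expected_return(trajectories, gamma=1.0):
    """Estimate the expected cumulative reward by averaging over sampled trajectories."""
    return mean(cumulative_reward(rs, gamma) for rs in trajectories)

# Hypothetical reward sequences for two behaviours.
greedy_path = [5, 0, 0]    # best reward at the first step, nothing afterwards
patient_path = [1, 1, 10]  # smaller rewards first, larger payoff later

print(cumulative_reward(greedy_path))   # 5.0
print(cumulative_reward(patient_path))  # 12.0
```

In this toy case the greedy sequence wins on the first step but loses on the cumulative sum, which is the quantity the hypothesis refers to.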