Add to Chrome
✅ The verified answer to this question is available below. Our community-reviewed solutions help you understand the material better.
In reinforcement learning, a policy that results in the maximum cumulative reward is called:
The stochastic policy
The optimal policy
The greedy policy
The deterministic policy
Get Unlimited Answers To Exam Questions - Install Crowdly Extension Now!