Crowdly

How does Q-Learning differ from SARSA in TD control?

SARSA requires a model of the environment, while Q-Learning does not

Q-Learning updates only at the end of an episode, while SARSA updates at each step

Q-Learning is on-policy, while SARSA is off-policy

SARSA updates the Q-value using the actual action taken, while Q-Learning updates using the maximum action-value

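The contrast named in the last option can be made concrete with a minimal tabular sketch. The code below is an illustration under stated assumptions, not a reference implementation: the Q-table is a plain dictionary keyed by (state, action), and the names `sarsa_target`, `q_learning_target`, `td_update`, the toy states and actions, and the values of `alpha` and `gamma` are all hypothetical. Both methods apply the same one-step TD update and differ only in the bootstrap target: SARSA uses the value of the action actually taken in the next state, while Q-Learning uses the maximum action-value in the next state.

```python
from typing import Dict, Hashable, Iterable, Tuple

State = Hashable
Action = Hashable
QTable = Dict[Tuple[State, Action], float]


def sarsa_target(q: QTable, reward: float, next_state: State,
                 next_action: Action, gamma: float) -> float:
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * q[(next_state, next_action)]


def q_learning_target(q: QTable, reward: float, next_state: State,
                      actions: Iterable[Action], gamma: float) -> float:
    """Off-policy target: bootstrap from the greedy (maximum) action-value."""
    return reward + gamma * max(q[(next_state, a)] for a in actions)


def td_update(q: QTable, state: State, action: Action,
              target: float, alpha: float) -> None:
    """Shared one-step TD update: move Q(s, a) toward the target."""
    q[(state, action)] += alpha * (target - q[(state, action)])


# Toy example (all values are made up for illustration).
actions = ["left", "right"]
q_sarsa: QTable = {
    ("s", "left"): 0.0, ("s", "right"): 0.0,
    ("s2", "left"): 1.0, ("s2", "right"): 5.0,
}
q_qlearn: QTable = dict(q_sarsa)

reward, gamma, alpha = 1.0, 0.9, 0.5

# Suppose the behaviour policy explored and took "left" in s2.
t_sarsa = sarsa_target(q_sarsa, reward, "s2", "left", gamma)
t_qlearn = q_learning_target(q_qlearn, reward, "s2", actions, gamma)

td_update(q_sarsa, "s", "right", t_sarsa, alpha)
td_update(q_qlearn, "s", "right", t_qlearn, alpha)

print("SARSA target:", t_sarsa, "Q-Learning target:", t_qlearn)
```

In this toy setup the SARSA target is 1 + 0.9 × 1 = 1.9, because the exploratory action "left" was taken, whereas the Q-Learning target is 1 + 0.9 × 5 = 5.5, because it ignores the action taken and bootstraps from the best available one. Both updates happen at every step and neither needs a model of the environment.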
