Which of the following best describes Stochastic Gradient Descent (SGD)?
A) Parameters are updated without computing gradients, by directly adjusting weights toward the minimum.
B) Parameters are updated after computing the gradient on just one randomly selected training example at a time, leading to faster initial progress but noisy convergence that oscillates near the minimum.
C) Parameters are updated only once per epoch, after computing gradients on a mini-batch.
D) Parameters are updated after computing the gradient using the entire training dataset, leading to stable, deterministic convergence.

Answer: B
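The per-example update described in option B can be sketched in a few lines. This is a minimal illustration, not a library implementation: the dataset, learning rate, and single-weight model below are all made up for the example, and the model is a one-parameter linear fit `y_hat = w * x`.

```python
import random

# Illustrative noise-free dataset: y = 3 * x, so the optimal weight is w = 3.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0, 2.5]]

w = 0.0      # single weight; model: y_hat = w * x
lr = 0.1     # learning rate (assumed value for this sketch)
random.seed(0)

for step in range(200):
    x, y = random.choice(data)       # one randomly selected training example
    grad = 2 * (w * x - y) * x       # gradient of the squared error (w*x - y)^2
    w -= lr * grad                   # noisy update toward the minimum

print(round(w, 2))
```

Because each update uses only one example, each step is cheap and early progress is fast, but successive gradients disagree with one another, so the weight oscillates around the minimum rather than descending smoothly as full-batch gradient descent (option D) would.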