Now choose 'LSTM' as model type and load the preset LSTM Simple (with forget).
Train for 20 epochs with a learning rate of 0.01. Observe how the training and validation accuracies develop over the epochs and note the final test accuracy. Also inspect the gradient magnitude plot (at initialization and after training, batches to sample = 10).
Do the same for the LSTM Simple (no forget) preset (20 epochs, lr = 0.01).
Elaborate on the differences between the two models. How does disabling the forget gate affect performance and the gradient magnitudes (initialization and after training)?
Attach both plots here.
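As background for the comparison, the following is a minimal sketch of how the forget gate scales the gradient along the direct cell-state path. It assumes the standard LSTM update c_t = f_t * c_{t-1} + i_t * g_t, so that the direct-path derivative of c_t with respect to c_{t-1} is f_t. The function name `cell_path_gradient`, the random N(0, 1) gate pre-activations, and the 30 time steps are hypothetical choices for illustration. The sketch ignores the paths through the hidden state and does not reproduce the tool's gradient magnitude plot.

```python
import math
import random


def sigmoid(x):
    """Logistic function used for the gate activation."""
    return 1.0 / (1.0 + math.exp(-x))


def cell_path_gradient(num_steps, use_forget_gate, seed=0):
    """Gradient magnitude along the direct cell-state path, step by step.

    Assumption: only the path c_t -> c_{t-1} is followed, where the
    local derivative equals the forget gate value f_t.
    """
    rng = random.Random(seed)
    grad = 1.0
    magnitudes = [grad]
    for _ in range(num_steps):
        if use_forget_gate:
            # Hypothetical gate pre-activation drawn from N(0, 1).
            f = sigmoid(rng.gauss(0.0, 1.0))
        else:
            # No forget gate: the cell state is carried over unchanged.
            f = 1.0
        grad *= f
        magnitudes.append(grad)
    return magnitudes


with_forget = cell_path_gradient(30, use_forget_gate=True)
no_forget = cell_path_gradient(30, use_forget_gate=False)

print("with forget, after 30 steps:", with_forget[-1])
print("no forget,   after 30 steps:", no_forget[-1])
```

In this toy setting the product of forget gate values shrinks with every step, while the variant without a forget gate keeps the direct-path gradient at 1. Whether the trained models show the same pattern is what the two gradient magnitude plots are meant to reveal.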