365.221/2/4/6/7/8/9/30/56/67/81/325/326/348/349, UE Hands-on AI II, Rainer Dangl et al., 2026S (moodle.jku.at)
Now work with the LSTM - Simple (custom forget bias) preset.
Train two models:
Note: don't forget to click on 'Apply Changes' when you modify the preset in the architecture editor.
Train for 20 epochs with a learning rate of 0.01.
Examine the development of the loss/accuracy values over the epochs. How does the initial forget gate bias affect the training here?
Keep in mind - here we only set an initial bias, gradient computation is not disabled, thus in both cases the bias parameter will adapt during training.
In your analysis, also consider the gradient magnitude plots for both models. Include both plots here, showing the gradients at initialization and after training.
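As background for this exercise, here is a minimal sketch of how an initial forget-gate bias can be set on a PyTorch `nn.LSTM`. This is an illustration of the mechanism, not the tool's actual code; the layer sizes are placeholders. Note that `fill_` only changes the initial value — `requires_grad` stays `True`, so the bias still adapts during training, as the exercise points out.

```python
import torch
import torch.nn as nn

# In nn.LSTM the bias vectors stack the four gates as [b_i | b_f | b_g | b_o],
# so the forget-gate slice is the second quarter of each bias parameter.
def set_forget_bias(lstm: nn.LSTM, value: float) -> None:
    h = lstm.hidden_size
    for name, param in lstm.named_parameters():
        if name.startswith("bias"):
            with torch.no_grad():
                param[h:2 * h].fill_(value)  # initial value only; gradients still flow

lstm = nn.LSTM(input_size=8, hidden_size=16)
set_forget_bias(lstm, 1.0)  # a positive bias keeps the forget gate open early in training
print(lstm.bias_ih_l0[16:32])
```

A large positive initial forget bias pushes the gate's sigmoid toward 1, so the cell state is preserved across time steps from the very first epoch, which typically speeds up learning of longer dependencies.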
Now choose 'LSTM' as model type and load the preset LSTM Simple (with forget).
Train for 20 epochs and use learning rate = 0.01. Take a look at how the train/validation accuracies develop over the epochs and what the final test accuracy is. Also check out the gradient magnitude plot (initialization + after training, batches to sample = 10).
Do the same for the LSTM Simple (no forget) preset (20 epochs, lr = 0.01).
Elaborate on the differences between the two models. How does disabling the forget gate affect performance and the gradient magnitudes (initialization and after training)?
Attach both plots here.
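To make the "no forget" variant concrete, here is a toy single-step LSTM cell where the forget gate can be replaced by a constant 1. This is an assumed interpretation of the preset, not the tool's implementation, and the weight shapes are placeholders.

```python
import torch

# One LSTM cell step. With use_forget_gate=False the forget gate is fixed
# to 1, so the old cell state passes through unscaled instead of being
# multiplied by a learned sigmoid in (0, 1).
def lstm_cell_step(x, h, c, W, use_forget_gate=True):
    # W maps the concatenated [x, h] to the stacked gate pre-activations [i, f, g, o]
    gates = torch.cat([x, h], dim=-1) @ W
    i, f, g, o = gates.chunk(4, dim=-1)
    i, o = torch.sigmoid(i), torch.sigmoid(o)
    g = torch.tanh(g)
    f = torch.sigmoid(f) if use_forget_gate else torch.ones_like(f)
    c_new = f * c + i * g            # f = 1: pure additive cell-state update
    h_new = o * torch.tanh(c_new)
    return h_new, c_new

torch.manual_seed(0)
x, h, c = torch.randn(4), torch.randn(8), torch.randn(8)
W = torch.randn(12, 32)
h_f, c_f = lstm_cell_step(x, h, c, W, use_forget_gate=True)
h_nf, c_nf = lstm_cell_step(x, h, c, W, use_forget_gate=False)
```

Because `f = 1` makes the cell-state recurrence purely additive, gradients through the cell state are not repeatedly scaled down, which is worth keeping in mind when comparing the two gradient magnitude plots.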
Attach the gradient magnitude plot here (initialization + after training, batches to sample = 10).
Comment on the plot: what can we say about the gradient magnitudes and how they develop when backpropagating through time?
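As a hint for the analysis, the effect can be reproduced with a toy scalar RNN, h_t = tanh(w · h_{t-1}): the gradient of the last state with respect to the first is a product of per-step factors w · tanh′, so it shrinks (or, for large |w|, grows) geometrically with the number of time steps. The numbers below are illustrative choices, not values from the exercise.

```python
import math

# Gradient of h_T w.r.t. h_0 in a scalar RNN h_t = tanh(w * h_{t-1}):
# the chain rule multiplies one factor w * tanh'(pre) per time step.
def bptt_gradient(w: float, h0: float, steps: int) -> float:
    h, grad = h0, 1.0
    for _ in range(steps):
        pre = w * h
        h = math.tanh(pre)
        grad *= w * (1.0 - math.tanh(pre) ** 2)  # d h_t / d h_{t-1}
    return grad

# With |w| < 1 the gradient vanishes as the horizon grows:
print(abs(bptt_gradient(0.9, 0.5, 5)), abs(bptt_gradient(0.9, 0.5, 50)))
```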
Now select 'RNN' as model type. Choose the RNN - Simple preset and create the PyTorch model.
What is now the number of trainable parameters?
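The count can be cross-checked in PyTorch. The counting helper below is generic; the input and hidden sizes are placeholders, since the exact dimensions of the RNN - Simple preset are not given here.

```python
import torch.nn as nn

# Sum the element counts of all parameters that receive gradients.
def count_trainable(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# A single-layer Elman RNN has W_ih (h*i), W_hh (h*h), b_ih (h), b_hh (h).
rnn = nn.RNN(input_size=10, hidden_size=20)
expected = 20 * 10 + 20 * 20 + 20 + 20
print(count_trainable(rnn), expected)
```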
The results will be quite different. Discuss:
Now let's try a feed-forward network (FNN) first. Choose the FNN - 2xHidden (ReLU) preset and load the architecture.
How many trainable parameters does this model have?
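For a fully-connected network the count can also be done by hand: each Linear layer contributes in_features × out_features weights plus out_features biases. The layer sizes in the example are placeholders, not the preset's actual dimensions.

```python
# Parameter count for a feed-forward net given its layer sizes
# [input, hidden_1, ..., output]: weights plus biases per Linear layer.
def fnn_param_count(sizes):
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

# e.g. 784 inputs, two hidden layers of 128, 10 outputs:
print(fnn_param_count([784, 128, 128, 10]))
```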
Analyze the result:
Now try to catch up to our magic model! Find settings that bring you within 2% of the baseline magic model's accuracy. In your answer, explain the reasoning behind the settings you chose and upload screenshots showing your Dataset & loader, Model & optimization and Model architecture settings, as well as the results: the loss plot and the confusion matrix/top-loss plots. You can upload several screenshots if not everything fits on one.
Now, with the same settings, switch to ReLU. Which model depth(s) now show at least some form of learning?
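One way to reason about this (assuming the earlier runs used sigmoid activations): the sigmoid derivative is at most 0.25, so a chain of D sigmoid layers scales the gradient by at most 0.25^D in the worst case, while ReLU's derivative is exactly 1 for every active unit, so depth alone does not shrink the gradient on active paths.

```python
# Worst-case gradient scaling through D activation layers:
# sigmoid'(z) <= 0.25 everywhere; ReLU'(z) = 1 for active units.
def worst_case_gradient_scale(depth: int, activation: str) -> float:
    per_layer = 0.25 if activation == "sigmoid" else 1.0  # ReLU, active path
    return per_layer ** depth

for d in (2, 5, 10):
    print(d, worst_case_gradient_scale(d, "sigmoid"), worst_case_gradient_scale(d, "relu"))
```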
Go to the Vanishing Gradients tab. Use the following settings:
Run the training and go to the Gradient magnitudes tab. Attach the plot that was generated. Describe in your own words what you see in the gradient magnitude plot when comparing the different model depths.
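A toy model of what the gradient magnitude plot measures may help with the description. The sketch below (an assumption for illustration: a chain of scalar sigmoid units with all weights fixed at 1.0, for determinism) records the gradient magnitude at every layer while backpropagating from the output; comparing a shallow and a deep chain mirrors comparing the model depths in the tab.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Record |gradient| at each layer while backpropagating through a chain
# of sigmoid units (all weights 1.0). Each backward step multiplies the
# gradient by sigmoid'(z) = s * (1 - s) <= 0.25, so it shrinks per layer.
def per_layer_grad_magnitudes(depth: int, h0: float = 0.5) -> list:
    h, pres = h0, []
    for _ in range(depth):          # forward pass, storing pre-activations
        pres.append(h)              # weight is 1.0, so pre-activation == previous h
        h = sigmoid(h)
    grad, mags = 1.0, []
    for pre in reversed(pres):      # backward pass
        s = sigmoid(pre)
        grad *= s * (1.0 - s)
        mags.append(abs(grad))
    return mags  # mags[0]: layer nearest the output, mags[-1]: first layer

shallow = per_layer_grad_magnitudes(3)
deep = per_layer_grad_magnitudes(12)
print(shallow[-1], deep[-1])
```

The pattern to look for in the real plot: magnitudes fall off layer by layer moving away from the output, and the earliest layers of the deepest model receive almost no gradient signal.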