365.221/2/4/6/7/8/9/30/56/67/81/325/326/348/349, UE Hands-on AI II, Rainer Dangl et al., 2026S (moodle.jku.at)
Try all activation functions once. Which one is the worst choice with regard to the vanishing gradient issue?
Now train with the following settings:
Look at the loss plot. For which model depth(s) can you see that there is some learning taking place?
Why do you think that, even after switching the activation function, some model depths don't seem to learn well?
Why do you think that is the case? It might be helpful to check out the activation functions and their derivative plots on the first tab.
Why can the gradient vanish during backpropagation?
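As a rough illustration of the chain-rule effect behind this question, the sketch below multiplies the local derivative of an activation function once per layer. It is a deliberately simplified model, not the course tool's computation: every layer is assumed to see the same pre-activation value and all weights are assumed to be 1, and the helper name `chained_factor` is made up for this example.

```python
import math

def sigmoid_grad(x):
    """Derivative of the logistic sigmoid; its maximum is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def tanh_grad(x):
    """Derivative of tanh; its maximum is 1.0 at x = 0."""
    return 1.0 - math.tanh(x) ** 2

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def chained_factor(grad_fn, pre_activation, depth):
    """Product of `depth` identical local derivatives.

    Simplification: every layer sees the same pre-activation and all
    weights are 1, so only the activation derivative shapes the result.
    """
    return grad_fn(pre_activation) ** depth

# Compare how the backpropagated factor shrinks with depth.
for depth in (1, 5, 10, 20):
    print(
        depth,
        chained_factor(sigmoid_grad, 1.0, depth),
        chained_factor(tanh_grad, 1.0, depth),
        chained_factor(relu_grad, 1.0, depth),
    )
```

Because the sigmoid derivative never exceeds 0.25, the factor reaching the first layer is at most 0.25 raised to the number of layers under these assumptions, while ReLU passes a factor of 1 for positive inputs.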
In the "Demo: one SGD step", set the random seed to 2026 and the learning rate to 0.1, then run the demo.
The initial weights should be:
Initial weights:
tensor([[ 0.3753,  0.1500],
        [ 0.1319, -0.6104]])
Initial biases:
tensor([ 0.0136, -0.3036])
The gradients should be:
Gradient computation:
grad(W):
tensor([[ 0.2432, -0.1886],
        [-0.2432,  0.1886]])
grad(b):
tensor([ 0.0957, -0.0957])
grad(W) norm: 0.4351941645145416
grad(b) norm: 0.13534791767597198
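The reported norms are consistent with the L2 (Frobenius) norm of the gradient entries, i.e. the square root of the sum of squares. The sketch below recomputes them from the printed, rounded values; that the demo uses exactly this norm is an assumption inferred from the numbers, and `l2_norm` is a helper introduced here for illustration.

```python
import math

# Gradient entries as printed in the demo output (rounded to 4 decimals).
grad_W = [[0.2432, -0.1886],
          [-0.2432, 0.1886]]
grad_b = [0.0957, -0.0957]

def l2_norm(values):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for v in values))

# Flatten the matrix so the norm runs over all four entries.
grad_W_norm = l2_norm([v for row in grad_W for v in row])
grad_b_norm = l2_norm(grad_b)

print("grad(W) norm:", grad_W_norm)
print("grad(b) norm:", grad_b_norm)
```

Small differences from the printed norms are expected, since the inputs here are already rounded to four decimals.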
The updated weights and biases should be:
Updated weights:
tensor([[ 0.3510,  0.1689],
        [ 0.1562, -0.6293]])
Updated biases:
tensor([ 0.0041, -0.2940])
Can you explain how these updated weights and biases are calculated? Write down the formula for the complete computation
initial weight -> updated weight. Also give an example computation for one of the parameters.
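The printed numbers are consistent with a plain SGD step, new_param = old_param - learning_rate * gradient, applied element-wise. For the top-left weight this gives 0.3753 - 0.1 * 0.2432 = 0.35098, which rounds to the printed 0.3510. The sketch below repeats this for every parameter; it assumes no momentum or weight decay, which is an inference from the numbers rather than something the demo states, and `sgd_step` is a name chosen for this example.

```python
lr = 0.1  # learning rate set in the demo

# Values copied from the demo output (rounded to 4 decimals).
W = [[0.3753, 0.1500],
     [0.1319, -0.6104]]
b = [0.0136, -0.3036]
grad_W = [[0.2432, -0.1886],
          [-0.2432, 0.1886]]
grad_b = [0.0957, -0.0957]

def sgd_step(param, grad, lr):
    """One plain SGD update for a single scalar parameter."""
    return param - lr * grad

# Apply the update element-wise to the weight matrix and the bias vector.
W_new = [[sgd_step(w, g, lr) for w, g in zip(w_row, g_row)]
         for w_row, g_row in zip(W, grad_W)]
b_new = [sgd_step(p, g, lr) for p, g in zip(b, grad_b)]

print("Updated weights:", [[round(w, 4) for w in row] for row in W_new])
print("Updated biases: ", [round(p, 4) for p in b_new])
```

Because the inputs are already rounded, the last printed digit can differ by one from the demo output (for example in the first bias).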
Train for 5 epochs with these settings:
What is the overall accuracy (enter the full number, including all three decimal places)?
Select the Street View House Numbers (SVHN) dataset. Select:
Load the CIFAR10 preset and apply the architecture. How many trainable parameters does the model have?
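To sanity-check a reported parameter count by hand, the per-layer formulas can be summed as in the sketch below. The layer list is a hypothetical small CNN for 3x32x32 inputs, chosen only for illustration; it is not the tool's CIFAR10 preset, whose architecture is not given here.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Trainable parameters of a 2D convolution with a square kernel."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)

def linear_params(in_features, out_features, bias=True):
    """Trainable parameters of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)

# Hypothetical architecture (NOT the CIFAR10 preset from the tool).
layers = [
    ("conv 3->16, 3x3", conv2d_params(3, 16, 3)),
    ("conv 16->32, 3x3", conv2d_params(16, 32, 3)),
    ("linear 2048->10", linear_params(32 * 8 * 8, 10)),
]

for name, count in layers:
    print(f"{name}: {count}")

total = sum(count for _, count in layers)
print("total trainable parameters:", total)
```

In PyTorch, the same total is commonly obtained with `sum(p.numel() for p in model.parameters() if p.requires_grad)`; pooling and activation layers contribute no trainable parameters.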
Now try to increase the accuracy to at least 80% overall. You can:
Include screenshots of
Do you see misclassified samples in the prediction scores? If yes, what can be said about them when looking at the top 3 probabilities that are listed?
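For reference, top-3 probabilities are usually obtained by applying a softmax to the model's output scores and sorting. The sketch below shows this on made-up logits for a 10-class problem; the values, the true label, and the helper names are assumptions for illustration and do not come from the tool.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k=3):
    """Return the k (class_index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Made-up logits for one sample; the true class is assumed to be 3.
logits = [1.2, 0.3, 2.9, 2.7, -0.5, 0.0, 0.1, -1.0, 0.4, 0.2]
true_label = 3

probs = softmax(logits)
top3 = top_k(probs)
predicted = top3[0][0]

print("predicted:", predicted, "true:", true_label)
for idx, p in top3:
    print(f"class {idx}: {p:.3f}")
print("true label in top 3:", true_label in [idx for idx, _ in top3])
```

In this constructed case the prediction is wrong, yet the true class is ranked second with a probability close to the winner's, which is the kind of pattern the top-3 listing makes visible.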
What is the overall accuracy of the model?