365.221/2/4/6/7/8/9/30/56/67/81/325/326/348/349, UE Hands-on AI II, Rainer Dangl et al., 2026S

Looking for test answers and solutions for 365.221/2/4/6/7/8/9/30/56/67/81/325/326/348/349, UE Hands-on AI II, Rainer Dangl et al., 2026S? Browse our large collection of verified answers for this course at moodle.jku.at.

Now work with the LSTM - Simple (custom forget bias) preset.

Train two models:

  1. with a custom forget gate bias of 0 and
  2. with a custom forget gate bias of 1.

Note: don't forget to click on 'Apply Changes' when you modify the preset in the architecture editor.

Train for 20 epochs with a learning rate of 0.01.

Examine the development of the loss/accuracy values over the epochs. How does the initial forget gate bias affect the training here?

Keep in mind that we only set the initial bias here; gradient computation is not disabled, so in both cases the bias parameter will adapt during training.

In your analysis, also consider the gradient magnitude plots for both models. Show the gradients at initialization and after training, and include both plots here.
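
One way to reason about this question is to look at how much gradient survives along the cell-state path. In a standard LSTM the cell update is c_t = f_t * c_{t-1} + i_t * g_t, so the direct derivative of c_t with respect to c_{t-1} is f_t. The sketch below assumes, as a simplification, that at initialization the forget gate's pre-activation is dominated by its bias (weights close to zero), so f_t is roughly sigmoid(bias) at every step. The function name and the sequence length of 20 are illustrative choices, not values taken from the exercise tool.

```python
import math


def sigmoid(x):
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


def cell_path_gradient_factor(forget_bias, steps):
    """Approximate d c_T / d c_0 along the cell-state path.

    Assumption: the forget gate activation equals sigmoid(forget_bias)
    at every time step (weights near zero at initialization).
    """
    f = sigmoid(forget_bias)
    return f ** steps


# Hypothetical sequence length, chosen only for illustration.
for bias in (0.0, 1.0):
    factor = cell_path_gradient_factor(bias, steps=20)
    print(f"forget bias {bias}: gate = {sigmoid(bias):.3f}, factor over 20 steps = {factor:.2e}")
```

Under this assumption a bias of 1 keeps the gate further open at the start, so the gradient reaching early time steps is larger than with a bias of 0. Whether that translates into faster training in the tool still has to be read off the loss and accuracy curves.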


Now choose 'LSTM' as the model type and load the preset LSTM Simple (with forget).

Train for 20 epochs and use learning rate = 0.01. Take a look at how the train/validation accuracies develop over the epochs and what the final test accuracy is. Also check out the gradient magnitude plot (initialization + after training, batches to sample = 10).

Do the same for the LSTM Simple (no forget) preset (20 epochs, lr = 0.01). 

Elaborate on the differences between the two models. How does disabling the forget gate affect performance and the gradient magnitudes (initialization and after training)? 

Attach both plots here.
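
To make the role of the forget gate concrete, the following sketch simulates only the cell-state recurrence c_t = f * c_{t-1} + u with a constant gate value f and a constant update u. It assumes that the "no forget" preset corresponds to fixing f = 1, as in the original LSTM formulation; how the tool actually implements that preset is not shown in the question. All names and numbers are hypothetical.

```python
def cell_state_trajectory(forget_gate, update, steps):
    """Iterate c_t = forget_gate * c_{t-1} + update, starting from c_0 = 0."""
    c = 0.0
    trajectory = []
    for _ in range(steps):
        c = forget_gate * c + update
        trajectory.append(c)
    return trajectory


# f = 1.0 mimics a cell without a forget gate (assumption); f = 0.5 is a half-open gate.
no_forget = cell_state_trajectory(forget_gate=1.0, update=0.1, steps=20)
with_forget = cell_state_trajectory(forget_gate=0.5, update=0.1, steps=20)
print("final cell state without forgetting:", round(no_forget[-1], 4))
print("final cell state with f = 0.5:      ", round(with_forget[-1], 4))
```

With f = 1 the cell state keeps accumulating and the gradient along the cell path is passed back unchanged, while f < 1 bounds the state but shrinks the gradient at every step. This trade-off is one thing to look for in the two gradient magnitude plots.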


Attach the gradient magnitude plot here (initialization + after training, batches to sample = 10).

Comment on the plot: what can we say about the gradient magnitudes, and how do they develop when backpropagating through time?
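
As a point of comparison for the plot, here is a minimal sketch of backpropagation through time in a scalar tanh RNN, h_t = tanh(w * h_{t-1} + x_t). It is a toy model with a single recurrent weight, not the network from the exercise, and the weight and inputs are made-up values. The function returns |d h_T / d h_t| for every time step t.

```python
import math


def bptt_gradient_magnitudes(w, inputs):
    """Return |d h_T / d h_t| for each step t of a scalar tanh RNN."""
    # Forward pass: store every hidden state.
    h = 0.0
    hidden = []
    for x in inputs:
        h = math.tanh(w * h + x)
        hidden.append(h)

    # Backward pass: multiply local derivatives d h_t / d h_{t-1} = w * (1 - h_t^2).
    grads = [0.0] * len(hidden)
    grads[-1] = 1.0
    g = 1.0
    for t in range(len(hidden) - 1, 0, -1):
        g *= w * (1.0 - hidden[t] ** 2)
        grads[t - 1] = abs(g)
    return grads


# Hypothetical recurrent weight and input sequence.
magnitudes = bptt_gradient_magnitudes(w=0.9, inputs=[0.1] * 20)
for t in (0, 5, 10, 15, 19):
    print(f"t = {t:2d}: |dh_T/dh_t| = {magnitudes[t]:.3e}")
```

Because every local factor has magnitude below 1 when |w| < 1, the gradient shrinks roughly geometrically the further back in time it travels.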


Now select 'RNN' as model type. Choose the RNN - Simple preset and create the PyTorch model.

What is now the number of trainable parameters?
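
The count reported by the tool can be cross-checked by hand. The sketch below follows the parameter layout of PyTorch's `nn.RNN` and `nn.LSTM` (one input-to-hidden matrix, one hidden-to-hidden matrix and two bias vectors per layer, times 4 for the LSTM gates), assuming a unidirectional network without projection. The sizes in the example are hypothetical because the preset's dimensions are not given here, and any output layer the preset adds on top is not included.

```python
def recurrent_param_count(input_size, hidden_size, num_layers=1, gate_blocks=1):
    """Count the parameters of a stacked recurrent layer.

    gate_blocks = 1 for a plain (Elman) RNN, 4 for an LSTM.
    Assumes two bias vectors per layer, as in PyTorch's nn.RNN / nn.LSTM.
    """
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        weights = hidden_size * (layer_input + hidden_size)
        biases = 2 * hidden_size
        total += gate_blocks * (weights + biases)
        layer_input = hidden_size  # deeper layers read the previous layer's output
    return total


# Hypothetical sizes, not taken from the RNN - Simple preset.
print("RNN :", recurrent_param_count(10, 20))
print("LSTM:", recurrent_param_count(10, 20, gate_blocks=4))
```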


The results will be quite different. Discuss:

  • How can we explain this difference?
  • Why is one model so much better than the other?
  • Is a CNN a viable architecture for sequential data then?

Now let's try a feed-forward network (FNN) first. Choose the FNN - 2xHidden (ReLU) preset and load the architecture.

How many trainable parameters does this model have?
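
For a fully connected network the number can be verified with a short calculation: each linear layer contributes (inputs x outputs) weights plus one bias per output. The sketch below assumes every layer has a bias and that the activation functions add no parameters. The layer sizes are hypothetical placeholders, since the preset's actual dimensions are not listed in the question.

```python
def mlp_param_count(layer_sizes):
    """Sum weights and biases over consecutive fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total


# Hypothetical architecture: flattened input, two hidden layers, 10 classes.
sizes = [3072, 128, 64, 10]
print("trainable parameters:", mlp_param_count(sizes))
```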


Analyze the result:

  • Compare the test accuracy with the mighty dice baseline. What can we say about the FNN performance?
  • Comment and interpret the training/validation/test set accuracies.

Now try to catch up to our magic model! Try to find settings that bring you within 2% of the accuracy of the baseline magic model. In your answer, explain your reasoning for the settings you chose and upload screenshots that show your Dataset & loader, Model & optimization, and Model architecture settings, as well as the results, loss plot, and confusion matrix/top-loss plots. You can upload several screenshots if not everything fits on one.


Now, with the same settings, switch to ReLU. Which model depth(s) now show at least some form of learning?


Go to the Vanishing Gradients tab. Use the following settings:

  • seed: 2026
  • learning rate: 0.001
  • activation: sigmoid
  • hidden: 3072
  • epochs: 1
  • model depths: all
  • image size: 32
  • batch size: 32

Run the training and go to the Gradient magnitudes tab. Attach the plot that was generated. Describe in your own words what you see in the gradient magnitude plot when comparing the different model depths.
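
When interpreting the plot, it can help to recall why depth matters for sigmoid networks. The derivative of the sigmoid is at most 0.25, so the chain rule multiplies one such factor per layer, whereas the ReLU derivative is exactly 1 for positive inputs. The sketch below multiplies only these activation derivatives along a chain of layers and deliberately ignores the weight matrices, which also scale the gradient in a real network. The depths and the pre-activation value are made-up illustration values, not the tool's settings.

```python
import math


def sigmoid_grad(z):
    """Derivative of the logistic function at pre-activation z."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


def relu_grad(z):
    """Derivative of ReLU at pre-activation z (taken as 0 at z <= 0)."""
    return 1.0 if z > 0 else 0.0


def activation_chain_factor(grad_fn, pre_activations):
    """Product of activation derivatives along a chain of layers."""
    factor = 1.0
    for z in pre_activations:
        factor *= grad_fn(z)
    return factor


# Hypothetical depths and a fixed positive pre-activation in every layer.
for depth in (1, 2, 4, 8, 16):
    zs = [0.5] * depth
    sig = activation_chain_factor(sigmoid_grad, zs)
    rel = activation_chain_factor(relu_grad, zs)
    print(f"depth {depth:2d}: sigmoid factor = {sig:.2e}, ReLU factor = {rel:.1f}")
```

In this simplified picture the sigmoid factor decays geometrically with depth, which is consistent with early layers of deep sigmoid models receiving very small gradients.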

