Add to Chrome
✅ The verified answer to this question is available below. Our community-reviewed solutions help you understand the material better.
Which component in Transformer architecture is responsible for capturing long-range dependencies?
A. Convolutional layers
A. Batch normalization
A. Multi-head self-attention
A. Dropout layers
Get Unlimited Answers To Exam Questions - Install Crowdly Extension Now!