Select one statement about attention and dot products that is correct.
Dot products quantify (i.e. measure) the similarity between word embeddings, which is then used to compute attention between words (see the sketch after these options).
Dot products have nothing to do with attention in Transformers; Marek simply got carried away when writing his slides for this module and, as a result, presented a lot of irrelevant linear algebra stuff in his lectures.
Python can only approximate dot products, so other measures of similarity work better; dot products should therefore be avoided when Transformers are designed and implemented in Python.
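For reference, the first option is the correct one: in Transformer attention, the dot product between a query vector and a key vector scores their similarity, and a softmax over these scores produces the attention weights used to mix the value vectors. Below is a minimal NumPy sketch of scaled dot-product attention; the function name, the toy embeddings, and the reuse of the same matrix E as queries, keys, and values are illustrative assumptions, not part of the original question.

import numpy as np

def dot_product_attention(Q, K, V):
    # Dot products between query and key vectors measure similarity.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity scores, scaled by sqrt(d_k)
    # Softmax over the key dimension turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # attention output: weighted sum of value vectors

# Toy example: 3 "words" with embedding dimension 4 (hypothetical values).
rng = np.random.default_rng(0)
E = rng.normal(size=(3, 4))  # word embeddings used as Q, K, and V here
print(dot_product_attention(E, E, E))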