Select one statement about attention and dot products that is correct.
Dot products quantify (i.e. measure) the similarity between word embeddings, which is then used to compute attention between words (see the sketch after these options).
Dot products have nothing to do with attention in Transformers; Marek simply got carried away when writing his slides for this module and, as a result, presented a lot of irrelevant linear algebra stuff in his lectures.
Python can only approximate dot products, so other measures of similarity work better; dot products should therefore be avoided when Transformers are designed and implemented in Python.
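For reference, the first option is the correct one: in Transformer attention, the dot product between a query vector and a key vector scores their similarity, and a softmax over these scores produces the attention weights used to mix the value vectors. Below is a minimal NumPy sketch of scaled dot-product attention; the function name, the toy embeddings, and the reuse of the same matrix E as queries, keys, and values are illustrative assumptions, not part of the original question.

import numpy as np

def dot_product_attention(Q, K, V):
    # Dot products between query and key vectors measure similarity.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity scores, scaled by sqrt(d_k)
    # Softmax over the key dimension turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # attention output: weighted sum of value vectors

# Toy example: 3 "words" with embedding dimension 4 (hypothetical values).
rng = np.random.default_rng(0)
E = rng.normal(size=(3, 4))  # word embeddings used as Q, K, and V here
print(dot_product_attention(E, E, E))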