
True or false: Masked attention in a standard GPT allows the word at position N to attend to all previous words at positions N-1, N-2, etc.

Answer: True (100% of responses; False: 0%)
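
The statement is true: in GPT's causal (masked) self-attention, the token at position N may attend to itself and to every earlier position, but attention to later positions is blocked before the softmax. The sketch below shows a minimal single-head version in NumPy; the projection names Wq/Wk/Wv and the toy sizes are illustrative, not taken from any particular library.

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head masked (causal) self-attention sketch.

    x: (T, d) token embeddings; Wq/Wk/Wv: (d, d) projection matrices.
    Returns the attended output and the (T, T) attention weights.
    """
    T, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv              # queries, keys, values
    scores = (q @ k.T) / np.sqrt(d)               # (T, T) attention logits
    # Causal mask: position i may attend only to positions j <= i,
    # so every entry strictly above the diagonal is blocked.
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Row-wise softmax; exp(-inf) = 0, so future positions get weight 0.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Demo with hypothetical toy sizes: T = 4 tokens, d = 8 dimensions.
rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
out, weights = causal_self_attention(x, Wq, Wk, Wv)
print(np.round(weights, 2))  # lower-triangular weight matrix
```

The printed weight matrix is lower-triangular: row N is nonzero only at columns 0 through N, which is exactly the behavior the question describes.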