Question 3 of 10
Explain the transformer architecture and how the self-attention mechanism works. Why was this architecture revolutionary for NLP?
Sample answer preview
The transformer architecture, introduced in the 2017 paper "Attention Is All You Need," revolutionized NLP by replacing recurrent neural networks with a purely attention-based mechanism. In self-attention, each token is projected into query, key, and value vectors and attends to every other token in the sequence, so the whole sequence can be processed in parallel rather than step by step as in an RNN.
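To make the mechanism concrete, here is a minimal sketch of single-head scaled dot-product attention in NumPy. It is illustrative only: the learned linear projections that produce Q, K, and V from the input embeddings are omitted, and the function name and shapes are our own choices, not from the paper's reference code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head self-attention (illustrative sketch).

    Q, K, V: arrays of shape (seq_len, d_k), assumed to come from
    the same input sequence via learned projections (omitted here).
    """
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k)
    # to keep softmax gradients stable for large dimensions.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the key axis turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted mix of all value vectors.
    return weights @ V

# Toy usage: 4 tokens with 8-dimensional vectors standing in for
# the projected queries/keys/values (hypothetical values).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(x, x, x)  # shape (4, 8)
```

Because every score in the matrix can be computed at once, attention over a whole sequence is one batched matrix multiply rather than a sequential loop, which is the parallelization advantage the architecture is known for.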
Tags: self-attention · queries, keys, values · multi-head attention · positional encoding · encoder-decoder · parallelization