Question 4 of 10
Compare BERT and GPT architectures. How do their training objectives differ, and what tasks is each model best suited for?
Sample answer preview
BERT and GPT represent two fundamental approaches to pre-trained language models, differing in architecture, training objectives, and optimal use cases. BERT uses only the encoder portion of the transformer architecture and is trained with masked language modeling: a fraction of input tokens is masked and predicted from bidirectional context. This makes BERT best suited to understanding tasks such as text classification, named-entity recognition, and extractive question answering. GPT uses only the decoder portion and is trained autoregressively with causal masking, predicting each next token from left-to-right context only, which makes it best suited to generative tasks such as text completion, summarization, and dialogue.
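The core architectural distinction, bidirectional versus causal attention, comes down to the attention mask each model applies. A minimal NumPy sketch (illustrative only; variable names are my own, not from either model's codebase):

```python
import numpy as np

seq_len = 5

# BERT-style (encoder, bidirectional): every position may attend to
# every other position, so the mask is all ones.
bidirectional_mask = np.ones((seq_len, seq_len), dtype=int)

# GPT-style (decoder, causal): position i may attend only to positions
# <= i, giving a lower-triangular mask that hides future tokens.
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))

print(bidirectional_mask)
print(causal_mask)
```

In practice these masks are applied inside attention by setting disallowed positions to a large negative value before the softmax; the causal mask is what forces GPT's training objective to be next-token prediction, while the all-ones mask lets BERT condition on both left and right context.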
bidirectional, autoregressive, masked language modeling, causal masking, encoder-only, decoder-only