The encoder
The decoder
The position embedding
The encoder block
The self-attention layer
The layer-normalization
The position-wise feed-forward network
The decoder block
The cross-attention layer
The predicting head
The overall architecture
The architecture is composed of an encoder and a decoder.
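To make the composition concrete, here is a minimal NumPy sketch of how the pieces listed above could fit together: the encoder turns the source sequence into a memory, and the decoder attends to that memory while processing the target sequence. Everything specific here is an assumption made for illustration and not taken from the post: the sizes (`D_MODEL`, `D_FF`, `VOCAB`), single-head attention, one block per side, sinusoidal position embeddings, post-layer-normalization without learned scale or shift, a shared embedding table, and random untrained weights.

```python
import numpy as np

rng = np.random.default_rng(0)
D_MODEL, D_FF, VOCAB = 16, 32, 50  # hypothetical sizes, chosen only for the demo

def position_embedding(seq_len, d_model):
    # Sinusoidal variant (an assumption): sin on even dims, cos on odd dims.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def layer_norm(x, eps=1e-5):
    # Normalize each position over the feature axis (no learned parameters here).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(x_q, x_kv, w, causal=False):
    # Single-head scaled dot-product attention.
    # Self-attention: x_q is x_kv. Cross-attention: x_kv is the encoder output.
    q, key, val = x_q @ w["q"], x_kv @ w["k"], x_kv @ w["v"]
    scores = q @ key.T / np.sqrt(D_MODEL)
    if causal:
        # Hide future positions so position t only sees positions <= t.
        future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ val @ w["o"]

def feed_forward(x, w):
    # Position-wise feed-forward network: applied to each position independently.
    return np.maximum(x @ w["w1"], 0.0) @ w["w2"]

def encoder_block(x, p):
    x = layer_norm(x + attention(x, x, p["self"]))
    return layer_norm(x + feed_forward(x, p["ffn"]))

def decoder_block(y, memory, p):
    y = layer_norm(y + attention(y, y, p["self"], causal=True))
    y = layer_norm(y + attention(y, memory, p["cross"]))
    return layer_norm(y + feed_forward(y, p["ffn"]))

def attn_params():
    return {name: rng.normal(0, 0.1, (D_MODEL, D_MODEL)) for name in "qkvo"}

def ffn_params():
    return {"w1": rng.normal(0, 0.1, (D_MODEL, D_FF)),
            "w2": rng.normal(0, 0.1, (D_FF, D_MODEL))}

params = {
    "embed": rng.normal(0, 0.1, (VOCAB, D_MODEL)),
    "enc": {"self": attn_params(), "ffn": ffn_params()},
    "dec": {"self": attn_params(), "cross": attn_params(), "ffn": ffn_params()},
    "head": rng.normal(0, 0.1, (D_MODEL, VOCAB)),
}

def transformer(src_ids, tgt_ids, p):
    src = p["embed"][src_ids] + position_embedding(len(src_ids), D_MODEL)
    memory = encoder_block(src, p["enc"])           # encoder output
    tgt = p["embed"][tgt_ids] + position_embedding(len(tgt_ids), D_MODEL)
    hidden = decoder_block(tgt, memory, p["dec"])   # decoder attends to memory
    return softmax(hidden @ p["head"])              # predicting head

# One probability distribution over the vocabulary per target position.
probs = transformer([1, 2, 3, 4], [5, 6, 7], params)
print(probs.shape)
```

Because the decoder's self-attention is causally masked, changing a later target token leaves the predictions at earlier positions unchanged, which is the property that lets such a model be trained on whole target sequences at once.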