# Summary

# Cues

# Notes

| Feature                 | Encoder | Decoder | Decoder-only |
|-------------------------|---------|---------|--------------|
| Token Embedding         | ✓       | ✓       | ✓            |
| Self-Attention          | ✓       | ✓       | ✓            |
| Causal mask             | ✗       | ✓       | ✓            |
| Bidirectional attention | ✓       | ✗       | ✗            |
| Cross-Attention         | ✗       | ✓       | ✗            |

Decoder-only = the Decoder's causal attention, but without the Encoder and without Cross-Attention.

This architecture suits autoregressive language modeling tasks, as in the GPT series of models.
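
The difference between the "causal mask" and "bidirectional attention" rows can be illustrated with a minimal NumPy sketch. It is not taken from any particular model: the function name `self_attention`, the toy shapes, and the omission of learned query/key/value projections are all assumptions made to keep the example short.

```python
import numpy as np

def self_attention(x, causal):
    """Single-head self-attention without learned projections (illustrative only)."""
    seq_len, d = x.shape
    # Scaled dot-product scores between every pair of positions.
    scores = x @ x.T / np.sqrt(d)
    if causal:
        # Hide positions j > i so token i cannot attend to later tokens.
        future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # hypothetical input: 4 tokens, 8-dim embeddings

_, w_bi = self_attention(x, causal=False)      # encoder-style: every token sees every token
_, w_causal = self_attention(x, causal=True)   # decoder-style: only current and earlier tokens
```

With `causal=True`, the attention weights above the diagonal are zero, so the first token attends only to itself. With `causal=False`, every entry of the weight matrix is positive, which corresponds to the bidirectional row in the table.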