Basic Concepts - Tokens and Embeddings
Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45536625
- Tokens = numerical representations of words or parts of words
- A word can consist of 1+ tokens
- Punctuation marks (. " ,) are also usually tokens
- 💡 Words and tokens can be loosely thought of as the same, although strictly speaking they're obviously different
- Embeddings = mathematical representations (vectors) that encode the “meaning” of a token
- 💡 See also AIF-C01 notes on Advanced GenAI Concepts
- OpenAI's website provides a tokenizer UI to see how text is split into tokens
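The ideas above can be sketched in code: tokens are just integer IDs, and embeddings are vectors looked up per token ID. A minimal toy sketch (the vocabulary, IDs, and 3-dimensional embedding table are made-up assumptions, not OpenAI's actual tokenizer):

```python
# Hypothetical tiny vocabulary mapping sub-word pieces to token IDs
vocab = {"trans": 0, "##late": 1, "me": 2, ".": 3}

def tokenize(words):
    """Map each word to one or more token IDs (toy greedy split)."""
    ids = []
    for w in words:
        if w in vocab:
            ids.append(vocab[w])
        else:
            # "translate" -> "trans" + "##late": one word, two tokens
            ids.append(vocab["trans"])
            ids.append(vocab["##late"])
    return ids

# Hypothetical embedding table: one 3-dimensional vector per token ID,
# encoding the "meaning" of that token
embedding_table = [
    [0.1, -0.3, 0.7],   # "trans"
    [0.4, 0.2, -0.1],   # "##late"
    [-0.5, 0.9, 0.0],   # "me"
    [0.0, 0.0, 0.1],    # "."
]

token_ids = tokenize(["translate", "me", "."])
embeddings = [embedding_table[i] for i in token_ids]

print(token_ids)      # [0, 1, 2, 3] — note "translate" became two tokens
print(embeddings[0])  # vector for the "trans" token
```

Note how the punctuation mark "." gets its own token, and the single word "translate" maps to two tokens, matching the points above.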
Evolution of the Transformer Architecture
Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45285869
1. RNNs and LSTMs
- Introduced the feedback loop: the output of one step is fed back in at the next step
- Useful for modeling sequential data such as time series or language (a sequence of words)
- RNNs propagate the "hidden state", i.e. the previous output, from one step to the next
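The feedback loop can be sketched as a single-unit RNN with scalar weights (the weight values are arbitrary assumptions for illustration, not from the course):

```python
import math

# Hypothetical "learned" parameters of a one-unit RNN
w_x, w_h, b = 0.5, 0.8, 0.0

def rnn_step(x, h_prev):
    """One RNN step: combine the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

sequence = [1.0, 0.5, -0.2]  # toy input sequence (e.g. one value per word)
h = 0.0                      # initial hidden state
for x in sequence:
    h = rnn_step(x, h)       # feedback loop: h is fed back at every step
print(h)                     # final hidden state summarizes the sequence
```

The same `h` variable is read and overwritten at each step, which is exactly the hidden-state propagation described above.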
2. Encoder-Decoder Architecture (e.g. for Machine Translation)
- Both the encoder and the decoder are RNNs
- Last Hidden State = large vector that encodes the meaning of the whole input sentence (e.g. "Please Translate Me")
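A minimal sketch of the encoder-decoder idea, reusing the toy RNN step from before (scalar weights and sequence values are made-up assumptions): the encoder reads the whole input, and only its last hidden state is handed to the decoder as the "meaning" of the sentence.

```python
import math

def rnn_step(x, h, w_x=0.5, w_h=0.8):
    """One RNN step with hypothetical fixed weights."""
    return math.tanh(w_x * x + w_h * h)

def encode(inputs):
    """Encoder RNN: consume the whole input sequence."""
    h = 0.0
    for x in inputs:          # e.g. token values of "Please Translate Me"
        h = rnn_step(x, h)
    return h                  # last hidden state = summary of the sentence

def decode(h, steps=3):
    """Decoder RNN: seeded with the encoder's last hidden state."""
    outputs = []
    x = 0.0                   # start-of-sequence placeholder
    for _ in range(steps):
        h = rnn_step(x, h)
        x = h                 # feed each output back in as the next input
        outputs.append(h)
    return outputs

context = encode([0.9, -0.4, 0.6])
translation = decode(context)
print(context, translation)
```

In a real translation model the state is a large vector rather than a scalar, but the structure is the same: the decoder only ever sees the encoder's final hidden state.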