LAB: Exploring Transformers in a SageMaker Notebook
PART 1: Tokenization and Positional Encoding
Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45285951
- Create an inexpensive SageMaker notebook instance
- Once the instance is ready, upload the provided notebook file into Jupyter
- Install the transformers package from Hugging Face, then follow the notebook steps in order
- We use a BERT model and tokenizer in the demo
- Notice how “I read a good novel” is tokenized into a sequence of integer token IDs (see the tokenization sketch after this list)
- Notice how the positional encoding function interleaves sine and cosine values across the embedding dimensions (see the positional-encoding sketch after this list)
PART 2: Multi-Headed, Masked Self-Attention
Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45285963
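This part's notebook builds up multi-headed, masked self-attention. As an illustration only (my own minimal NumPy sketch, not the course's code), here is a single-head causal self-attention with identity projections in place of learned weight matrices:

import numpy as np

def masked_self_attention(x: np.ndarray) -> np.ndarray:
    """x: (seq_len, d_model). Identity projections kept for brevity."""
    seq_len, d_model = x.shape
    q, k, v = x, x, x  # a real layer would learn W_q, W_k, W_v projections
    scores = q @ k.T / np.sqrt(d_model)  # scaled dot-product attention
    # Causal mask: position t may only attend to positions <= t.
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores[mask] = -1e9
    # Numerically stable row-wise softmax over the attention scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

out = masked_self_attention(np.random.randn(5, 8))
print(out.shape)  # (5, 8)

A multi-headed version runs several such heads in parallel on lower-dimensional projections and concatenates their outputs.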
PART 3: Import GPT-2 from Hugging Face and generate text
Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45286023
- In just three lines of code, you can import GPT-2 and have it generate text from a prompt (a sketch of the pattern follows)
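A minimal sketch of that three-line pattern using the Hugging Face pipeline API (the prompt is my own example):

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
print(generator("Once upon a time", max_length=40)[0]["generated_text"])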