Basic Concepts
- 🔧 AI focused on generating new data similar to training data
- Subset of Deep Learning, which itself is a subset of Machine Learning (ML), which itself is a subset of AI
- Data can be text, image, audio, code, video…
- A Foundation Model (FM) is pretrained on massive amounts of unlabeled data, and can then generate new data
- 💡 Foundation Models are extremely expensive to train, so typically only big companies build their own
- E.g. GPT-4o, OpenAI's foundation model behind ChatGPT
- The user gives a prompt, and the model generates data from it
- ‼️ Generated data is non-deterministic!!
- Same prompt can generate similar but different data
- Output is produced with statistical/probabilistic methods, not deterministic ones
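The non-determinism above can be sketched in a few lines: sampling twice from the same probability distribution can yield different results, which is why one prompt can produce similar but different outputs (the candidate words and probabilities below are made up for illustration).

```python
import random

# Toy next-word distribution: the "model" assigns a probability to each
# candidate word (words and probabilities invented for illustration).
candidates = ["sunny", "cloudy", "rainy"]
probabilities = [0.6, 0.3, 0.1]

# Two samples for the same "prompt" may differ between runs:
# the process is statistical, not deterministic.
first = random.choices(candidates, weights=probabilities, k=1)[0]
second = random.choices(candidates, weights=probabilities, k=1)[0]
print(first, second)  # e.g. "sunny cloudy" on one run, "sunny sunny" on another
```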
Large Language Models (LLM)
- 🔧 AI designed to generate coherent human-like text
- Subset of Foundation Models
- e.g. OpenAI's GPT-4
- Can perform language-related tasks: translation, summarization, question answering, content creation
- The algorithm selects the next word by sampling (randomly) from a probability distribution over candidate words
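A toy sketch of this word-by-word sampling, assuming a hand-written probability table per context word (a real LLM computes these probabilities with a neural network over tens of thousands of tokens):

```python
import random

# Invented probability tables: for each current word, the chance of
# each possible next word (values are made up for illustration).
next_word_probs = {
    "the": {"cat": 0.5, "dog": 0.4, "sky": 0.1},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.4, "ran": 0.6},
    "sky": {"sat": 0.1, "ran": 0.9},
}

def sample_next(word):
    # Pick the next word randomly, weighted by its probability.
    table = next_word_probs[word]
    return random.choices(list(table), weights=list(table.values()), k=1)[0]

# Generate a short "sentence" one word at a time.
sentence = ["the"]
sentence.append(sample_next(sentence[-1]))
sentence.append(sample_next(sentence[-1]))
print(" ".join(sentence))  # e.g. "the cat sat" or "the dog ran"
```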
Graphical GenAI
- Generate images from text prompts
- “Generate a blue sky with white clouds and the word ‘Hello’ written in the sky”
- Generate images from images
- “Transform this image into Japanese anime style”
- Generate text from images
- “Describe how many apples you see in the picture”
- Diffusion Models are very popular for generating images
- e.g. Stable Diffusion
- 💡 Noise is added to an image until it's no longer recognizable; the model learns to reverse that process (learning “what makes a cat a cat”) and can then generate a cat image starting from pure noise
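A minimal sketch of the forward (noising) half of that idea, assuming a short 1-D list of pixel values stands in for an image; a real diffusion model such as Stable Diffusion learns the reverse (denoising) direction with a neural network:

```python
import random

# Made-up 1-D "image": four pixel intensities.
image = [0.1, 0.9, 0.9, 0.1]

# Forward diffusion: at each step, shrink the signal slightly and mix
# in Gaussian noise, so after many steps noise dominates the image.
noisy = list(image)
for _ in range(50):
    noisy = [0.98 * p + 0.2 * random.gauss(0.0, 1.0) for p in noisy]

# "noisy" is now unrecognizable; image generation runs this process
# in reverse, starting from pure noise and removing it step by step.
print(noisy)
```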
Advanced GenAI Concepts