RAG - Basic Concepts
- RAG = Retrieval Augmented Generation
- 🔧 Allows an FM to reference data sources outside of its training data
- Bedrock takes care of creating vector embeddings from your data and storing them in the vector DB of your choice – the result is a Knowledge Base
- Use when real-time or up-to-date data needs to be fed into the FM
- Architecture diagram
- Answers to prompts carry reference numbers (citations) pointing back to the retrieved source documents
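- A minimal sketch of querying an existing Knowledge Base with boto3's RetrieveAndGenerate API; the region, Knowledge Base ID, and model ARN are placeholders:

```python
import boto3

# Bedrock Agent Runtime exposes the RetrieveAndGenerate API, which retrieves
# chunks from a Knowledge Base and lets an FM answer using them.
client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve_and_generate(
    input={"text": "What is the warranty period for product X?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB1234567890",  # placeholder Knowledge Base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)

print(response["output"]["text"])  # generated answer
print(response["citations"])       # references back to the retrieved chunks
```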
Knowledge Bases
- Data source documents are chunked, then processed by an embeddings model (creates vectors), then stored into a vector DB → knowledge base created
- ‼️ The embedding model doesn't need to be the same FM that uses RAG!
- Knowledge Base creation diagram
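- A minimal sketch of the chunk → embed step that Bedrock performs behind the scenes during ingestion; the embeddings model ID and the naive fixed-size chunking are assumptions for illustration:

```python
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def chunk(text: str, size: int = 300) -> list[str]:
    # Naive fixed-size chunking; Bedrock offers several chunking strategies.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(chunk_text: str) -> list[float]:
    # An embeddings model (here Titan Text Embeddings) turns a chunk into a vector.
    response = bedrock_runtime.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",  # assumed embeddings model
        body=json.dumps({"inputText": chunk_text}),
    )
    return json.loads(response["body"].read())["embedding"]

document = "..."  # a data source document
vectors = [embed(c) for c in chunk(document)]  # these vectors go into the vector DB
```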
- RAG Vector DB types to learn for the exam:
- Amazon OpenSearch Service – search & analytics DB
- Real time similarity queries, store millions of vector embeddings
- Scalable index management, and fast nearest-neighbor (kNN) search capability (see the kNN query sketch after this list)
- 💡 Default to this DB unless you have reasons to choose another DB
- Amazon DocumentDB [with MongoDB compatibility]
- NoSQL database
- Real time similarity queries, store millions of vector embeddings
- Amazon Aurora – relational database, proprietary to AWS
- Amazon RDS for PostgreSQL – relational database, open-source
- Amazon Neptune – graph database
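- A minimal sketch of the kNN lookup OpenSearch performs, using the opensearch-py client; the domain endpoint, index name, vector field, dimension, and query vector are assumptions, and authentication setup is omitted:

```python
from opensearchpy import OpenSearch

# Connection details are placeholders for an Amazon OpenSearch Service domain.
client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    use_ssl=True,
)

# A kNN-enabled index stores each chunk's embedding in a knn_vector field.
client.indices.create(index="kb-chunks", body={
    "settings": {"index.knn": True},
    "mappings": {"properties": {
        "embedding": {"type": "knn_vector", "dimension": 1024},
        "text": {"type": "text"},
    }},
})

# Nearest-neighbor query: find the 3 chunks closest to the query embedding.
query_vector = [0.1] * 1024  # embedding of the user's question
results = client.search(index="kb-chunks", body={
    "size": 3,
    "query": {"knn": {"embedding": {"vector": query_vector, "k": 3}}},
})
```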
- RAG Data Sources
- Amazon S3
- Confluence
- Microsoft SharePoint
- Salesforce
- Web pages (your website, your social media feed, etc…)
- …etc (More added over time)
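- A minimal sketch of attaching an S3 data source to an existing Knowledge Base and triggering ingestion via the Bedrock Agent control-plane client; the Knowledge Base ID, data source name, and bucket ARN are placeholders:

```python
import boto3

# The Bedrock Agent control-plane client manages Knowledge Bases and their data sources.
bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

# Attach an S3 bucket as a data source to an existing Knowledge Base.
data_source = bedrock_agent.create_data_source(
    knowledgeBaseId="KB1234567890",  # placeholder
    name="product-docs",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::my-product-docs"},
    },
)

# Kick off an ingestion job: Bedrock chunks, embeds, and stores the documents.
bedrock_agent.start_ingestion_job(
    knowledgeBaseId="KB1234567890",
    dataSourceId=data_source["dataSource"]["dataSourceId"],
)
```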
RAG Use Cases
💡 The most typical use case is a chatbot with domain-specific knowledge that can query external sources and knowledge bases (see the chatbot sketch after the examples below)
- Customer Service Chatbot
- Knowledge Base – products, features, specifications, troubleshooting guides, and FAQs
- RAG application – chatbot that can answer customer queries
- Legal Research and Analysis
- Knowledge Base – laws, regulations, case precedents, legal opinions, and expert analysis
- RAG application – chatbot that can provide relevant information for specific legal queries
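- A minimal multi-turn chatbot sketch on top of RetrieveAndGenerate, reusing the session ID so the FM keeps conversational context; the Knowledge Base ID and model ARN are placeholders:

```python
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

kb_config = {
    "type": "KNOWLEDGE_BASE",
    "knowledgeBaseConfiguration": {
        "knowledgeBaseId": "KB1234567890",  # placeholder customer-service Knowledge Base
        "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
    },
}

session_id = None
while True:
    question = input("Customer: ")
    kwargs = {"input": {"text": question}, "retrieveAndGenerateConfiguration": kb_config}
    if session_id:
        kwargs["sessionId"] = session_id  # reuse the session for multi-turn context
    response = client.retrieve_and_generate(**kwargs)
    session_id = response["sessionId"]
    print("Bot:", response["output"]["text"])
```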