Ref: https://www.udemy.com/course/aws-ai-practitioner-certified/learn/lecture/44886389
Retrieval-Augmented Generation (RAG) - Basic Concepts
- 🔧 Allows an FM to reference data sources outside of its training data
- Bedrock takes care of creating vector embeddings from your data and storing them in the vector DB of your choice (this is the Knowledge Base)
- Use it where real-time data needs to be fed into the FM
- 💡 Like an open-book exam for an LLM → it can reference material outside its own knowledge
- Architecture diagram
- Answers to prompts carry reference numbers that point back to the source documents used
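The retrieve-then-augment idea above can be sketched in a few lines. This is a toy illustration only, not how Bedrock implements it: the `Document` class, the word-overlap retriever, and the file names are all hypothetical stand-ins, and a real system would rank by vector similarity and send the prompt to an FM.

```python
from dataclasses import dataclass


@dataclass
class Document:
    source: str  # hypothetical file name, used only for illustration
    text: str


def retrieve(question: str, docs: list[Document], top_k: int = 2) -> list[Document]:
    """Toy retriever: rank documents by how many words they share with the question."""
    q_words = set(question.lower().split())
    scored = [(len(q_words & set(d.text.lower().split())), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only the best top_k documents that share at least one word.
    return [d for score, d in scored[:top_k] if score > 0]


def build_augmented_prompt(question: str, retrieved: list[Document]) -> str:
    """Prepend numbered sources so the model can cite them as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] ({d.source}) {d.text}" for i, d in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the numbered sources below and cite them by number.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )


docs = [
    Document("returns.txt", "Items can be returned within 30 days of purchase"),
    Document("shipping.txt", "Standard shipping takes 5 business days"),
    Document("warranty.txt", "The warranty covers defects for 2 years"),
]
question = "How many days do I have to return items"
prompt = build_augmented_prompt(question, retrieve(question, docs))
print(prompt)
```

The numbered sources in the prompt are what make it possible for an answer to carry reference numbers: the model is asked to cite `[1]`, `[2]`, and so on, and each number maps back to a retrieved document.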
Knowledge Bases
- Data source documents are split into chunks, passed through an embeddings model (which creates vectors), and stored in a vector DB → knowledge base created
- ‼️ The embeddings model doesn't need to be the same model as the FM that uses RAG!
- Knowledge Base creation diagram
- RAG Vector DB types to learn for the exam:
  - Amazon OpenSearch Service – search & analytics DB
    - Real-time similarity queries, store millions of vector embeddings
    - Scalable index management, and fast nearest-neighbor (kNN) search capability
    - 💡 Default to this DB unless you have reasons to choose another DB
  - Amazon DocumentDB [with MongoDB compatibility]
    - NoSQL database
    - Real-time similarity queries, store millions of vector embeddings
  - Amazon Aurora – relational (SQL) database, proprietary on AWS
  - Amazon RDS for PostgreSQL – relational (SQL) database, open-source
  - Amazon Neptune – graph database
- RAG Data Sources
  - Amazon S3
  - Atlassian Confluence
  - Microsoft SharePoint
  - Salesforce
  - Web pages (your website, your social media feed, etc…)
  - … etc. (more added over time)
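The knowledge base pipeline described in this section (chunk → embed → store → nearest-neighbor search) can be sketched with an in-memory stand-in. Everything here is an assumption for illustration: `ToyVectorStore` is a hypothetical class, the "embedding" is a plain word-count vector rather than a real embeddings model, and the search is brute force rather than the indexed kNN a service such as OpenSearch provides.

```python
import math


def chunk_text(text: str, max_words: int = 6) -> list[str]:
    """Split text into fixed-size word windows (real chunkers are smarter)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector DB; embeddings are word-count vectors."""

    def __init__(self, chunks: list[str]):
        self.chunks = chunks
        # Vocabulary built from the stored chunks; unknown query words are ignored.
        self.vocab = sorted({w for c in chunks for w in c.lower().split()})
        self.vectors = [self.embed(c) for c in chunks]

    def embed(self, text: str) -> list[float]:
        words = text.lower().split()
        return [float(words.count(term)) for term in self.vocab]

    def search(self, query: str, k: int = 1) -> list[tuple[float, str]]:
        """Brute-force k-nearest-neighbor search by cosine similarity."""
        q = self.embed(query)
        scored = [(cosine(q, v), c) for v, c in zip(self.vectors, self.chunks)]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


document = (
    "Aurora is a relational SQL database "
    "Neptune stores highly connected graph data "
    "OpenSearch offers fast nearest neighbor search"
)
chunks = chunk_text(document, max_words=6)
store = ToyVectorStore(chunks)
best_score, best_chunk = store.search("which service has nearest neighbor search", k=1)[0]
print(best_chunk)
```

Because the query is embedded with the same function as the stored chunks, the two live in the same vector space and can be compared. The FM that later writes the answer never touches these vectors, which is why it can be a different model from the embeddings model.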
RAG Use cases
💡 The most typical use case is a chatbot with domain-specific knowledge that can query external sources and knowledge bases
- Customer Service Chatbot
  - Knowledge Base – products, features, specifications, troubleshooting guides, and FAQs
  - RAG application – chatbot that can answer customer queries
- Legal Research and Analysis
  - Knowledge Base – laws, regulations, case precedents, legal opinions, and expert analysis
  - RAG application – chatbot that can provide relevant information for specific legal queries
- Healthcare Question-Answering
  - Knowledge Base – diseases, treatments, clinical guidelines, research papers, patients…
  - RAG application – chatbot that can answer complex medical queries
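For a chatbot like the ones above, a Bedrock Knowledge Base can be queried in a single call. The sketch below follows my understanding of the boto3 `bedrock-agent-runtime` client's `retrieve_and_generate` operation; treat the request and response shapes as assumptions to check against the current boto3 documentation. The knowledge base ID, model ARN, bucket name, and the `FakeBedrockClient` stub are placeholders so the example runs offline without AWS credentials.

```python
def ask_knowledge_base(client, knowledge_base_id: str, model_arn: str, question: str):
    """Query a Knowledge Base and return (answer text, list of cited source URIs)."""
    response = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    )
    answer = response["output"]["text"]
    sources = []
    # Each citation lists the retrieved chunks that back part of the answer.
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in sources:
                sources.append(uri)
    return answer, sources


class FakeBedrockClient:
    """Offline stand-in (hypothetical) that mimics the assumed response shape."""

    def retrieve_and_generate(self, **kwargs):
        self.last_request = kwargs
        return {
            "output": {"text": "Hold the reset button for 10 seconds."},
            "citations": [
                {
                    "retrievedReferences": [
                        {
                            "location": {
                                "type": "S3",
                                "s3Location": {"uri": "s3://example-bucket/troubleshooting.pdf"},
                            }
                        }
                    ]
                }
            ],
        }


# With real AWS access you would instead pass boto3.client("bedrock-agent-runtime").
client = FakeBedrockClient()
answer, sources = ask_knowledge_base(
    client, "KB_ID_PLACEHOLDER", "MODEL_ARN_PLACEHOLDER", "How do I reset my router?"
)
print(answer)
print(sources)
```

Passing the client in as a parameter keeps the function testable: the same code path runs against the stub here and against a real client in a deployed chatbot.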