Advanced GenAI Topics

Reasoning Models

Ref: https://www.udemy.com/course/ultimate-aws-certified-generative-ai-developer-professional/learn/lecture/53542835

Model reasoning → enabling reasoning means that FM will apply chain of thought to break down a complex task into several steps
- 👎 higher cost, more thinking time (slower)
- 👍 hopefully higher accuracy/performance in the output!

Ref: https://www.udemy.com/course/ultimate-aws-certified-generative-ai-developer-professional/learn/lecture/53542847

Multimodal models require specialized encoders for each type of media they support (audio, text, documents, images, video…)
Multimodal embedding models can convert different media types into compatible embedding vectors → can search for different media with vector similarity!
Multimodal pipelines: how to prepare data from different media types so it's usable by the model
- e.g. For Amazon Titan Multimodal Embeddings G1, pass in structured JSON, with image data being base64-encoded
- Data pipeline needs to do encoding to prepare the data (maybe SageMaker, Glue…)

Ref: https://www.udemy.com/course/ultimate-aws-certified-generative-ai-developer-professional/learn/lecture/53683943

GenAI FMs, Bedrock FMs, and LLMs deployed with SageMaker will almost always expect JSON-structured data as input
- ❗ SageMaker ML models that are not LLMs might require other structure as input data (CSV, Protobuff…)
- Example
‼️ It is YOUR responsibility to provide input data in correct format!!
- Your app/endpoint is responsible for this
Raw text from data source can lose structure (e.g. if ingesting documents via OCR or PDF)…
- Metadata, headings, tables… can be lost in sea of text
- Possible solutions:
  1. Format source text into HTML (GenAI models understand HTML quite well, especially useful for tables)
  2. Use tools like Amazon Comprehend or Amazon Textract to extract structure from text (also 3rd party tools like pandoc)
- Formatting unstructured text can be automatized in an AWS Glue ETL pipeline or with Bedrock Data Automation (BDA)
Divider strings can help better chunk data into a vector store
- e.g. transform HTML <h1>Title</h1> into <SECTION_BREAK:Title> → Embedding model will chunk here
- Lambda preprocessor in your KB can convert HTML tags to divider strings on demand
  - Can also be automatized with an AWS Glue ETL pipeline
Bedrock’s Converse API requires JSON output that includes:
- Role (assistant/user generally)
- Content (messages)
- Example