Bedrock - Pricing

Ref: https://www.udemy.com/course/aws-ai-practitioner-certified/learn/lecture/44886445

Pricing Modes

On-Demand

Pay-as-you-go → no long-term commitment
- great for unpredictable workloads
‼️ Works with Base Models only!
Model charges:
- Text Models – charged for every input/output token processed
- Embedding Models – charged for every input token processed
- Image Models – charged for every image generated
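
As a rough illustration of the three charging rules above, here is a minimal Python sketch. The rates are hypothetical placeholders, not real Bedrock prices, and the per-1,000-token unit and the names `TextModelRate`, `text_cost`, `embedding_cost` and `image_cost` are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass
class TextModelRate:
    """Hypothetical on-demand rate in USD per 1,000 tokens (not a real price)."""
    input_per_1k: float
    output_per_1k: float


def text_cost(rate: TextModelRate, input_tokens: int, output_tokens: int) -> float:
    # Text models: both input and output tokens are charged.
    return (input_tokens / 1000 * rate.input_per_1k
            + output_tokens / 1000 * rate.output_per_1k)


def embedding_cost(input_per_1k: float, input_tokens: int) -> float:
    # Embedding models: only input tokens are charged.
    return input_tokens / 1000 * input_per_1k


def image_cost(price_per_image: float, images_generated: int) -> float:
    # Image models: charged per generated image.
    return price_per_image * images_generated


if __name__ == "__main__":
    # Placeholder rates chosen only to make the arithmetic easy to follow.
    rate = TextModelRate(input_per_1k=0.003, output_per_1k=0.015)
    print(f"text:      ${text_cost(rate, 2000, 500):.4f}")
    print(f"embedding: ${embedding_cost(0.0001, 10_000):.4f}")
    print(f"image:     ${image_cost(0.04, 3):.4f}")
```

Actual rates differ by model and provider, so the numbers would need to come from the current pricing page.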

Batch

Multiple predictions at a time; output is a single file in S3
Discounts of up to 50%
💡 Responses are not returned in real time, but the discount can be significant
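
The trade-off can be sketched as simple arithmetic. This is a hedged example: `batch_cost` is a hypothetical helper, and the discount actually applied is an input here because the notes only give the upper bound of 50%.

```python
def batch_cost(on_demand_cost: float, discount: float = 0.5) -> float:
    """Estimate batch cost from the equivalent on-demand cost.

    `discount` is a fraction between 0 and 0.5, since the notes state
    discounts of up to 50%. The real discount is an assumption here.
    """
    if not 0.0 <= discount <= 0.5:
        raise ValueError("discount must be between 0 and 0.5")
    return on_demand_cost * (1 - discount)


if __name__ == "__main__":
    on_demand = 200.0  # hypothetical monthly on-demand spend in USD
    print(batch_cost(on_demand))        # best case: full 50% discount
    print(batch_cost(on_demand, 0.25))  # a smaller, assumed discount
```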

Provisioned Throughput

Reserves throughput/capacity for a certain time (1 month, 6 months…)
Throughput = max number of input/output tokens processed per minute
‼️ Required for Fine-tuned and Custom Models!
- Base Models can also use provisioned throughput, but not required
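
To make the throughput definition concrete, the sketch below estimates the tokens per minute a workload needs and checks it against a reserved amount. All function names and numbers are hypothetical; Bedrock's actual purchasing units and limits are not modelled here. The last helper simply encodes the rule from these notes about which models require provisioned throughput.

```python
def required_tokens_per_minute(requests_per_minute: int,
                               avg_input_tokens: int,
                               avg_output_tokens: int) -> int:
    # Throughput counts both input and output tokens processed per minute.
    return requests_per_minute * (avg_input_tokens + avg_output_tokens)


def fits_reserved_throughput(required_tpm: int, reserved_tpm: int) -> bool:
    # The workload fits if it stays at or below the reserved maximum.
    return required_tpm <= reserved_tpm


def needs_provisioned_throughput(model_kind: str) -> bool:
    # Rule from the notes: fine-tuned and custom models require it,
    # base models may use it but do not have to.
    if model_kind not in {"base", "fine-tuned", "custom"}:
        raise ValueError(f"unknown model kind: {model_kind}")
    return model_kind in {"fine-tuned", "custom"}


if __name__ == "__main__":
    needed = required_tokens_per_minute(60, 400, 100)
    print(needed)                                    # 30000
    print(fits_reserved_throughput(needed, 50_000))  # True
    print(needs_provisioned_throughput("custom"))    # True
```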

Other cost considerations

Number of input & output tokens → main driver of cost
Model size → usually a smaller model is cheaper
- Varies by provider
- Smaller models also have fewer capabilities and less capacity