Ref: https://learn.cantrill.io/courses/1820301/lectures/41301409 [ASSOCIATESHARED]
Kinesis Data Streams - Basic Concepts
- 🔧 Scalable streaming service
- Designed to ingest lots and lots of data, from lots of devices or apps
- Kinesis Data Stream = basic entity (unit of configuration)
- Producers send data into a stream, consumers can read data from streams
- Can scale from low levels of throughput to near-unlimited throughput
- 🔧 Data is stored in a moving (rolling) window (24h by default)
- Older/expired data is discarded
- Storage for the default 24h window is included in the price, no matter how much data the window holds
- Window can be increased up to 365 days (additional costs)
- Public service, regionally resilient by design
- Multiple producers can send data to a stream
- Multiple consumers can read/access stream data, with whatever granularity they choose
- Great fit for analytics & dashboards
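The moving window and the "consumers pick their own granularity" idea can be sketched with a small in-memory model. This is only an illustration: `ToyStream`, `put` and `read_since` are hypothetical names made up for these notes, not the AWS API, and time is simplified to whole hours.

```python
from collections import deque

DEFAULT_RETENTION_HOURS = 24        # default window, per the notes above
MAX_RETENTION_HOURS = 365 * 24      # maximum window, per the notes above


class ToyStream:
    """In-memory stand-in for a stream (hypothetical, not the AWS API)."""

    def __init__(self, retention_hours=DEFAULT_RETENTION_HOURS):
        if not 0 < retention_hours <= MAX_RETENTION_HOURS:
            raise ValueError("retention must be within (0, 365 days]")
        self.retention_hours = retention_hours
        self._records = deque()  # (arrival_hour, data), oldest first

    def put(self, now_hour, data):
        """Producer side: append a record stamped with its arrival time."""
        self._records.append((now_hour, data))

    def _expire(self, now_hour):
        """Discard records that have fallen out of the moving window."""
        cutoff = now_hour - self.retention_hours
        while self._records and self._records[0][0] < cutoff:
            self._records.popleft()

    def read_since(self, now_hour, since_hour):
        """Consumer side: read whatever is still retained from since_hour on."""
        self._expire(now_hour)
        return [data for hour, data in self._records if hour >= since_hour]


stream = ToyStream()
for hour in range(30):                      # any number of producers could call put()
    stream.put(hour, f"reading-{hour}")

# Two consumers reading the same stream at different granularities.
dashboard = stream.read_since(now_hour=30, since_hour=28)   # last 2 hours only
analytics = stream.read_since(now_hour=30, since_hour=0)    # everything still retained

print(dashboard)                     # ['reading-28', 'reading-29']
print(len(analytics), analytics[0])  # 24 reading-6
```

Reading does not remove anything: both consumers see the same records, and only the age of a record (relative to the window) causes it to disappear.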
Kinesis Data Streams - Architecture
- Streams ingest data from Producers. Consumers read data from streams.
- 🔧 Shard architecture → scaling → shards added to ingest more data
- Shard capacity (per shard):
  - 1 MB/s of ingestion capacity (or 1,000 records/s)
  - 2 MB/s of consumption capacity
- More shards → more performance & more cost
- Data is stored as Kinesis Data Records (max 1 MB each), distributed across shards
- Performance scales linearly
- Billing:
  - Number of shards (more shards cost more)
  - Size of data window (bigger windows cost more)
- Amazon Data Firehose can move stream data en masse to another service e.g. S3
- Allows persisting data beyond the stream window
- (Figure: Kinesis Data Streams architecture diagram)
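The per-shard numbers above make shard sizing simple arithmetic. A minimal sketch follows, assuming the 1 MB/s ingest and 2 MB/s consume figures from these notes and treating `consume_mb_per_s` as the total read rate across all consumers sharing the stream. The function names are hypothetical helpers, not part of any AWS SDK.

```python
import math

INGEST_MB_PER_SHARD = 1.0    # per-shard ingestion capacity, from the notes
CONSUME_MB_PER_SHARD = 2.0   # per-shard consumption capacity, from the notes


def shards_needed(ingest_mb_per_s, consume_mb_per_s):
    """Smallest shard count that covers both the write and the read rate."""
    if ingest_mb_per_s < 0 or consume_mb_per_s < 0:
        raise ValueError("throughput must be non-negative")
    by_ingest = math.ceil(ingest_mb_per_s / INGEST_MB_PER_SHARD)
    by_consume = math.ceil(consume_mb_per_s / CONSUME_MB_PER_SHARD)
    return max(1, by_ingest, by_consume)  # a stream has at least one shard


def stream_capacity(shards):
    """Linear scaling: total (ingest, consume) MB/s for a given shard count."""
    return shards * INGEST_MB_PER_SHARD, shards * CONSUME_MB_PER_SHARD


print(shards_needed(0.2, 0.2))   # 1
print(shards_needed(4.5, 9.0))   # 5
print(shards_needed(3.0, 12.0))  # 6  (reads are the bottleneck here)
print(stream_capacity(6))        # (6.0, 12.0)
```

Because billing follows the shard count, the result of `shards_needed` is also a rough cost driver: doubling required throughput roughly doubles the shards you pay for.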
SQS vs Kinesis
- ‼️ Don't confuse the two services, they're very different!
- If the scenario involves large-scale data ingestion → most likely Kinesis
- For any other scenario → assume SQS by default
- Only change your mind if you have strong reasons to do so