Amazon Kinesis Data Streams [MLA-C01]

Intro to Kinesis Data Streams from SAA-C03 Notes

Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45356719

Data records up to 1MB (typical use case is a lot of “small” real-time data)
- Data ingestion:
  - 1MBps or 1,000 records per second per shard
- Data consumption (throughput modes):
  - Shared: 2MBps per shard, split across all consumers
  - Enhanced (fan-out): 2MBps per shard per consumer
Data records are made of a partition key & data blob (up to 1MB)
- When delivered to a consumer, they also have a sequence number (indicates where they were in the shard)
- Records with same partition key go to same shard → key-based ordering
  - Ordering is guaranteed only among records with the same partition key
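The key-to-shard mapping above can be sketched as follows. Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it. This sketch assumes the digest is read as a big-endian integer and that shards split the hash space evenly (true for a freshly created stream, not necessarily after resharding); `hash_key` and `shard_for` are hypothetical helper names, not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def hash_key(partition_key: str) -> int:
    """Integer hash of a partition key (MD5 digest read as big-endian)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shard_count: int) -> int:
    """Index of the shard that owns this key.

    ASSUMPTION: the hash space is divided into `shard_count` equal ranges.
    """
    return hash_key(partition_key) * shard_count // HASH_SPACE
```

The same key always lands on the same shard index, which is what gives key-based ordering. Counting `shard_for(f"user-{i}", 4)` over many distinct keys should show a roughly even spread, while a single constant key sends every record to one shard.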
Data Retention
- Retention up to 365 days (default = 1 day)
- Data can’t be deleted from Kinesis (until it expires)
- Consumers can reprocess (replay) data
Security: At-rest KMS encryption, in-flight HTTPS encryption
SW libraries to create custom producers/consumers
- Kinesis Producer Library (KPL) to write an optimized producer application
- Kinesis Client Library (KCL) to write an optimized consumer application

Provisioned
- Choose the number of shards
- Scale manually to increase or decrease the number of shards
- Pay per provisioned shard per hour
  - 💡 Use when you can predict capacity beforehand
On-demand
- No need to provision or manage capacity
- Capacity scales automatically based on observed throughput peak during last 30 days
- Different pricing model: pay per stream per hour & per GB of data in/out
  - 💡 Use when your capacity is unknown
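For Provisioned mode, the per-shard limits noted earlier (1MBps or 1,000 records per second in, 2MBps out in shared mode) give a rough way to pick a shard count. The sketch below is a back-of-the-envelope estimate only; `estimate_shards` and its parameters are hypothetical names, and it assumes shared-throughput consumers and evenly distributed partition keys.

```python
import math


def estimate_shards(write_mb_per_s: float,
                    records_per_s: float,
                    total_read_mb_per_s: float) -> int:
    """Smallest shard count that satisfies all three per-shard limits.

    total_read_mb_per_s is the combined read rate of ALL shared-mode
    consumers, since they split the 2MBps per-shard budget.
    """
    needed = max(
        write_mb_per_s / 1.0,       # 1MBps ingest per shard
        records_per_s / 1000.0,     # 1,000 records per second per shard
        total_read_mb_per_s / 2.0,  # 2MBps shared read per shard
    )
    return max(1, math.ceil(needed))
```

For example, 3.5MBps of writes at 2,000 records per second with 4MBps of total reads works out to 4 shards, because write bandwidth is the binding limit. Leave headroom in practice, since a skewed key distribution can overload one shard even when the total fits.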

Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45284803

Performance (writing is too slow)
- Possible causes:
  1. Service limits may be exceeded. Different calls have different limits.
    - Some operations (e.g., CreateStream, ListStreams, DescribeStream) have stream-level limits of 5-20 calls per second
  2. Shard-level limits for writes and reads
  3. Hot shards
- Possible solutions:
  1. Check for throughput exceptions, see if operations are being throttled. Then apply throttling solutions.
  2. Scale: increase number of shards
  3. Select a good partition key to evenly distribute puts across shards
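For solution 1, throttling on batch writes often shows up as a partial failure: a PutRecords response carries a `FailedRecordCount` and one result entry per record, in request order, where failed entries have an `ErrorCode` (such as `ProvisionedThroughputExceededException`). A minimal sketch of resending only the failed records is below. It assumes `client` behaves like a boto3 Kinesis client; `put_with_retry` and its defaults are hypothetical.

```python
import time


def put_with_retry(client, stream_name, records, max_attempts=5, base_delay=0.1):
    """Send records with PutRecords, resending only entries that failed.

    ASSUMPTION: `client.put_records(StreamName=..., Records=...)` matches
    the boto3 Kinesis client. Returns the records still unsent after
    `max_attempts` (an empty list means everything was accepted).
    """
    pending = list(records)
    for attempt in range(max_attempts):
        response = client.put_records(StreamName=stream_name, Records=pending)
        if response["FailedRecordCount"] == 0:
            return []
        # Results line up with the request; failed ones carry an ErrorCode.
        pending = [
            record
            for record, result in zip(pending, response["Records"])
            if "ErrorCode" in result
        ]
        if attempt < max_attempts - 1:
            time.sleep(base_delay * 2 ** attempt)  # back off before resending
    return pending
```

If the same records keep failing after backoff, that points at a shard limit or a hot shard rather than a transient spike, which is where solutions 2 and 3 apply.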
Stream returns 500 Internal Server Error or 503 Service Unavailable errors
- This indicates an AmazonKinesisException error rate above 1%
- Solution: Implement a retry mechanism
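A retry mechanism for these transient errors is commonly built as exponential backoff with jitter. The sketch below is one possible shape, not the course's or an SDK's implementation: `TransientServiceError` is a hypothetical stand-in for however your client surfaces a 500/503, and `call_with_backoff` with its defaults is illustrative.

```python
import random
import time


class TransientServiceError(Exception):
    """Hypothetical stand-in for a 500/503 response from the stream."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.2, max_delay=5.0):
    """Run `operation`, retrying transient errors with jittered backoff.

    Re-raises the last error once `max_attempts` is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientServiceError:
            if attempt == max_attempts - 1:
                raise
            # Double the ceiling each attempt, capped, then pick a random
            # delay below it so many producers do not retry in lockstep.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(random.uniform(0, ceiling))
```

Note that AWS SDKs already retry some errors on their own, so a wrapper like this mainly matters for custom producers or when the built-in retries are exhausted.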