RDS Read-Replicas (RRs)

Ref: https://learn.cantrill.io/courses/1820301/lectures/41301424

RDS Read-Replicas (RRs) - Architecture

🔧 RDS Read-Replica (RR) = a read-only replica of an RDS instance
- Can be used for reads (unlike standby replica in Multi-AZ instance)
  - 👍 allow read performance scaling
- An instance can have up to 5 direct RRs (engine-dependent; AWS has since raised this to 15 for MySQL, MariaDB, and PostgreSQL)
  - Can be created in the same region or in a different region (cross-region RRs)


‼️ RRs are separate from the main RDS architecture!
- Each RR has its own endpoint address, independent from RDS instance endpoints
  - Requires app support: the app must be changed to send its reads to the RR's endpoint
  - ❗ Apps by default know nothing about RRs
- No automatic failover
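
💡 A minimal sketch of what "app support" could look like: the app itself decides which endpoint each statement goes to. The `EndpointRouter` class and all hostnames are hypothetical (not an AWS API), and the "starts with SELECT" check is a deliberate simplification (e.g. `SELECT ... FOR UPDATE` would need the primary).

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class EndpointRouter:
    """Picks the primary endpoint for writes and an RR endpoint for reads."""
    primary: str                                        # main RDS instance endpoint
    replicas: list[str] = field(default_factory=list)   # one endpoint per RR

    def __post_init__(self):
        # Round-robin over RR endpoints; use the primary if there are no RRs.
        self._cycle = itertools.cycle(self.replicas or [self.primary])

    def endpoint_for(self, sql: str) -> str:
        # Simplification: only statements starting with SELECT count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._cycle) if is_read else self.primary


# Hypothetical endpoints, for illustration only.
router = EndpointRouter(
    primary="mydb.example.eu-west-1.rds.amazonaws.com",
    replicas=[
        "mydb-rr1.example.eu-west-1.rds.amazonaws.com",
        "mydb-rr2.example.us-east-1.rds.amazonaws.com",
    ],
)
```

Because there is no automatic failover, a router like this would also need its own handling for an unreachable RR (not shown here).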
❗ Asynchronous replication
- Data is committed as soon as it is written to the main instance; only afterwards is it replicated to the RRs
- Replication lag can be noticeable, depending on network conditions & write volume
  - 💡 RRs can have their own RRs, but the lag adds up at each level and becomes even more noticeable!
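
💡 A toy simulation of the asynchronous flow above, assuming time moves in discrete "ticks" and each replica has a fixed lag. The `Node` class and the lag values are made up for illustration; this is not how RDS implements replication, just a way to see the lag accumulate along a chain.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    lag: int = 0                                   # ticks to receive a change from its source
    data: dict = field(default_factory=dict)
    inbox: list = field(default_factory=list)      # in-flight changes: (due_tick, key, value)
    replicas: list = field(default_factory=list)

    def apply(self, now, key, value):
        # Commit locally first, then ship the change to each replica asynchronously.
        self.data[key] = value
        for r in self.replicas:
            r.inbox.append((now + r.lag, key, value))

    def tick(self, now):
        # Apply every in-flight change whose delivery time has arrived.
        due = [m for m in self.inbox if m[0] <= now]
        self.inbox = [m for m in self.inbox if m[0] > now]
        for _, key, value in due:
            self.apply(now, key, value)
        for r in self.replicas:
            r.tick(now)


primary, rr1, rr2 = Node("primary"), Node("rr1", lag=2), Node("rr2", lag=3)
primary.replicas.append(rr1)   # direct RR
rr1.replicas.append(rr2)       # RR of an RR

primary.apply(0, "order-1", "paid")   # committed on the primary at tick 0
seen = {}                             # first tick at which each RR can read the row
for now in range(1, 7):
    primary.tick(now)
    for node in (rr1, rr2):
        if "order-1" in node.data and node.name not in seen:
            seen[node.name] = now
```

With these assumed lags, `seen` ends up as `{"rr1": 2, "rr2": 5}`: the second-level RR waits for its own lag on top of its source's lag. A read sent to an RR before those ticks simply does not see the row yet.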
Cross-region RRs allow global performance improvement of read workloads
- Users can read DBs from different regions more efficiently
- Cross-region networking is handled transparently by AWS (data fully encrypted in transit)
💡 Multi-AZ cluster deployment is like a combination of Multi-AZ instance deployment + RRs
- ‼️ BUT! The 2 Reader Replicas in Multi-AZ cluster deployment are part of the main architecture! External RRs should be considered something separate!
- For exam: synchronous replication → Multi-AZ; asynchronous → RRs (excluding Aurora)

🔧 RRs are read-only until promoted. Upon promotion, they become a normal RDS instance
- Promotion can be done very quickly → 👍 low RTO
- 👍 Improves global availability/resilience
  - An RR in a different (failover) region can be quickly promoted if the main region has an outage
- 👍 RRs also offer near 0 RPO
  - data synced constantly from the main DB instance (very little potential for data loss)
- 💡 RRs are great for quickly recovering from failure, as long as there's no data corruption
‼️ Use RRs ONLY when recovering from failure/outage, NOT from data corruption!!
- 👎 Because data is constantly replicated to RR, data corruption is also replicated to RR!
- 💡 If data corruption → must rely on snapshots & backups
  - higher frequency and/or higher quality of snapshots/backups improves RPO
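
💡 The outage vs. corruption rule can be sketched as a small decision helper. Everything here is hypothetical (the `Incident` enum, the function name, the parameters); the numbers it returns are only a rough worst-case RPO under the assumption that the RR loses at most its current lag and a restore loses everything since the last clean backup.

```python
from enum import Enum


class Incident(Enum):
    OUTAGE = "outage"            # instance/AZ/region failure, the data itself is fine
    CORRUPTION = "corruption"    # bad data was written and has been replicated to the RR


def recovery_plan(incident, replica_lag_s, seconds_since_last_clean_backup):
    """Return (recovery source, rough worst-case data loss in seconds)."""
    if incident is Incident.OUTAGE:
        # Promote the RR: only changes still in flight (the lag) are lost.
        return "promote read replica", replica_lag_s
    # Corruption is already on the RR, so go back to a copy taken before it happened.
    return "restore from snapshot/backup", seconds_since_last_clean_backup
```

For example, with an assumed 2 s replica lag and a last clean backup 1 hour old, `recovery_plan(Incident.OUTAGE, 2, 3600)` returns `("promote read replica", 2)`, while `recovery_plan(Incident.CORRUPTION, 2, 3600)` returns `("restore from snapshot/backup", 3600)`. That gap is why more frequent snapshots/backups matter for the corruption case.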