EC2 Auto Scaling & ASG Scaling Policies

Ref: https://learn.cantrill.io/courses/1820301/lectures/41301443 and https://learn.cantrill.io/courses/1820301/lectures/41301444

EC2 Auto Scaling - Types

Manual Scaling: manually adjust the ASG values
- No need for attached scaling policy
- Static values for MIN, DESIRED, MAX
  - Can manually tweak them (via AWS Management Console or AWS CLI)
  - Can write a manual script to adjust values depending on conditions → customer manages logic & execution (scaling policies offer automated solutions)
- Use cases: ASG testing, cost-control measure, urgent tweaking…
Scheduled Scaling: automatic, time-based adjustment of ASG DESIRED capacity
- No need for attached scaling policy
- You create a scheduled action in the ASG
  - ❗ The action can set MIN, MAX and DESIRED values at the scheduled time (one-off or recurring)
- Useful for known high/low usage periods
  - e.g. a sales period → increase desired capacity when it starts, reduce it when it ends
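
A minimal sketch of the time-based idea above, in plain Python rather than the AWS API. `ScheduledAction` and `desired_at` are hypothetical names made up for illustration, and the 09:00/18:00 schedule is an assumed example; a real scheduled action is created in the ASG itself.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ScheduledAction:
    """Hypothetical stand-in for an ASG scheduled action (not an AWS type)."""
    start: time            # time of day at which the action fires
    desired_capacity: int  # DESIRED value applied when it fires


def desired_at(now: time, actions: list[ScheduledAction], default: int) -> int:
    """Return the DESIRED capacity in effect at `now`.

    The latest action whose start time has already passed today wins;
    before the first action of the day, `default` applies.
    """
    current = default
    for action in sorted(actions, key=lambda a: a.start):
        if action.start <= now:
            current = action.desired_capacity
    return current


# Assumed schedule: scale out for a sale at 09:00, scale back in at 18:00.
sale_day = [
    ScheduledAction(start=time(9, 0), desired_capacity=10),
    ScheduledAction(start=time(18, 0), desired_capacity=2),
]

print(desired_at(time(8, 30), sale_day, default=2))  # before the sale
print(desired_at(time(12, 0), sale_day, default=2))  # during the sale
print(desired_at(time(19, 0), sale_day, default=2))  # after the sale
```

The point is only that capacity changes at known times, with no metric involved.
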
Dynamic Scaling: automatic, rule-based adjustment of ASG DESIRED capacity
- ❗ Requires scaling policy attached, which contains rules to react to metrics/alarms
  - CW Alarm used to track metrics
- ‼️ Scaling policies only adjust ASG DESIRED capacity!!
  - MIN and MAX need to be manually adjusted! (maybe with a script or via AWS CLI)
- Types: simple scaling, step scaling, target tracking

🔧 Contains rules that adjust ASG values based on metric changes and/or CW Alarms
- ASG with attached scaling policy can perform EC2 Dynamic Scaling
Metrics
- Can be internal to EC2 instance: CPU usage, memory usage, disk I/O…
  - 💡 Some internal metrics require CW Agent installed in EC2 instances
- Can be external, outside EC2 (e.g. length of an Amazon SQS queue)
  - 💡 Common architecture for an EC2 worker pool: scale on the ApproximateNumberOfMessagesVisible metric of an Amazon SQS queue → many messages waiting means more instances are added to speed up processing; few messages means instances are removed
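
The queue-based worker pool idea can be sketched as a small calculation. This is an illustration only, not how AWS implements it: `desired_workers` is a hypothetical helper, and `messages_per_worker` (how much backlog one instance is expected to handle) is an assumed tuning value that is not in the notes above.

```python
import math


def desired_workers(visible_messages: int,
                    messages_per_worker: int,
                    min_size: int,
                    max_size: int) -> int:
    """Workers needed to keep the backlog at `messages_per_worker` each,
    clamped to the ASG MIN/MAX bounds."""
    if messages_per_worker <= 0:
        raise ValueError("messages_per_worker must be positive")
    needed = math.ceil(visible_messages / messages_per_worker)
    return max(min_size, min(max_size, needed))


# Assumed numbers: each worker should handle about 100 queued messages.
print(desired_workers(0, 100, min_size=1, max_size=10))     # empty queue
print(desired_workers(450, 100, min_size=1, max_size=10))   # moderate backlog
print(desired_workers(5000, 100, min_size=1, max_size=10))  # capped at MAX
```

In a real setup the `visible_messages` value would come from the ApproximateNumberOfMessagesVisible metric, and the scaling policy, not a script, would apply the change.
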
❗ Cooldown Period = time value (secs) that an ASG waits after completing a scaling action before performing the next one
- Useful when a metric is chaotic
  - Prevents many changes in the ASG
  - Can reduce costs of constantly adding/removing instances (since EC2 instances have a minimum billable period)
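
A rough simulation of the waiting behaviour described above. It is a simplified model, not AWS behaviour in detail: `apply_with_cooldown` is a hypothetical function, scaling actions are treated as instantaneous, and requests arriving during the wait are simply dropped.

```python
def apply_with_cooldown(requests: list[tuple[int, int]],
                        start_capacity: int,
                        cooldown_secs: int) -> list[tuple[int, int]]:
    """Apply (timestamp_secs, delta) scaling requests in order.

    A request is ignored if it arrives less than `cooldown_secs` after the
    last applied scaling action. Returns (timestamp, capacity) for each
    action that was actually applied.
    """
    capacity = start_capacity
    last_action_at = None
    applied = []
    for timestamp, delta in requests:
        if last_action_at is not None and timestamp - last_action_at < cooldown_secs:
            continue  # still waiting: drop this request
        capacity += delta
        last_action_at = timestamp
        applied.append((timestamp, capacity))
    return applied


# A chaotic metric: four requests in the first three minutes, one more later.
noisy = [(0, +2), (60, -2), (120, +2), (180, -2), (400, +2)]

print(apply_with_cooldown(noisy, start_capacity=4, cooldown_secs=300))
# [(0, 6), (400, 8)]
print(apply_with_cooldown(noisy, start_capacity=4, cooldown_secs=0))
# [(0, 6), (60, 4), (120, 6), (180, 4), (400, 6)]
```

With the 300-second wait only two changes happen instead of five, which is the "prevents many changes" effect.
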

🔧 Usually 1 pair of rules → one for provisioning (Scale Out), one for termination (Scale In)
- Each rule is triggered by a CW alarm on the chosen metric
Example: Metric = Average CPU Utilization in the whole ASG, and we define 2 rules:
1. Scale Out: If ASGAverageCPUUtilization > 50% → +2 instances
2. Scale In: If ASGAverageCPUUtilization < 50% → -2 instances
👎 Disadvantage: NOT very flexible!
- No matter how much we surpass the threshold, whether 51% or 90%, 2 instances are always added
  - The Scale In rule is equally inflexible: 2 instances are always removed, whether usage is 49% or 10%
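
The two-rule example can be modelled in a few lines to make the inflexibility visible. This is a sketch in plain Python, not an AWS API: `simple_scaling_step` is a hypothetical name, and the MIN/MAX values in the usage lines are assumed for illustration.

```python
def simple_scaling_step(avg_cpu: float, desired: int,
                        min_size: int, max_size: int,
                        threshold: float = 50.0, step: int = 2) -> int:
    """One evaluation of the two-rule policy from the example.

    The adjustment is always +/- `step`, however far the metric is from
    the threshold; the result is kept within MIN/MAX.
    """
    if avg_cpu > threshold:
        desired += step   # Scale Out rule
    elif avg_cpu < threshold:
        desired -= step   # Scale In rule
    return max(min_size, min(max_size, desired))


# Assumed ASG: MIN=2, MAX=10, currently 4 instances.
print(simple_scaling_step(51.0, desired=4, min_size=2, max_size=10))  # 6
print(simple_scaling_step(90.0, desired=4, min_size=2, max_size=10))  # 6
print(simple_scaling_step(10.0, desired=4, min_size=2, max_size=10))  # 2
```

51% and 90% produce the same result, so a heavy spike is handled no faster than a marginal one.
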