Deployment Safeguards
Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45286467 and https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45286781
Production Variants
- 🔧 Test out multiple models on live traffic
- Variant weight → how much traffic it should get
- e.g. 10% variant weight gets 10% of traffic
- Once confident in the new model's performance, ramp its variant weight up to 100% (see the sketch after this list)
- Needed for A/B tests & real-world performance validation
- Some models (e.g. recommender systems) can't be effectively evaluated offline
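A minimal boto3 sketch of this setup (the model, variant, and endpoint names and instance types here are placeholders, not from the course):

```python
import boto3

sm = boto3.client("sagemaker")

# Endpoint config with two production variants: SageMaker splits live
# traffic in proportion to the variant weights (0.9 : 0.1 here).
sm.create_endpoint_config(
    EndpointConfigName="ab-test-config",
    ProductionVariants=[
        {
            "VariantName": "prod-variant",
            "ModelName": "prod-model",        # existing SageMaker model
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.9,      # ~90% of traffic
        },
        {
            "VariantName": "challenger-variant",
            "ModelName": "challenger-model",  # new model under test
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.1,      # ~10% of traffic
        },
    ],
)

# Once confident in the challenger, shift all traffic to it in place,
# without redeploying the endpoint.
sm.update_endpoint_weights_and_capacities(
    EndpointName="ab-test-endpoint",
    DesiredWeightsAndCapacities=[
        {"VariantName": "prod-variant", "DesiredWeight": 0.0},
        {"VariantName": "challenger-variant", "DesiredWeight": 1.0},
    ],
)
```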
Shadow Tests
- 🔧 Compare performance of shadow variant to production variant
- Shadow variant receives a copy of a configurable sample of the production traffic; its responses are logged but never returned to callers
- You monitor the results in the SageMaker console and decide when to promote the shadow variant to production (see the sketch below)
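A sketch using `ShadowProductionVariants` in `create_endpoint_config` (names are placeholders; as I understand it, the shadow variant's weight sets the fraction of production requests that get mirrored to it):

```python
import boto3

sm = boto3.client("sagemaker")

sm.create_endpoint_config(
    EndpointConfigName="shadow-test-config",
    ProductionVariants=[
        {
            "VariantName": "production-variant",
            "ModelName": "prod-model",
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1.0,   # serves all live responses
        }
    ],
    # Shadow variant: receives mirrored copies of sampled requests;
    # its responses are logged for comparison, never returned to callers.
    ShadowProductionVariants=[
        {
            "VariantName": "shadow-variant",
            "ModelName": "candidate-model",
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.5,   # mirror ~50% of requests
        }
    ],
)
```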
Deployment Guardrails
- 🔧 Control shifting traffic to new models (see the sketch after this list)
- “Blue/Green” Deployments (Blue fleet: traffic on old model; Green fleet: traffic on new model)
- All at once: Shift all traffic → Monitor that everything looks good → Terminate blue fleet
- Canary: Shift a small portion of traffic → Monitor that traffic on new model looks good → Shift the rest of the traffic
- Linear: Shift traffic in equally sized steps, waiting between each step
- Auto-rollbacks if something goes wrong during deployment
- ❗ Only for real-time and asynchronous inference endpoints
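A sketch of a guarded (canary) blue/green update via `update_endpoint` with a `DeploymentConfig`; the endpoint, config, and alarm names are placeholders, and `Type` could instead be `"LINEAR"` or `"ALL_AT_ONCE"`:

```python
import boto3

sm = boto3.client("sagemaker")

# Blue/green update: provision a green fleet from the new endpoint
# config, send a 10% canary to it, wait, then shift the rest if healthy.
sm.update_endpoint(
    EndpointName="my-endpoint",
    EndpointConfigName="new-model-config",      # green fleet config
    DeploymentConfig={
        "BlueGreenUpdatePolicy": {
            "TrafficRoutingConfiguration": {
                "Type": "CANARY",
                "CanarySize": {"Type": "CAPACITY_PERCENT", "Value": 10},
                "WaitIntervalInSeconds": 600,   # bake time before full shift
            },
            "TerminationWaitInSeconds": 600,    # keep blue fleet briefly
        },
        # If this CloudWatch alarm fires mid-deployment, SageMaker rolls
        # traffic back to the blue fleet automatically.
        "AutoRollbackConfiguration": {
            "Alarms": [{"AlarmName": "endpoint-5xx-alarm"}]
        },
    },
)
```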
SageMaker + Docker
Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/45286781