Ref: https://www.udemy.com/course/aws-ai-practitioner-certified/learn/lecture/44887781
Clarify (part of SageMaker Studio)
- đź”§Â Evaluate and compare FMs, explain model outputs, detect bias
- Built-in or custom datasets
- Can evaluate human factors such as friendliness or humor
- Features
- Built-in metrics and algorithms
- Leverage an AWS-managed team or bring your own employees
Model Explainability
- Set of tools to explain how ML models make predictions
- Can help understand model prior to deployment
- Can help debug model predictions after deployment
- Increases trust and understanding of the model
- Examples
- “Why did the model predict a negative outcome such as a transaction fraud?”
- “Why did the model tag spam incorrectly?”
Detect (Human) Bias
- Detect and explain biases in your datasets and models
- Measure bias using statistical metrics
- Specify input features and bias will be automatically detected
- Kinds of biases
- Sampling bias
- Training data does not represent the full population fairly
- Leads to model that over-represents or disproportionately affects certain groups
- 💡 Example: an algorithm only flags people from specific ethnic groups. This is probably a sampling bias, and you need to perform data augmentation for imbalanced classes
- Measurement bias
- Tools or measurements used in data collection are flawed or skewed
- Observer bias
- Person collecting or interpreting the data has personal biases that affect the results
- Confirmation bias
- Individuals interpret or favor information that confirms their preconceptions
- âť—More applicable to human decision-making rather than automated model outputs
Ground Truth
- 🔧 RLHF – Reinforcement Learning from Human Feedback
- Model review, customization and evaluation by humans
- Aligns model to human preferences
- Reinforcement learning where human feedback is included in the “reward” function
- Human feedback for ML
- Creating or evaluating your models
- Data generation or annotation/labeling
- Reviewers: Amazon Mechanical Turk workers, your employees, or third-party vendors
- SageMaker Ground Truth Plus: specifically for labeling data