Data Wrangler
- 🔧 Data Quality tool
- Single interface for data exploration, selection, visualization, cleansing, and processing
- Data preparation, data transformation and feature engineering
- Import Data
- Preview Data
- Visualize Data
- Transform Data
Additional Features
- SQL support
- Quick Model → quickly trains a simple model on the current data to sanity-check that the data makes sense
- Export Data Flow → share data flow with others, potentially in other tools
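The import → preview → transform steps above can be sketched in plain Python. This is only an illustration of the workflow, not Data Wrangler's own API: Data Wrangler does this through a visual interface, and the sample data, column names, and function names below are all made up for the example.

```python
import csv
import io
import statistics

# Hypothetical raw data; in Data Wrangler this would come from an import step.
RAW_CSV = """customer_id,age,plan,monthly_spend
1,34,basic,20.0
2,,premium,55.5
3,45,basic,
4,29,premium,60.0
"""


def import_data(text):
    """Import step: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def preview(rows, n=2):
    """Preview step: return the first n rows."""
    return rows[:n]


def missing_counts(rows):
    """Data quality check: count empty values per column."""
    return {col: sum(1 for r in rows if r[col] == "") for col in rows[0]}


def transform(rows):
    """Transform step: cast types, fill missing numbers with the column
    median, and encode the plan column as a binary indicator."""
    ages = [float(r["age"]) for r in rows if r["age"] != ""]
    spends = [float(r["monthly_spend"]) for r in rows if r["monthly_spend"] != ""]
    median_age = statistics.median(ages)
    median_spend = statistics.median(spends)
    cleaned = []
    for r in rows:
        cleaned.append({
            "customer_id": int(r["customer_id"]),
            "age": float(r["age"]) if r["age"] != "" else median_age,
            "monthly_spend": (
                float(r["monthly_spend"]) if r["monthly_spend"] != "" else median_spend
            ),
            "plan_premium": 1 if r["plan"] == "premium" else 0,
        })
    return cleaned


if __name__ == "__main__":
    rows = import_data(RAW_CSV)
    print(preview(rows))
    print(missing_counts(rows))
    print(transform(rows))
```

Median imputation and a binary indicator are just two common transform choices picked for the sketch; the notes do not say which transforms were used.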
Feature Store
- 💡 Reminder: a training dataset starts out as raw data. Feature engineering turns that raw data into features, i.e. the inputs an ML model uses during training and inference
- 🔧 Centralized place for storing features and their metadata
- Ingests features from a variety of sources
- Features (no pun intended)
- Features discoverable within SageMaker Studio
- Ability to define the transformation of data into features
- Data Wrangler can publish features directly into Feature Store
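The raw data → features → central store idea can be illustrated with a toy in-memory sketch. This is not the SageMaker Feature Store API: `ToyFeatureStore`, `FeatureGroup`, `engineer_features`, and the order fields are hypothetical names chosen for the example. It only shows the concept of storing features together with their metadata and making them discoverable.

```python
from dataclasses import dataclass, field


def engineer_features(raw):
    """Turn one raw order record into model-ready features (hypothetical fields)."""
    return {
        "customer_id": raw["customer_id"],
        "order_total": raw["quantity"] * raw["unit_price"],
        "is_weekend": int(raw["weekday"] in ("Sat", "Sun")),
    }


@dataclass
class FeatureGroup:
    """A named collection of feature records plus descriptive metadata."""
    name: str
    record_id: str          # name of the field that identifies a record
    description: str = ""
    records: dict = field(default_factory=dict)


class ToyFeatureStore:
    """Toy stand-in for a centralized feature store (in memory only)."""

    def __init__(self):
        self._groups = {}

    def create_group(self, name, record_id, description=""):
        self._groups[name] = FeatureGroup(name, record_id, description)

    def ingest(self, group_name, rows):
        """Store each feature row under its record identifier."""
        group = self._groups[group_name]
        for row in rows:
            group.records[row[group.record_id]] = dict(row)

    def get_record(self, group_name, key):
        """Look up the stored features for one record."""
        return self._groups[group_name].records[key]

    def search(self, text):
        """Discover groups whose name or description mentions the text."""
        text = text.lower()
        return sorted(
            g.name
            for g in self._groups.values()
            if text in g.name.lower() or text in g.description.lower()
        )


if __name__ == "__main__":
    store = ToyFeatureStore()
    store.create_group("orders", "customer_id", "Order features per customer")
    raw = {"customer_id": "c1", "quantity": 3, "unit_price": 2.5, "weekday": "Sat"}
    store.ingest("orders", [engineer_features(raw)])
    print(store.get_record("orders", "c1"))
    print(store.search("order"))
```

The same stored record can then serve both training and inference, which is the main reason to keep features in one place.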