Intro to Athena from SAA-C03 Notes
Amazon Athena 101
Intro to Amazon Athena
Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/46730209
- 🔧 Serverless, interactive SQL-like query service
- Presto under the hood
- Supported compression formats: Snappy, Zlib, LZO, Gzip
- Integrations
  - Notebooks: Jupyter, Apache Zeppelin, RStudio
  - Visualization tools
    - QuickSight
  - Other tools integration via ODBC/JDBC
- Many formats supported:

| Athena-supported data format | Human readable? | Columnar? | Splittable? |
| --- | --- | --- | --- |
| CSV, TSV | Yes | No | No |
| JSON | Yes | No | No |
| ORC | No | Yes | Yes |
| Parquet | No | Yes | Yes |
| Avro | No | No | Yes |
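
Since the columnar, splittable formats (ORC, Parquet) are the ones that let Athena skip data, a common follow-up is converting a CSV-backed table to Parquet with a CTAS (`CREATE TABLE AS SELECT`) statement. The sketch below only builds the SQL text. The function name, table names, and bucket path are hypothetical, and it assumes the identifiers come from a trusted source (no escaping is done).

```python
def build_ctas_to_parquet(source_table: str, target_table: str, external_location: str) -> str:
    """Build an Athena CTAS statement that rewrites a table as Snappy-compressed Parquet.

    All arguments are illustrative placeholders; they are interpolated as-is,
    so only pass trusted values.
    """
    return (
        f"CREATE TABLE {target_table}\n"
        "WITH (\n"
        "  format = 'PARQUET',\n"
        "  write_compression = 'SNAPPY',\n"
        f"  external_location = '{external_location}'\n"
        ")\n"
        f"AS SELECT * FROM {source_table}"
    )


# Hypothetical names, for illustration only.
sql = build_ctas_to_parquet(
    source_table="sales_csv",
    target_table="sales_parquet",
    external_location="s3://example-bucket/sales_parquet/",
)
print(sql)
```

The resulting string could then be pasted into the Athena console or submitted through any client. Snappy is used here because it appears in the notes' list of supported compression formats.
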
Athena Features
Ref: https://www.udemy.com/course/aws-certified-machine-learning-engineer-associate-mla-c01/learn/lecture/46730253
Athena Workgroups
- 🔧 Organize or segregate Athena usage
  - e.g. users/teams/apps/workloads…
- Each workgroup can have its own:
  - Query history
    - Allows tracking costs per workgroup
  - Data scan limits
    - Each workgroup can have a different limit on how much data it can scan in a query
  - IAM policies
    - Athena query access control
  - Encryption settings
  - CW/SNS settings (for EDA)
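
The per-workgroup settings above (scan limit, encryption, CloudWatch metrics) map fairly directly onto the request that boto3's Athena `create_work_group` call accepts. The following is a minimal sketch that only builds the request dictionary. The helper name, workgroup name, bucket, and the 10 GiB limit are made-up example values, and SSE-S3 is just one of the available encryption options.

```python
def build_workgroup_request(name: str, output_location: str, scan_limit_bytes: int) -> dict:
    """Build keyword arguments for boto3's Athena create_work_group call.

    The helper itself is hypothetical; the dictionary keys follow the
    Athena CreateWorkGroup API.
    """
    return {
        "Name": name,
        "Configuration": {
            "ResultConfiguration": {
                # Where query results are written.
                "OutputLocation": output_location,
                # Encrypt query results (SSE_S3 chosen as an example).
                "EncryptionConfiguration": {"EncryptionOption": "SSE_S3"},
            },
            # Make the workgroup settings override client-side settings.
            "EnforceWorkGroupConfiguration": True,
            # Publish query metrics to CloudWatch.
            "PublishCloudWatchMetricsEnabled": True,
            # Per-query data scan limit for this workgroup.
            "BytesScannedCutoffPerQuery": scan_limit_bytes,
        },
    }


# Example values are assumptions, not taken from the notes.
request = build_workgroup_request(
    name="analytics-team",
    output_location="s3://example-athena-results/analytics/",
    scan_limit_bytes=10 * 1024**3,  # 10 GiB per query
)

# To actually create the workgroup (needs boto3 and AWS credentials):
#   import boto3
#   boto3.client("athena").create_work_group(**request)
```

Access control is handled separately, through IAM policies that reference the workgroup, so it does not appear in this request.
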
Cost Model
- Pay-as-you-go: charged only for the data scanned by queries
  - $5 per TB scanned
- ❗ Successful and cancelled queries are charged; failed queries are not
- ❗ No charge for Data Definition Language (DDL) → CREATE/ALTER/DROP TABLE etc.
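
The cost rules above can be sketched as a small estimator. This is a rough illustration, not a billing calculator: it assumes a decimal terabyte (10^12 bytes), uses the $5 figure from these notes (actual pricing varies by region), and ignores any rounding or per-query minimums. The function name is hypothetical; the state strings match the query states Athena reports (`SUCCEEDED`, `CANCELLED`, `FAILED`).

```python
TB = 10**12              # assumption: decimal terabyte
PRICE_PER_TB_USD = 5.0   # figure from the notes; check current regional pricing


def estimate_query_cost(bytes_scanned: int, state: str) -> float:
    """Estimate the charge in USD for a single query under the rules in the notes."""
    # Failed queries are not billed.
    if state == "FAILED":
        return 0.0
    # Succeeded and cancelled queries are billed for the bytes they scanned.
    return bytes_scanned / TB * PRICE_PER_TB_USD


full_scan = estimate_query_cost(10**12, "SUCCEEDED")       # scanned a full 1 TB
cancelled = estimate_query_cost(5 * 10**11, "CANCELLED")   # cancelled after 0.5 TB
failed = estimate_query_cost(10**12, "FAILED")             # failed query
print(full_scan, cancelled, failed)
```

Because the charge depends only on bytes scanned, anything that reduces the scan (columnar formats, compression, selecting fewer columns) lowers the bill proportionally.
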