Ref: https://learn.cantrill.io/courses/1820301/lectures/42176908
Amazon Transcribe 101
- 🔧 Automatic Speech Recognition (ASR) service
- Audio (input) → Text (output)
- Pay-per-use: billed per second of transcribed audio
- 💡 ASR is a deep learning (DL) process
- Features:
- Automatic language identification for multi-lingual audio
- Language customization
- Filters for privacy
- e.g. filter out PII (identify and redact PII)
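- Vocabulary filtering for audience-appropriate language (e.g. mask or remove profanity)
- Speaker identification (speaker diarization: labels who spoke when)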
- Improve accuracy for domain-specific, non-standard terms with:
- Custom vocabularies (words)
- Custom language models (context)
- Train with domain-specific text so Transcribe learns the context in which domain-specific words appear
- 💡 Use both for the highest transcription accuracy
- Use cases
- Full-text indexing of audio → makes spoken content searchable
- Meeting notes
- Subtitles/captions and transcripts
- Amazon Transcribe Call Analytics → Phone call analytics
- Call characteristics, summarization, categories, sentiment
- Amazon Transcribe Medical
- Integration with other apps and services
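To make the features above concrete, here is a minimal sketch of how they might map onto a batch transcription request. It assumes the boto3 `start_transcription_job` call; the helper names (`build_transcription_request`, `start_job`), the job name, bucket, and vocabulary name are all hypothetical. Not every option is available for every language, so treat the combinations as illustrative and check the service documentation before relying on them.

```python
from typing import Any, Dict, Optional


def build_transcription_request(
    job_name: str,
    media_uri: str,
    language_code: Optional[str] = "en-US",
    redact_pii: bool = False,
    max_speakers: Optional[int] = None,
    vocabulary_name: Optional[str] = None,
    language_model_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for boto3's transcribe.start_transcription_job."""
    request: Dict[str, Any] = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
    }

    if language_code is None:
        # Custom vocabularies and models are tied to one language, so this
        # sketch refuses to combine them with automatic identification.
        if vocabulary_name or language_model_name:
            raise ValueError("customization requires an explicit language_code")
        request["IdentifyLanguage"] = True  # automatic language identification
    else:
        request["LanguageCode"] = language_code

    settings: Dict[str, Any] = {}
    if max_speakers is not None:
        # Speaker diarization: label which speaker said what.
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    if vocabulary_name is not None:
        settings["VocabularyName"] = vocabulary_name  # custom vocabulary (words)
    if settings:
        request["Settings"] = settings

    if language_model_name is not None:
        # Custom language model (context).
        request["ModelSettings"] = {"LanguageModelName": language_model_name}

    if redact_pii:
        # Privacy filter: identify and redact PII in the transcript.
        request["ContentRedaction"] = {
            "RedactionType": "PII",
            "RedactionOutput": "redacted",
        }
    return request


def start_job(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request. Needs boto3, AWS credentials, and a real S3 object."""
    import boto3  # imported lazily so the builder runs without boto3 installed

    return boto3.client("transcribe").start_transcription_job(**request)
```

Keeping the request builder separate from the network call means the option mapping can be inspected or unit-tested offline; only `start_job` touches AWS.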
Toxicity Detection
Ref: https://www.udemy.com/course/aws-ai-practitioner-certified/learn/lecture/44887629
- Transcribe can detect and score voice-based toxicity
- Leverages both speech cues (tone, pitch) & text cues
- Toxicity categories: sexual harassment, hate speech, threat, abuse, profanity, insult, graphic…
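As a rough sketch of how toxicity scoring might be used: the request below enables toxicity detection through the boto3 `ToxicityDetection` parameter, and a helper filters scored segments by a threshold. The output layout (`results.toxicity_detection` with `text`, `toxicity`, and `categories` fields) is an assumption about the transcript JSON, and `SAMPLE_OUTPUT`, the helper names, and the 0.5 threshold are hand-made illustrations, not real service output.

```python
from typing import Any, Dict, List


def build_toxicity_request(job_name: str, media_uri: str) -> Dict[str, Any]:
    """Keyword arguments for start_transcription_job with toxicity detection on."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",  # assumption: toxicity detection targets US English
        "Media": {"MediaFileUri": media_uri},
        "ToxicityDetection": [{"ToxicityCategories": ["ALL"]}],
    }


def flag_toxic_segments(
    transcript: Dict[str, Any], threshold: float = 0.5
) -> List[Dict[str, Any]]:
    """Return segments whose overall toxicity score meets the threshold."""
    flagged = []
    segments = transcript.get("results", {}).get("toxicity_detection", [])
    for segment in segments:
        if segment["toxicity"] >= threshold:
            categories = segment["categories"]
            flagged.append(
                {
                    "text": segment["text"],
                    "score": segment["toxicity"],
                    # Category with the highest individual score.
                    "top_category": max(categories, key=categories.get),
                }
            )
    return flagged


# Hand-made sample that mimics the assumed output shape (not real data).
SAMPLE_OUTPUT = {
    "results": {
        "toxicity_detection": [
            {
                "text": "example of an insulting remark",
                "toxicity": 0.82,
                "categories": {"profanity": 0.10, "insult": 0.90, "graphic": 0.02},
            },
            {
                "text": "thanks for calling, have a nice day",
                "toxicity": 0.03,
                "categories": {"profanity": 0.01, "insult": 0.02, "graphic": 0.01},
            },
        ]
    }
}

flagged_segments = flag_toxic_segments(SAMPLE_OUTPUT)
```

With the sample above, only the first segment passes the threshold, and its highest-scoring category is `insult`.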