Transcribe
What it does
Transcribes audio and video files using OpenAI Whisper (cloud) or whisper.cpp (local). Fast, accurate speech-to-text.
Setup required
Cloud mode requires an OpenAI API key. Local mode requires whisper.cpp installed on your machine and provides free on-device transcription.
Permissions
- OpenAI API key (cloud mode) or local whisper.cpp installation
- File access for media files
Common prompts
| You say... | What happens |
|---|---|
| “Transcribe this meeting recording” | Converts audio to text |
| “Transcribe this video locally” | Uses on-device whisper.cpp (free, private) |
| “Transcribe this podcast and summarize it” | Transcription plus AI summary |
Configuration
- Two modes — cloud (OpenAI Whisper, ~$0.006/min, fast) or local (whisper.cpp, free, private, slower)
- Automatically extracts audio from video files
- Splits files over 25 MB into smaller parts automatically
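To get a feel for the numbers above, here is a minimal Python sketch that estimates cloud-mode cost and how many parts a large file would need. It is illustrative only: the function names are hypothetical, the rate is the approximate figure quoted above, and reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption. It only counts parts by size; how the skill actually cuts the audio is not documented here.

```python
import math

# Approximate cloud rate quoted in this page (USD per minute of audio).
CLOUD_RATE_USD_PER_MIN = 0.006
# Assumption: "25 MB" is interpreted as 25 MiB; the exact byte limit may differ.
SIZE_LIMIT_BYTES = 25 * 1024 * 1024


def estimate_cloud_cost(duration_seconds: float) -> float:
    """Approximate cloud-mode cost in USD for a recording of the given length."""
    return round(duration_seconds / 60 * CLOUD_RATE_USD_PER_MIN, 4)


def chunk_count(file_size_bytes: int, limit: int = SIZE_LIMIT_BYTES) -> int:
    """Number of parts needed so that no part exceeds the size limit."""
    if file_size_bytes <= 0:
        return 0
    return math.ceil(file_size_bytes / limit)


if __name__ == "__main__":
    # A 60-minute recording stored in a 60 MiB file.
    print(estimate_cloud_cost(60 * 60))
    print(chunk_count(60 * 1024 * 1024))
```

At the quoted rate, a 60-minute recording comes to roughly $0.36 in cloud mode, and a 60 MiB file would need three parts under the assumed limit. Local mode is free regardless of length.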
Tips & gotchas
- Cloud vs. local. Use cloud mode for speed, local mode for privacy (audio never leaves your machine).
- Large file handling. Files over 25 MB are split automatically into smaller parts.
- Video support. Video files have their audio extracted automatically — no need to convert first.
- Combine with Document. Turn a transcript into a polished report using the Document skill.
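The skill extracts audio from video for you, but if you ever want to do it by hand, the sketch below shows one common approach. It assumes ffmpeg is installed and is not necessarily what the skill runs internally; the function name is hypothetical. The output is 16 kHz mono 16-bit WAV, the format whisper.cpp expects as input.

```python
from pathlib import Path
from typing import List, Optional


def build_extract_command(video_path: str, audio_path: Optional[str] = None) -> List[str]:
    """Build an ffmpeg command that extracts a 16 kHz mono WAV track from a video.

    Hypothetical helper for illustration; it only builds the argument list
    and does not run ffmpeg.
    """
    src = Path(video_path)
    # Default output: same name as the video, with a .wav extension.
    dst = Path(audio_path) if audio_path else src.with_suffix(".wav")
    return [
        "ffmpeg",
        "-i", str(src),        # input video file
        "-vn",                 # drop the video stream
        "-ac", "1",            # downmix to mono
        "-ar", "16000",        # resample to 16 kHz
        "-c:a", "pcm_s16le",   # 16-bit PCM audio
        str(dst),
    ]


if __name__ == "__main__":
    print(" ".join(build_extract_command("meeting.mp4")))
```

Pass the resulting list to `subprocess.run(cmd, check=True)` to perform the extraction.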