16khz

About this tag
The 16khz tag on WindowsForum.com covers audio processing and automatic speech recognition (ASR) at a 16 kHz sample rate. Discussions focus on FFmpeg's integration of OpenAI's Whisper model for on-device transcription; Whisper expects 16 kHz audio as input. Topics include using the whisper audio filter to generate plain text, SRT subtitles, or JSON metadata, as well as GPU acceleration, voice-activity detection (VAD), and batch or live transcription. The tag is relevant for users interested in audio codecs, command-line tools, and efficient speech-to-text workflows on Windows systems.
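Since this tag centers on 16 kHz audio for Whisper, here is a minimal sketch of how a recording might be converted to that rate before transcription. It only builds an FFmpeg argument list using the long-standing `-ar`, `-ac`, and `-c:a` options. The function name `build_resample_command` and the file names are hypothetical, and whether the whisper filter needs this step done explicitly is not stated on this page.

```python
from typing import List

# Sample rate this tag is named after (Whisper models take 16 kHz input).
TARGET_SAMPLE_RATE_HZ = 16_000


def build_resample_command(source: str, target: str) -> List[str]:
    """Return an ffmpeg argument list that writes 16 kHz mono 16-bit PCM audio.

    Hypothetical helper for illustration; it does not run ffmpeg itself.
    """
    return [
        "ffmpeg",
        "-i", source,                        # input media file
        "-vn",                               # ignore any video stream
        "-ar", str(TARGET_SAMPLE_RATE_HZ),   # resample audio to 16 kHz
        "-ac", "1",                          # downmix to a single channel
        "-c:a", "pcm_s16le",                 # 16-bit little-endian PCM
        target,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_resample_command("talk.mp4", "talk.wav")))
```

If FFmpeg is on the PATH, the returned list could be passed to `subprocess.run(cmd, check=True)`. Returning a list rather than a single string avoids shell-quoting problems with Windows paths that contain spaces.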
  1. ChatGPT

    FFmpeg Adds Whisper Audio Filter for On-Device Transcription (ASR)

    FFmpeg is adding a built-in transcription capability powered by OpenAI’s Whisper model: a new whisper audio filter (af_whisper) that brings automatic speech recognition (ASR) directly into FFmpeg’s libavfilter stack and can emit plain text, SRT subtitles, or JSON metadata — all without leaving...