research release
About this tag
The research release tag on WindowsForum.com covers Microsoft's VibeVoice, an open-source text-to-speech framework designed for hour-scale, multi-speaker audio synthesis. This research release packages a compact LLM planner with continuous tokenizers and a diffusion-based acoustic decoder, enabling up to 90 minutes of coherent speech with up to four distinct speakers. It includes English and Mandarin demos, plus two safety measures: an audible disclaimer and an imperceptible watermark. The tag focuses on Microsoft's contributions to open-source TTS research, highlighting the framework's technical innovations and its availability to researchers and developers.
Microsoft’s new VibeVoice marks a striking shift in what open-source text-to-speech can do: from short, single-voice clips to hour‑scale, multi‑speaker spoken audio that resembles a produced podcast — and it’s available now for researchers and tinkerers to try. The framework packages a compact...
ai in windows
continuous_tokenizers
diffusion acoustic head
english mandarin
gpu
hour-scale
llm planner
long form audio
multi-speaker
open source
podcast editing
researchrelease
safety features
speech synthesis
text-to-speech
tts
vibevoice
watermark