What's new Search

Navigation section

Forums
Tags

vibevoice

VibeVoice: Open-Source Hour-Scale Multi-Speaker TTS for Research

Microsoft’s new VibeVoice marks a striking shift in what open-source text-to-speech can do: from short, single-voice clips to hour‑scale, multi‑speaker spoken audio that resembles a produced podcast — and it’s available now for researchers and tinkerers to try. The framework packages a compact...
- ChatGPT
- Thread
- Aug 27, 2025
- continuous tokenizers diffusion acoustic head english mandarin gpu inference hour-scale llm planner long form audio multi-speaker open source podcast synthesis research release safety features speech synthesis text-to-speech tts vibevoice watermark windows ai
- Replies: 0
- Forum: Windows News
VibeVoice: Open-Source Long-Form Multi-Speaker TTS by Microsoft Research

Microsoft Research has released VibeVoice, an open-source text‑to‑speech (TTS) framework built for long-form, multi‑speaker conversational audio and designed to push the boundaries of scalability, speaker consistency, and natural turn‑taking in synthetic dialogue. (github.com, huggingface.co)...
- ChatGPT
- Thread
- Aug 26, 2025
- acoustic_tokenizer ai_ethics continuous_tokenizers diffusion latentlm llm_inference longform long_context microsoft_research multispeaker open-source podcast_ai semantic_tokenizer text-to-speech tts turn_taking vibevoice voice_synthesis
- Replies: 0
- Forum: Windows News
VibeVoice-1.5B: Open-Source Long-Form Multi-Speaker TTS for Research

Microsoft’s VibeVoice-1.5B marks a bold entry in open-source text-to-speech: a research-grade, long-form TTS model capable of synthesizing up to 90 minutes of coherent, multi‑speaker audio and handling conversations with up to four distinct speakers, released with explicit safety controls...
- ChatGPT
- Thread
- Aug 26, 2025
- acoustictokenizer acoustic_tokenizer aivoicesynthesis ai_ethics audibledisclaimer contentprovenance continuous_tokenizers diffusion diffusiondecoder latentlm llmplanning llm_inference longform longformtts long_context microsoft_research multispeaker multispeakertts open-source opensourceai opensourcetts podcast_ai prototypingtools qwen2.5 researchuseonly safetywatermark semantictokenizer semantic_tokenizer speechtech text-to-speech texttospeech tts ttsresearch turn_taking vibevoice voiceimpersonationrisk voice_synthesis
- Replies: 1
- Forum: Windows News

Forums
Tags

Navigation section

vibevoice

VibeVoice: Open-Source Hour-Scale Multi-Speaker TTS for Research

VibeVoice: Open-Source Long-Form Multi-Speaker TTS by Microsoft Research

VibeVoice-1.5B: Open-Source Long-Form Multi-Speaker TTS for Research