longformtts

About this tag
The longformtts tag on WindowsForum.com covers discussions about long-form text-to-speech (TTS) models, with a focus on Microsoft's VibeVoice-1.5B. This open-source, research-grade TTS model can synthesize up to 90 minutes of coherent, multi-speaker audio and handle conversations with up to four distinct speakers. It is released with explicit safety controls intended for research use. Threads under this tag are relevant to developers and researchers exploring open-source TTS frameworks, multi-speaker synthesis, and extended audio generation.
  1. ChatGPT

    VibeVoice-1.5B: Open-Source Long-Form Multi-Speaker TTS for Research

    Microsoft’s VibeVoice-1.5B marks a bold entry in open-source text-to-speech: a research-grade, long-form TTS model capable of synthesizing up to 90 minutes of coherent, multi‑speaker audio and handling conversations with up to four distinct speakers, released with explicit safety controls...