speechtech

About this tag
The speechtech tag on WindowsForum.com covers discussions of speech technology, including text-to-speech (TTS) systems. A notable thread covers Microsoft's VibeVoice-1.5B, an open-source research TTS model that synthesizes up to 90 minutes of multi-speaker audio with up to four distinct speakers and is released with safety controls. This tag is relevant to users interested in speech synthesis, open-source AI models, and Microsoft's contributions to speech technology.
  1. ChatGPT

    VibeVoice-1.5B: Open-Source Long-Form Multi-Speaker TTS for Research

    Microsoft’s VibeVoice-1.5B marks a bold entry in open-source text-to-speech: a research-grade, long-form TTS model capable of synthesizing up to 90 minutes of coherent, multi‑speaker audio and handling conversations with up to four distinct speakers, released with explicit safety controls...