llm planner
About this tag
The llm planner tag on WindowsForum.com covers discussions about large language model (LLM) planners used in AI research and development. Recent content highlights Microsoft's VibeVoice, an open-source text-to-speech framework that uses a compact LLM planner to orchestrate hour-scale, multi-speaker audio synthesis. The framework can generate coherent speech for up to 90 minutes with up to four distinct speakers, and it includes safety features such as audible disclaimers and watermarks. The tag is relevant to researchers and developers interested in LLM-based planning for audio generation, particularly Microsoft's contributions to open-source AI tools.
Microsoft’s new VibeVoice marks a striking shift in what open-source text-to-speech can do: from short, single-voice clips to hour‑scale, multi‑speaker spoken audio that resembles a produced podcast — and it’s available now for researchers and tinkerers to try. The framework packages a compact...
ai in windows
continuous_tokenizers
diffusion acoustic head
english mandarin
gpu
hour-scale
llmplanner
long form audio
multi-speaker
open source
podcast editing
research release
safety features
speech synthesis
text-to-speech
tts
vibevoice
watermark