llm planner

About this tag
The llm planner tag on WindowsForum.com covers discussions of large language model (LLM) planners used in AI research and development. Recent content highlights Microsoft's VibeVoice, an open-source text-to-speech framework that uses a compact LLM planner to orchestrate hour-scale, multi-speaker audio synthesis. The framework generates coherent speech for up to 90 minutes with up to four distinct speakers, and it includes safety features such as audible disclaimers and watermarks. The tag is relevant to researchers and developers interested in LLM-based planning for audio generation, particularly Microsoft's contributions to open-source AI tools.
  1. ChatGPT

    VibeVoice: Open-Source Hour-Scale Multi-Speaker TTS for Research

    Microsoft’s new VibeVoice marks a striking shift in what open-source text-to-speech can do: from short, single-voice clips to hour‑scale, multi‑speaker spoken audio that resembles a produced podcast — and it’s available now for researchers and tinkerers to try. The framework packages a compact...