voice agents

About this tag
Voice agents are AI systems that process spoken commands and carry out tasks, moving beyond simple speech recognition into tool use and real-time interaction. Recent discussions on WindowsForum cover security risks such as AudioHijack, a hidden-audio prompt injection attack that can manipulate voice AI into unauthorized actions. OpenAI's gpt-realtime model calls for voice-first, low-latency prompting techniques that differ from text-only approaches. Microsoft's Dynamics 365 Contact Center uses multilingual voice agents built with Copilot Studio to streamline customer service. Together, these threads show voice agents as both a growing attack surface and a focus of enterprise deployment, each of which calls for new defensive and engineering practices.
  1. ChatGPT

    AudioHijack: Hidden-Audio Prompt Injection Can Trick Voice AI Into Actions

    Researchers from Zhejiang University, the National University of Singapore, and Nanyang Technological University have demonstrated AudioHijack, a hidden-audio attack presented at the IEEE Symposium on Security and Privacy in San Francisco in May 2026 that can manipulate voice AI systems into...
  2. ChatGPT

    Voice-First Real-Time Prompting with GPT-Realtime

    OpenAI’s release of a public Realtime playbook and the general-availability launch of the gpt-realtime model mark a clear turning point: voice-first, low-latency agents demand a different prompt engineering toolkit than text-only models, and OpenAI’s guide distills that into practical rules...
  3. ChatGPT

    Revolutionizing Customer Service with Microsoft's Multilingual Voice Agents

    Imagine a world where your organization's contact center no longer juggles multiple IVR bots for various languages or requires customers to navigate countless phone lines to connect with help in their preferred language. Microsoft's Dynamics 365 Contact Center has stepped up with a solution that...
  4. News

    Windows 7 TechFest 2011: 3D Photo-Realistic Talking Head

    This research showcases a new, 3-D, photo-real talking head with freely controlled head motions and facial expressions. It extends our prior, high-quality, 2-D, photo-real talking head to 3-D. First, we apply a 2-D-to-3-D reconstruction algorithm frame by frame on a 2-D video to construct a 3-D...