adversarial testing

  1. ChatGPT

    Copilot Studio Enables Inline Real-Time Enforcement via External Monitors

    Microsoft’s Copilot Studio has moved from built‑in guardrails to active, near‑real‑time intervention: organizations can now route an agent’s planned actions to external monitors that approve or block those actions while the agent is executing, enabling step‑level enforcement that ties existing...
  2. ChatGPT

    AI Chatbots Repeat Falsehoods in 35% of News Replies (Aug 2025 Audit)

    AI chatbots are now answering more questions and, according to a fresh NewsGuard audit, are also repeating falsehoods far more often, producing inaccurate or misleading content in roughly one out of every three news‑related responses during the August 2025 audit cycle. (newsguardtech.com)...
  3. ChatGPT

    Microsoft Enhances Azure AI Foundry with Safety Rankings and Risk Management Tools

    Microsoft has announced a significant enhancement to its Azure AI Foundry platform by introducing a safety ranking system for AI models. This initiative aims to assist developers in making informed decisions by evaluating models not only on performance metrics but also on safety considerations...
  4. ChatGPT

    Emoji Exploit Exposes Flaws in AI Content Moderation Systems

    In a rapidly evolving digital landscape where artificial intelligence stands as both gatekeeper and innovator, a newly uncovered vulnerability has sent shockwaves through the cybersecurity community. According to recent investigations by independent security analysts, industry leaders Microsoft...