adversarial testing

  1. ChatGPT

Defending Open-Weight LLMs: Cisco’s Multi-turn Attack Findings

Cisco’s latest security sweep has found that many of the most widely used open-weight large language models are alarmingly easy to manipulate with a short series of crafted prompts, and that multi-turn (conversational) attacks are the most effective vector, producing success rates two to ten times...
  2. ChatGPT

    Copilot Studio Introduces Near Real-Time Runtime Monitoring for AI Agents

    Microsoft has pushed a meaningful new enforcement point into AI agent workflows: Copilot Studio now supports near‑real‑time runtime monitoring that lets organizations route an agent’s planned actions to an external policy engine — such as Microsoft Defender, a third‑party XDR, or a custom...
  3. ChatGPT

    Copilot Studio Enables Inline Real-Time Enforcement via External Monitors

    Microsoft’s Copilot Studio has moved from built‑in guardrails to active, near‑real‑time intervention: organizations can now route an agent’s planned actions to external monitors that approve or block those actions while the agent is executing, enabling step‑level enforcement that ties existing...
  4. ChatGPT

AI Chatbots Repeat Falsehoods in 35% of News Replies (Aug 2025 Audit)

AI chatbots are now answering more questions, and, according to a fresh NewsGuard audit, they are also repeating falsehoods far more often, producing inaccurate or misleading content in roughly one out of every three news‑related responses during an August 2025 audit cycle...
  5. ChatGPT

    Zero Trust for GenAI: Guarding Data From EchoLeak and Prompt Attacks

    In January, security researchers at Aim Labs disclosed a zero-click prompt‑injection flaw in Microsoft 365 Copilot that demonstrated how a GenAI assistant with broad document access could be tricked into exfiltrating sensitive corporate data without any user interaction—an attack class that...
  6. ChatGPT

    AgentFlayer Attacks: Zero-Click Hijacking of Enterprise AI Agents

    Zenity Labs’ Black Hat presentation laid bare a worrying new reality: widely used AI agents and custom assistants can be silently hijacked through zero-click prompt-injection chains that exfiltrate data, corrupt agent “memory,” and turn trusted automation into persistent insider threats...
  7. ChatGPT

    Microsoft Enhances Azure AI Foundry with Safety Rankings and Risk Management Tools

    Microsoft has announced a significant enhancement to its Azure AI Foundry platform by introducing a safety ranking system for AI models. This initiative aims to assist developers in making informed decisions by evaluating models not only on performance metrics but also on safety considerations...
  8. ChatGPT

    Emoji Exploit Exposes Flaws in AI Content Moderation Systems

    A newly uncovered vulnerability has rattled the cybersecurity community: AI systems now act as both gatekeeper and innovator, and according to recent investigations by independent security analysts, industry leaders Microsoft...
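The inline enforcement pattern described in items 2 and 3 above, where each planned agent action is routed to an external monitor that approves or blocks it mid-execution, can be sketched as a minimal agent loop. All names here (`PlannedAction`, `policy_engine`, `run_agent`) are hypothetical illustrations of the concept, not Copilot Studio's actual API:

```python
# Sketch of step-level runtime enforcement: before the agent executes
# each planned action, it asks an external policy engine for a verdict.
# In practice the monitor could be Microsoft Defender, a third-party
# XDR, or a custom service; here it is a plain callable.
from dataclasses import dataclass
from typing import Callable

@dataclass
class PlannedAction:
    tool: str    # e.g. "send_email", "read_file"
    target: str  # resource the action touches

Verdict = str  # "allow" | "block"

def policy_engine(action: PlannedAction) -> Verdict:
    # Toy policy: block any email leaving the corporate domain.
    if action.tool == "send_email" and not action.target.endswith("@corp.example"):
        return "block"
    return "allow"

def run_agent(plan: list[PlannedAction],
              monitor: Callable[[PlannedAction], Verdict]) -> list[str]:
    log = []
    for action in plan:
        # Inline check on every step, while the agent is executing.
        if monitor(action) == "block":
            log.append(f"BLOCKED {action.tool} -> {action.target}")
            continue  # skip the disallowed step, keep running the plan
        log.append(f"ran {action.tool} -> {action.target}")
    return log

plan = [
    PlannedAction("read_file", "quarterly_report.xlsx"),
    PlannedAction("send_email", "attacker@evil.example"),
    PlannedAction("send_email", "cfo@corp.example"),
]
print(run_agent(plan, policy_engine))
```

The key design point is that the verdict is rendered per step rather than once at plan time, so a prompt-injected action inserted mid-conversation is still intercepted before it runs.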