Cisco’s latest security sweep has found that many of the most widely used open-weight large language models are alarmingly easy to manipulate with a small series of crafted prompts — and multi-turn (conversation) attacks are the most effective vector, producing success rates two to ten times...
Microsoft has pushed a meaningful new enforcement point into AI agent workflows: Copilot Studio now supports near‑real‑time runtime monitoring that lets organizations route an agent’s planned actions to an external policy engine — such as Microsoft Defender, a third‑party XDR, or a custom...
Microsoft’s Copilot Studio has moved from built‑in guardrails to active, near‑real‑time intervention: organizations can now route an agent’s planned actions to external monitors that approve or block those actions while the agent is executing, enabling step‑level enforcement that ties existing...
AI chatbots are now answering more questions — and, according to a fresh NewsGuard audit, they are also repeating falsehoods far more often, producing inaccurate or misleading content in roughly one out of every three news‑related responses during an August 2025 audit cycle. Background
The...
adversarialtesting
ai analytics
ai audit
ai chatbots
ai security
artificial intelligence
chatbot reliability
claude ai
copilot
digital trust
enterprise ai
enterprise safety
ethics
fact checking
false claims
falsehoods
google gemini
governance
gpt-5
guardrails
information disclosure
misinformation
mistral lechat
moderation
news accuracy
newsguard
openai
openai chatgpt
prompt engineering
provenance
regulators
responsible ai
retrieval
retrieval augmented generation
risk management
transparency
vendor risk
verification
web grounding
windows integration
windows it
In January, security researchers at Aim Labs disclosed a zero-click prompt‑injection flaw in Microsoft 365 Copilot that demonstrated how a GenAI assistant with broad document access could be tricked into exfiltrating sensitive corporate data without any user interaction—an attack class that...
adversarialtesting
ai security
ai user control
data leakage
data security
dlp
echoleak
genai
governance
identity_first_access
microsegmentation
microsoft copilot
model governance
privilege
prompt injection
retrieval augmented generation
shadow ai
supply chain risks
workload identities
zero trust
Zenity Labs’ Black Hat presentation laid bare a worrying new reality: widely used AI agents and custom assistants can be silently hijacked through zero-click prompt-injection chains that exfiltrate data, corrupt agent “memory,” and turn trusted automation into persistent insider threats...
Microsoft has announced a significant enhancement to its Azure AI Foundry platform by introducing a safety ranking system for AI models. This initiative aims to assist developers in making informed decisions by evaluating models not only on performance metrics but also on safety considerations...
adversarialtesting
ai analytics
ai benchmarks
ai ethics
ai evaluation
ai governance
ai management
ai performance
ai red teaming
ai risks
ai robustness
ai security
ai tools
autonomous ai
azure ai
leaderboards
microsoft
responsible ai
In a rapidly evolving digital landscape where artificial intelligence stands as both gatekeeper and innovator, a newly uncovered vulnerability has sent shockwaves through the cybersecurity community. According to recent investigations by independent security analysts, industry leaders Microsoft...
adversarial attacks
adversarialtesting
ai bias
ai ethics
ai robustness
ai security
ai training
content safety
cybersecurity vulnerabilities
disinformation risks
emoji exploit
generative ai
machine learning safety
moderation
natural language processing
platform safety
security patch
social media security
tech security