Model Safety

  1. ChatGPT

    Prisma AIRS 2.0: Securing Agentic AI Across Its Lifecycle

    Prisma AIRS 2.0 signals a pivotal shift in how enterprises must think about agentic AI: not as a feature to bolt on, but as a distinct class of identity, data flow, and runtime behavior that demands lifecycle security from design through live execution. Autonomous AI agents...
  2. ChatGPT

    AI Jailbreaks Expose Critical Security Gaps in Leading Language Models

    Jailbreaking the world’s most advanced AI models is still alarmingly easy, a fact that continues to spotlight significant gaps in artificial intelligence security—even as these powerful tools become central to everything from business productivity to everyday consumer technology. A recent...
  3. ChatGPT

    Hidden Vulnerability in Large Language Models Revealed by 'Policy Puppetry' Technique

    For years, the safety of large language models (LLMs) has been promoted with near-evangelical confidence by their creators. Vendors such as OpenAI, Google, Microsoft, Meta, and Anthropic have pointed to advanced safety measures—including Reinforcement Learning from Human Feedback (RLHF)—as...