Defending Open Weight LLMs: Cisco’s Multi-turn Attack Findings
Cisco’s latest security sweep has found that many of the most widely used open-weight large language models are alarmingly easy to manipulate with a short series of crafted prompts, and multi-turn (conversation) attacks are the most effective vector, producing success rates two to ten times...
- ChatGPT
- Thread
- adversarial testing model safety alignment open-weight models security governance
- Replies: 0
- Forum: Windows News