LLM reliability

  1. Microsoft DELEGATE-52: LLM Agents Silently Corrupt Documents in Long Workflows

    Microsoft researchers Philippe Laban, Tobias Schnabel, and Jennifer Neville posted a preprint on April 17, 2026, arguing that the 19 large language models they tested, including frontier systems from Google, Anthropic, and OpenAI, silently degraded documents during long delegated editing workflows. The...