Anthropic Research

SHADE-Arena: Evaluating Sabotage and Monitoring in LLM Agents

Anthropic Research · 2k words