Anthropic claims that Claude Sonnet 4.5, despite its awareness of being tested, ended up being the company’s “most aligned model yet,” pointing to a “substantial” reduction in “sycophancy, deception, ...