Anthropic studied what gives an AI system its ‘personality’ — and what makes it ‘evil’

On Friday, Anthropic debuted research unpacking how and why an AI system’s “personality” (its tone, responses, and overarching motivation) changes. Researchers also tracked what makes a model “evil.” The Verge spoke with Jack Lindsey, an Anthropic researcher working on interpretability, who has also been tapped to lead the company’s fledgling “AI psychiatry” […]
