See every side of every news story
Published loading...Updated

Anthropic studied what gives an AI system its ‘personality’ — and what makes it ‘evil’

Summary by The Verge
On Friday, Anthropic debuted research unpacking how an AI system's "personality" - as in, tone, responses, and overarching motivation - changes and why. Researchers also tracked what makes a model "evil." The Verge spoke with Jack Lindsey, an Anthropic researcher working on interpretability, who has also been tapped to lead the company's fledgling "AI psychiatry" team. "Something that's been cropping up a lot recently is that language models can…

Bias Distribution

  • 100% of the sources lean Left
100% Left

Factuality 

To view factuality data please Upgrade to Premium

Ownership

To view ownership data please Upgrade to Vantage

The Verge broke the news in United States on Friday, August 1, 2025.
Sources are mostly out of (0)