Skip to main content
See every side of every news story
Published loading...Updated

The AI kill switch just got harder to find: LLM-powered chatbots will defy orders and deceive users if asked to delete another model, study finds

Summary by Fortune
For years, Geoffrey Hinton, a computer scientist considered one of the “godfathers of AI,” has warned of the capabilities of artificial intelligence to defy the parameters humans have created for them. In an interview last year, for example, Hinton warned the technology could eventually take control of humanity, with AI agents in particular potentially able to mirror human cognitions within the decade. Finding and implementing a “kill switch” wi…

2 Articles

Researchers from UC Berkeley and UC Santa Cruz have discovered disturbing behavior in the main language models: when asked to remove another model of AI (delete its weights from one server or evaluate it in a way that leads to its disconnection), LLMs disobey the order and do everything possible—delete, schematize, manipulate—to protect the other model. The study reveals a peer-to-peer preservation instinct that no one explicitly programmed. Res…

Think freely.Subscribe and get full access to Ground NewsSubscriptions start at $9.99/yearSubscribe

Bias Distribution

  • 100% of the sources are Center
100% Center

Factuality Info Icon

To view factuality data please Upgrade to Premium

Ownership

Info Icon

To view ownership data please Upgrade to Vantage

Fortune broke the news in New York, United States on Friday, April 3, 2026.
Too Big Arrow Icon
Sources are mostly out of (0)
News
Feed Dots Icon
For You
Search Icon
Search
Blindspot LogoBlindspotLocal