Apple study finds cutting-edge AI models from OpenAI and DeepSeek undergo 'complete collapse' when problems get too difficult
JUN 10 – Apple's study reveals that advanced AI reasoning models fail completely on complex puzzles, showing zero success beyond certain difficulty thresholds despite claims of genuine step-by-step reasoning.
- On June 6, Apple published a paper titled 'The Illusion of Thinking', reporting that large reasoning models fail at complex logic tasks such as the Tower of Hanoi.
- The research arose from testing reasoning-optimized models on puzzles and benchmarks, showing that their accuracy collapses as complexity increases even when the models are given the correct algorithm (see the sketch after this list).
- Apple found that models generate hallucinations up to 48% of the time and lack generalizable problem-solving skills, with performance dropping to zero beyond certain complexity thresholds.
- The authors found that model performance declines sharply and eventually collapses entirely once problems surpass a specific complexity level, indicating that current models depend more on pattern recognition than on genuine reasoning.
- These findings challenge claims about near-term artificial general intelligence and point to fundamental limits in large reasoning models, calling for more rigorous scientific analysis.
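For context, the "correct algorithm" at issue is short and well known. A minimal Python sketch of the classic recursive Tower of Hanoi solution (function and variable names are ours, not the paper's):

```python
def hanoi(n, source, target, spare, moves):
    """Move n disks from source to target, using spare as scratch space."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)  # clear the way for the largest disk
    moves.append((source, target))              # move the largest disk directly
    hanoi(n - 1, spare, target, source, moves)  # stack the rest back on top

moves = []
hanoi(10, "A", "C", "B", moves)
print(len(moves))  # 1023: the optimal solution takes 2**n - 1 moves
```

The move count doubles with each added disk, which is why even a modest increase in puzzle size makes the required reasoning chain dramatically longer.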
84 Articles
Benchmarking hallucinations: New metric tracks where multimodal reasoning models go wrong
Over the past decades, computer scientists have introduced increasingly sophisticated machine learning-based models, which can perform remarkably well on various tasks. These include multimodal large language models (MLLMs), ...


Do reasoning models really “think” or not? Apple research sparks lively debate, response
Ultimately, the big takeaway for ML researchers is that before proclaiming an AI milestone—or obituary—make sure the test itself isn’t flawed
AI Is Artificial, Not Intelligent
Will AI dominate and begin to think for itself? Apple’s recent research may have unintentionally revealed the truth many of us have long suspected: AI is artificial, but it is not intelligent. In a pre-WWDC 2025 paper, Apple exposed a fundamental flaw in the latest AI systems known as large reasoning models (LRMs). These systems — including OpenAI’s o1 and o3, DeepSeek R1, Claude 3.7 Sonnet Thinking, and Google’s Gemini Flash Thinking — demonst…
A controversial paper claims that models like ChatGPT do not think or reason, but merely imitate patterns. The researchers showed that these systems collapse when faced with complex problems.
AI flunks logic test: Multiple studies reveal illusion of reasoning
Apple researchers have uncovered a key weakness in today's most hyped AI systems – they falter at solving puzzles that require step-by-step reasoning. In a new paper, the team tested several leading models on the Tower of Hanoi, an age-old logic puzzle, and found that performance collapsed as complexity increased.
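Scoring a model on this puzzle is mechanical: simulate each proposed move and check that it is legal. A plausible checker along those lines (the peg encoding and function name are our illustration, not the paper's code):

```python
def valid_hanoi_sequence(n, moves):
    """Check whether a move list legally solves an n-disk Tower of Hanoi.

    Pegs are 0, 1, 2; disks 1..n start on peg 0 and must end on peg 2.
    Each move is a (from_peg, to_peg) pair.
    """
    pegs = [list(range(n, 0, -1)), [], []]       # largest disk at the bottom of peg 0
    for src, dst in moves:
        if not pegs[src]:
            return False                         # moving from an empty peg
        if pegs[dst] and pegs[dst][-1] < pegs[src][-1]:
            return False                         # larger disk placed on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs[2] == list(range(n, 0, -1))      # solved iff everything ended on peg 2

# The optimal 3-disk solution (7 moves) passes:
print(valid_hanoi_sequence(3, [(0, 2), (0, 1), (2, 1), (0, 2), (1, 0), (1, 2), (0, 2)]))  # True
```

Because validity is checkable move by move, a single illegal move anywhere in a long transcript fails the whole attempt, which makes accuracy on large instances an unforgiving, all-or-nothing measure.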
New Apple study challenges whether AI models truly “reason” through problems
In early June, Apple researchers released a study suggesting that simulated reasoning (SR) models, such as OpenAI's o1 and o3, DeepSeek-R1, and Claude 3.7 Sonnet Thinking, produce outputs consistent with pattern-matching from training data when faced with novel problems requiring systematic thinking. The researchers found similar results to an April study that evaluated models on United States of America Mathematical Olympiad (USAMO) problems, showing that these …
Coverage Details
Bias Distribution
- 44% of the sources are Center