How AI Mathematicians Might Finally Deliver Human-Level Reasoning
- Apple's machine learning team published a study on 2025-06-09 testing models like Claude and DeepSeek-R1 in puzzle environments such as Tower of Hanoi.
- The study arose from concerns that reasoning models break down as puzzle complexity increases, with accuracy dropping to zero despite ample compute and being supplied with correct algorithms (a standard recursive Hanoi solver is sketched after this list).
- Researchers found reasoning models excel at intermediate difficulty but reduce their 'thinking' effort on the hardest problems and produce incorrect answers as complexity grows.
- The paper titled 'The Illusion of Thinking' states these AI systems rely on pattern matching rather than true logical reasoning, warning that current reasoning methods face fundamental scaling limits.
- The findings imply serious structural flaws in reasoning models and suggest reevaluating AI designs for robust reasoning, especially as these models are increasingly embedded in critical applications.
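For context, Tower of Hanoi, one of the puzzle environments mentioned above, has a short, well-known recursive solution whose optimal move count is 2^n − 1, so difficulty scales exponentially with the number of disks. A minimal sketch of that standard solver (illustrative only, not Apple's evaluation harness) looks like this:

```python
def hanoi(n, source, target, spare, moves):
    """Standard recursive Tower of Hanoi solver: move n disks
    from `source` to `target`, using `spare` as scratch space."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)   # clear the top n-1 disks out of the way
    moves.append((source, target))               # move the largest remaining disk
    hanoi(n - 1, spare, target, source, moves)   # re-stack the n-1 disks on top of it

moves = []
hanoi(10, "A", "C", "B", moves)
print(len(moves))  # 1023: optimal length is 2**n - 1, doubling with each added disk
```

The point of using such a puzzle is that the algorithm itself stays trivially short while the required solution length doubles with every added disk, which is how complexity can be increased without changing the task description.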
14 Articles
Ken Ono and other experts stressed that AI's mathematical reasoning is now capable of solving level-five problems, challenges that even expert humans fail to solve.
Mexico City.- In September 2024, OpenAI presented o1, a large reasoning model (LRM), which, unlike ChatGPT, a large language model (LLM), is capable of "reasoning," in the company's words. Competitors were not far behind: DeepSeek-R1, Claude 3.7 Sonnet Thinking, and Google Gemini Thinking were responses to this new class of LRM. Sam Altman, chief executive of OpenAI, commented at differ…
ALPHAONE: A Universal Test-Time Framework for Modulating Reasoning in AI Models
Large reasoning models, often powered by large language models, are increasingly used to solve high-level problems in mathematics, scientific analysis, and code generation. The central idea is to simulate two types of cognition: rapid responses for simpler reasoning and deliberate, slower thought for more complex problems. This dual-mode thinking reflects how humans transition from intuitive reactions to analytical thinking depending on task com…
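As a rough illustration of the dual-mode idea described above (not ALPHAONE's actual mechanism; the complexity heuristic and token budgets below are invented for the sketch), a test-time controller might route prompts between a fast, low-budget pass and a slow, high-budget pass:

```python
from dataclasses import dataclass

@dataclass
class ReasoningBudget:
    max_thinking_tokens: int  # how much "slow" deliberation the model may spend
    temperature: float        # sampling temperature for that pass

FAST = ReasoningBudget(max_thinking_tokens=0, temperature=0.2)     # intuitive, single-pass answer
SLOW = ReasoningBudget(max_thinking_tokens=4096, temperature=0.7)  # extended step-by-step reasoning

def estimate_complexity(prompt: str) -> float:
    """Placeholder heuristic: a real system would use a learned scorer,
    problem size, or model uncertainty rather than prompt length."""
    return min(len(prompt.split()) / 200.0, 1.0)

def choose_budget(prompt: str, threshold: float = 0.5) -> ReasoningBudget:
    """Route simple prompts to the fast mode and complex ones to the slow mode at test time."""
    return SLOW if estimate_complexity(prompt) >= threshold else FAST
```

The design question such frameworks address is when to switch modes: spending the slow budget on every query wastes compute, while never switching reproduces the shallow answers the fast mode gives on hard problems.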
Apple Study Questions AI Reasoning Models in Stark New Report
Apple has thrown a spanner into one of artificial intelligence’s most hyped developments. A new internal study suggests that so-called "reasoning" models—AI systems designed to emulate step-by-step human thought—are far less capable than the industry might have hoped. Published under the title The Illusion of Thinking, the paper from Apple’s machine learning team provides a comprehensive critique of reasoning-enhanced large language models (LLMs…
AI Models Lack Reasoning Capability Needed For AGI
The race to develop artificial general intelligence (AGI) still has a long way to run, according to Apple researchers who found that leading AI models still have trouble reasoning. Recent updates to leading AI large language models (LLMs) such as OpenAI’s ChatGPT and Anthropic’s Claude have included large reasoning models (LRMs), but their fundamental capabilities, scaling properties, and limitations “remain insufficiently understood,” said the…
Coverage Details
Bias Distribution
- 100% of the sources are Center