China's DeepSeek releases 'intermediate' AI model en route to next generation

DeepSeek’s new V3.2-Exp model cuts API costs by over 50% using Sparse Attention, boosting efficiency and speed with minimal quality loss, the company said.

  • On Monday, DeepSeek released the experimental DeepSeek-V3.2-Exp and open-sourced it on Hugging Face and ModelScope, with the model live on its app, web platform, and API through October 15.
  • Facing the long-context challenge, DeepSeek targeted efficiency at scale with V3.2-Exp, building directly on DeepSeek-V3.1-Terminus after a series of rapid releases this year; the original V3 launched last December.
  • DeepSeek's Sparse Attention uses a 'lightning indexer' to rank tokens by relevance, enabling up to a 64x speedup on sequences as long as 128,000 tokens; it is distinct from the company's earlier Native Sparse Attention (see the sketch after this list).
  • For developers, this translates to 2-3x faster inference, 30-40% lower memory use, and API prices cut by more than 50%, to $0.07 per million tokens.
  • DeepSeek's bet on efficiency positions it to compete with global AI leaders, using NVIDIA H100 GPUs (one for testing, eight for production) and a model with approximately 671 billion parameters.
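
The bullets above describe the mechanism only at a high level. As a rough illustration of indexer-guided sparse attention, here is a minimal single-query sketch: a cheap scoring pass ranks every past token, then full attention runs only over the top-k highest-scoring ones. The function names (lightning_index, sparse_attention) and the top_k value are hypothetical, and this is not DeepSeek's implementation.

```python
import numpy as np

def lightning_index(query: np.ndarray, keys: np.ndarray) -> np.ndarray:
    """Cheap per-token relevance scores (a stand-in for the 'lightning indexer')."""
    return keys @ query  # (seq_len,) dot-product scores

def sparse_attention(query: np.ndarray, keys: np.ndarray,
                     values: np.ndarray, top_k: int = 2048) -> np.ndarray:
    """Attend only over the top_k tokens the indexer ranks highest."""
    scores = lightning_index(query, keys)
    k = min(top_k, keys.shape[0])
    idx = np.argpartition(scores, -k)[-k:]   # O(n) top-k selection
    sel_keys, sel_values = keys[idx], values[idx]
    logits = sel_keys @ query / np.sqrt(query.shape[0])
    weights = np.exp(logits - logits.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ sel_values              # weighted sum over k values only

# Toy usage on a 128,000-token context: the attention step's cost scales
# with top_k, not with the full sequence length.
rng = np.random.default_rng(0)
seq_len, dim = 128_000, 64
q = rng.standard_normal(dim)
K = rng.standard_normal((seq_len, dim))
V = rng.standard_normal((seq_len, dim))
print(sparse_attention(q, K, V, top_k=2048).shape)  # (64,)
```

If the indexer pass is cheap enough, its linear sweep over all tokens is what bounds long-context cost, while the quadratic-cost attention step touches only top_k tokens; that is the general shape of the efficiency claim in the bullets.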
Insights by Ground AI

29 Articles

Bias Distribution

  • 82% of the sources are Center

YUAN TALKS broke the news on Monday, September 29, 2025.