China's DeepSeek releases 'intermediate' AI model en route to next generation
DeepSeek’s new V3.2-Exp model cuts API costs by over 50% using Sparse Attention, boosting efficiency and speed with minimal quality loss, the company said.
- On Monday, DeepSeek released the experimental DeepSeek-V3.2-Exp and open-sourced it on Hugging Face and ModelScope; the model is live on its app, web platform, and API, with the prior V3.1-Terminus kept available via the API for comparison through October 15.
- To tackle the cost of long-context processing, DeepSeek targeted efficiency at scale with V3.2-Exp, building directly on DeepSeek-V3.1-Terminus after a rapid release cadence that began with the original V3 in December 2024.
- DeepSeek's Sparse Attention uses a 'lightning indexer' to rank tokens so each query attends only to the most relevant ones, enabling up to a 64x speedup on sequences as long as 128,000 tokens; it is distinct from the earlier Native Sparse Attention (see the sketch after this list).
- For developers, this translates to 2-3x faster inference, 30-40% reduced memory use, and API prices cut by more than 50% to $0.07 per million tokens.
- DeepSeek's bet on efficiency positions it to compete with global AI leaders; the roughly 671-billion-parameter model runs on NVIDIA H100 GPUs (one for testing, eight for production).
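The bullets above describe the mechanism only at a high level. Below is a minimal sketch of the indexer-then-top-k pattern the coverage describes: a cheap low-dimensional scorer ranks earlier tokens for each query, and full attention is computed only over the highest-scoring ones. All names, shapes, and the `top_k` budget here are illustrative assumptions, not DeepSeek's released implementation.

```python
# Minimal sketch of indexer-style sparse attention. Assumed setup: a cheap
# low-dim scoring head ranks earlier tokens per query, then full attention
# runs only over the top-k of them. Not DeepSeek's actual code.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sparse_attention(q, k, v, idx_q, idx_k, top_k=64):
    """q, k, v: (seq, d) arrays; idx_q, idx_k: (seq, d_idx) projections
    used only to *rank* keys cheaply, so the expensive attention step
    touches top_k keys per query instead of all of them."""
    seq, d = q.shape
    out = np.zeros_like(v)
    # Cheap relevance scores in a small dimension: (seq, seq).
    scores = idx_q @ idx_k.T
    # Causal mask: a token may only attend to itself and earlier tokens.
    mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores[mask] = -np.inf
    for i in range(seq):
        kk = min(top_k, i + 1)                       # keys available so far
        sel = np.argpartition(scores[i], -kk)[-kk:]  # top-k key indices
        att = softmax(q[i] @ k[sel].T / np.sqrt(d))
        out[i] = att @ v[sel]
    return out

# Toy usage: 512-token sequence, 64-dim heads, 16-dim indexer.
rng = np.random.default_rng(0)
seq, d, d_idx = 512, 64, 16
q, k, v = (rng.standard_normal((seq, d)) for _ in range(3))
idx_q, idx_k = (rng.standard_normal((seq, d_idx)) for _ in range(2))
print(sparse_attention(q, k, v, idx_q, idx_k).shape)  # (512, 64)
```

The payoff is that the softmax attention touches only `top_k` keys per query rather than the whole context, which is where the reported long-sequence speedups would come from; the indexer itself stays cheap because it works in a much smaller dimension.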
29 Articles
DeepSeek tests “sparse attention” to slash AI processing costs
Ever wonder why ChatGPT slows down during long conversations? The culprit is a fundamental mathematical challenge: processing long sequences of text requires massive computational resources, even with the efficiency tricks that companies have already deployed. While US tech giants can afford to throw more hardware at the problem, Chinese AI company DeepSeek, which is cut off from a steady supply of some advanced AI chips by export restrictions, …
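For intuition on the scaling problem this excerpt describes, here is a back-of-the-envelope sketch (constants, causal masking, and memory traffic ignored): dense attention scores every token pair, so compute grows roughly with the square of context length.

```python
# Why long contexts get expensive: standard attention compares every token
# with every other token, so work grows ~quadratically in sequence length.
for n in (4_096, 32_768, 131_072):
    pairs = n * n  # token-pair comparisons, ignoring constants and causality
    print(f"{n:>7} tokens -> {pairs:.2e} pairwise scores")
# Going from 4K to 128K context multiplies the pair count by
# (131072 / 4096) ** 2 = 1024x; sparse attention attacks exactly this term
# by scoring only a small subset of keys per query.
```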
DeepSeek releases AI model en route to next generation
Chinese AI developer DeepSeek has released its "experimental" latest model, which it said was more efficient to train and better at processing long sequences of text than previous iterations of its large language models. The Hangzhou-based company called DeepSeek-V3.2-Exp an "intermediate step toward our next-generation architecture" in a post on developer forum Hugging Face. That architecture will likely be DeepSeek's most important product rel…
DeepSeek Releases V3.2-Exp Experimental Model, Cuts API Prices by Over 50% · TechNode
DeepSeek has launched and open-sourced DeepSeek-V3.2-Exp, an experimental large language model positioned as a step toward its next-generation architecture. The model introduces DeepSeek Sparse Attention, a fine-grained sparse attention mechanism designed to improve efficiency in long-text training and inference while maintaining output quality. Benchmarked against the previous V3.1-Terminus model under aligned training settings, V3.2-Exp delive…
Coverage Details
Bias Distribution
- 82% of the sources are Center