DeepSeek Drops Open-Source Model that Compresses Text 10x Through Images, Defying Conventions

DeepSeek-OCR compresses text by up to 10 times while retaining 97% of information to help large language models process longer documents with lower computing costs.

  • On Monday, DeepSeek released the open-source DeepSeek-OCR model on Hugging Face and GitHub, saying it compresses image-based text for LLMs using visual perception.
  • DeepSeek built the model to address LLM long-context limits; the researchers said that rendering text as images and processing it with vision encoders can be more efficient for long documents.
  • DeepSeek described the model's two-part architecture: a 380 million-parameter DeepEncoder and a DeepSeek3B-MoE-A570M decoder, trained on 30 million PDF pages in roughly 100 languages.
  • Practically, the system supports high-throughput data generation for LLMs, producing training data at a scale of over 200,000 pages per day on a single NVIDIA A100 GPU, the company said.
  • The paper reports that vision-text compression cuts token counts by 7 to 20 times, with 97% of information retained at a compression factor of 10. The release follows DeepSeek's V3 and R1 open-weight models.
Insights by Ground AI
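
To make the reported figures concrete, here is a minimal back-of-the-envelope sketch in Python. It only does arithmetic on the numbers quoted above (a compression factor of 10 and 200,000 pages per day); the function names and the 100,000-token example document are hypothetical and are not part of DeepSeek-OCR or its API.

```python
import math


def vision_tokens_needed(text_tokens: int, compression_ratio: float) -> int:
    """Estimate how many vision tokens replace `text_tokens` text tokens.

    Assumes the compression ratio is simply text tokens / vision tokens,
    which is an illustrative reading of the reported "compression factor".
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return math.ceil(text_tokens / compression_ratio)


def pages_per_second(pages_per_day: int) -> float:
    """Convert a daily page throughput into pages per second."""
    seconds_per_day = 24 * 60 * 60
    return pages_per_day / seconds_per_day


# Hypothetical 100,000-token document at the reported 10x factor.
print(vision_tokens_needed(100_000, 10))

# Reported single-GPU throughput of 200,000 pages per day.
print(round(pages_per_second(200_000), 2))
```

Under these assumptions, a 100,000-token document would need about 10,000 vision tokens, and 200,000 pages per day works out to a little over two pages per second.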

16 Articles

Chinese AI researchers want to keep chatbots fast and cheap over long contexts by representing text as images. This optical context compression is intended to improve AI assistants.

·Germany

The Chinese start-up DeepSeek has just released an open-source multimodal AI model that can process complex documents at drastically lower computational cost. By using visual perception as a compression tool, DeepSeek-OCR opens the way to analyzing volumes of data that were previously out of reach.

Bias Distribution

  • 75% of the sources are Center

the-decoder.de broke the news in Germany on Monday, October 20, 2025.