Google Unveils DiffusionGemma, an AI Model that Breaks Free of Left-to-Right Processing
4 Articles
Google unveils DiffusionGemma, an AI model that breaks free of left-to-right processing
Extremely powerful large language models (LLMs) still operate as though they’re typing on a keyboard, processing workloads in a simple left-to-right fashion. But in locally run, single-user scenarios, this sequential processing can leave graphics processing units (GPUs) and tensor processing units (TPUs) underutilized. Google is betting that DiffusionGemma can get around this bottleneck. The new experimental open model generates text “exceptiona…
Google's New Diffusion Gemma Changes How AI Processes Language
Google’s Diffusion Gemma introduces a bold shift in AI language modeling by adopting a diffusion-based architecture that processes tokens in parallel, rather than sequentially. As explained by Prompt Engineering, this design enables the model to generate tokens in fixed 256-token patches, significantly enhancing speed while maintaining contextual understanding. With 26 billion parameters, 4 billion active […]
More than 1,000 tokens per second on a single H100, the accelerator Nvidia sells to data centers, and about 700 on an RTX 5090, its high-end gaming card. That is the speed Google DeepMind reports for DiffusionGemma, its new open AI model, about four times what classic Gemma models of comparable size produce. The whole difference lies in how the text is generated. Conventional language models are autoregressive: they write from le…
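To make the contrast in the snippets above concrete, here is a rough sketch that only counts sequential model passes: one per token for left-to-right generation, versus a fixed number of refinement passes per 256-token patch for a diffusion-style model. The 256-token patch size comes from the coverage above; the function names and the `denoise_steps` value are illustrative assumptions, not figures published for DiffusionGemma, and the counts say nothing about real-world throughput.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """Left-to-right decoding: one sequential forward pass per generated token."""
    return num_tokens


def patch_diffusion_passes(num_tokens: int,
                           patch_size: int = 256,
                           denoise_steps: int = 16) -> int:
    """Patch-wise parallel decoding: every token in a patch is updated in the
    same pass, and each patch is refined over `denoise_steps` passes.

    `denoise_steps` is a hypothetical value chosen for illustration only.
    """
    patches = math.ceil(num_tokens / patch_size)
    return patches * denoise_steps


if __name__ == "__main__":
    n = 1024
    print("autoregressive passes: ", autoregressive_passes(n))
    print("patch-diffusion passes:", patch_diffusion_passes(n))
```

Under these assumed numbers the sequential depth drops because each pass does more parallel work, which is the kind of workload that keeps a single-user GPU or TPU busier. Each pass over a full patch costs more than a single-token pass, so the ratio of pass counts is not a speedup estimate.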


