DiffusionGemma: Google’s Open Model That Writes Text in Blocks, Up to 4x Faster

Google’s DiffusionGemma is an experimental open model that generates whole 256-token blocks of text at once instead of one word at a time, running up to 4 times faster and topping 1,000 tokens per second on a single GPU.
artificial-intelligence
Author

Kabui, Charles

Published

2026-06-29

Keywords

diffusion-gemma, text-diffusion, open-weight-llm, on-device-ai, mixture-of-experts