DiffusionGemma Explained: How Diffusion LLMs Run Local AI Faster
Google DeepMind's DiffusionGemma is an open-weight, diffusion-based text model that, according to Ars Technica, runs local AI roughly 4x faster. Here's what diffusion LLMs are, how they differ from autoregressive models, what the speed claim actually says, and how to think about running one locally.
06/13/2026 · Research · 7 min read