Google’s Gemma 4 open AI models use “speculative decoding” to get up to 3x faster - Ars Technica
Time is tokens

Google’s Gemma 4 AI models get 3x speed boost by predicting future tokens

Up to 3x the speed with no loss of quality. Is it too good to be true?

Google launched its Gemma 4 open models this spring, promising a new level of power and performance for local AI.