At 4:03 AM Beijing Time on April 3, Google officially released Gemma4, its open-source large model. With a breakthrough in intelligence per parameter, it sets a new bar for agent workflows built on open-source models.
The series comprises two efficiency-oriented versions, E2B (2.3B) and E4B (4.5B), and two high-performance versions, a 26B MoE model and a 31B dense model. As the latest release built on the Gemini3 technology stack, Gemma4 fully supports multimodal input (images and video). E2B and E4B additionally accept native voice input, enabling real-time speech understanding on edge devices.

Architecturally, the large-parameter models are optimized for high hardware efficiency. On the Arena AI text leaderboard, the 31B dense version ranks third globally among open-source models and the 26B MoE version ranks sixth. Both offer logical reasoning and function-calling capabilities strong enough to drive complex autonomous agents.
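To illustrate how function calling lets a model drive an agent, here is a minimal sketch of the pattern. The tool name, its schema, and the JSON reply format are hypothetical, and the model call is mocked with a canned response rather than a real Gemma4 inference call:

```python
import json

# Hypothetical tool registry; the tool name and signature are illustrative only.
TOOLS = {
    "get_weather": lambda city: f"Sunny, 22C in {city}",
}

def mock_model(prompt: str) -> str:
    # Stand-in for a real model call: a function-calling model would emit
    # a structured tool invocation like this instead of free-form text.
    return json.dumps({"tool": "get_weather", "arguments": {"city": "Beijing"}})

def run_agent(prompt: str) -> str:
    """One turn of an agent loop: model proposes a tool call, host executes it."""
    reply = json.loads(mock_model(prompt))
    tool = TOOLS[reply["tool"]]
    return tool(**reply["arguments"])

print(run_agent("What's the weather in Beijing?"))
# prints: Sunny, 22C in Beijing
```

In a real deployment, the tool result would be fed back to the model for another turn; the loop above shows only the dispatch step.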
For local deployment, Gemma4 significantly lowers the barrier to cutting-edge AI capabilities. The non-quantized weights of the 31B model run on a single 80GB H100 GPU, while the quantized version fits consumer-grade GPUs. For mobile and IoT devices, the E2B and E4B models deliver low-latency reasoning on Raspberry Pi boards and smartphones through PLE embedding technology and 128K-token context support.
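A rough back-of-the-envelope check makes the deployment claims concrete. Assuming 16-bit weights for the non-quantized model and 4-bit weights for the quantized one (an assumption; the article does not state the precision), and counting weights only (activations and KV cache add more):

```python
PARAMS = 31e9  # the 31B dense model

def weight_gb(params: float, bits_per_param: int) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9

bf16 = weight_gb(PARAMS, 16)  # ~62 GB: fits a single 80GB H100
int4 = weight_gb(PARAMS, 4)   # ~15.5 GB: within consumer-GPU range
print(f"bf16: {bf16:.1f} GB, int4: {int4:.1f} GB")
# prints: bf16: 62.0 GB, int4: 15.5 GB
```

The 62 GB figure is consistent with the claim that the non-quantized 31B weights fit on one 80GB H100, leaving headroom for the KV cache.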
This release not only demonstrates Google's deep investment in the open-source ecosystem but also gives developers worldwide a foundation, under the Apache 2.0 license, for building localized, privacy-preserving AI applications.




