Google has once again redefined the performance-cost frontier for large models. Today the company officially released its new lightweight model, Gemini 3 Flash, which not only responds three times faster than its predecessor, achieving near "zero latency," but also surpasses the current flagship Gemini 3 Pro on multiple high-difficulty benchmarks, making it the first "Flash" model to beat its bigger sibling in a same-generation comparison. More surprisingly, this top-tier model is free worldwide and is integrated by default into the Gemini App, AI Studio, Google Antigravity, and the CLI tools.

Gemini 3 Flash's groundbreaking benchmark results show a lightweight model punching far above its weight:

- On SWE-bench, the benchmark for resolving real-world code issues, it scores 78%, slightly ahead of Gemini 3 Pro (76.2%);

- On GPQA Diamond, the PhD-level science reasoning test, it scores 90.4%;

- On Humanity’s Last Exam, an extremely difficult comprehensive assessment (no-tools mode), it scores 33.7%, well ahead of the previous flagship, Gemini 2.5 Pro;

- It ranks third globally on the LMArena text leaderboard.


This leap in performance comes from deep optimization of the model architecture: while keeping inference costs extremely low, Google applies techniques such as knowledge distillation, inference-path compression, and multimodal alignment, giving the small model a logical depth close to that of much larger ones. When a user uploads an image or video, Flash can analyze the content and produce an actionable plan within seconds, whether that means identifying a circuit fault or planning a travel route.
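For developers, the same multimodal path is reachable through the API exposed in AI Studio. Below is a minimal sketch using the google-genai Python SDK; the model identifier "gemini-3-flash" and the image filename are illustrative assumptions, not confirmed values.

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # AI Studio API key

# Read a local image and ask Flash for an actionable plan.
with open("circuit_board.jpg", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-3-flash",  # assumed model ID; check AI Studio for the published name
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "Identify any faults on this circuit board and outline a step-by-step repair plan.",
    ],
)
print(response.text)
```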

To adapt to different scenarios, the new Gemini App introduces three interaction modes:

- Speed Mode: enables Gemini 3 Flash by default, suited to everyday Q&A;

- Think Mode: activates Flash's deep reasoning chain for complex logical problems;

- Professional Mode: keeps Gemini 3 Pro for high-difficulty math and programming tasks.
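At the API level, one plausible way to mirror these three modes is to switch between the model ID and a reasoning budget. The sketch below is an assumption for illustration: the model names "gemini-3-flash" and "gemini-3-pro" and the specific thinking budget are not confirmed identifiers, and the thinking_config field follows the pattern used by earlier Gemini releases.

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

def ask(prompt: str, mode: str = "speed") -> str:
    """Route a prompt the way the app's three modes might map onto the API."""
    if mode == "speed":      # fast default: Flash with no extra reasoning budget
        model, config = "gemini-3-flash", None
    elif mode == "think":    # deep reasoning chain on Flash
        model = "gemini-3-flash"
        config = types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(thinking_budget=1024)
        )
    else:                    # "professional": hand off to Pro
        model, config = "gemini-3-pro", None

    response = client.models.generate_content(
        model=model, contents=prompt, config=config
    )
    return response.text

print(ask("Prove that the sum of two even integers is even.", mode="think"))
```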

This means ordinary users can now enjoy, without paying, an experience previously reserved for premium subscriptions. The complex questions you ask in Google Search are now backed by an AI engine with top-tier reasoning capability.


Market data confirms the success of this strategy: the Gemini App's monthly active users jumped from 450 million to 650 million in a single quarter, the developer base has passed 13 million, and API call volume has tripled year-over-year. With the addition of Flash, the Gemini 3 product line forms a clear hierarchy, with Deep Think for deep reasoning, Pro for demanding professional work, and Flash for fast, free everyday use, covering the full spectrum of needs from general users to researchers and developers.