xAI has launched Grok4Fast, a lightweight flagship model that, according to the company, performs comparably to Grok4 while requiring about 40% less computation. According to AIbase, this efficiency gain can reduce the cost of each task by up to 98%.


Performance and Efficiency Balance

Grok4Fast performs well across multiple benchmarks, scoring 85.7% on GPQA Diamond and 92.0% on AIME 2025, results comparable to top models such as Grok4 and even GPT-5. xAI emphasizes that the model achieves this by reducing "thinking tokens," reaching similar results with 40% fewer tokens than Grok4 on average. The efficiency advantage is most pronounced on complex reasoning tasks.

Integrated Architecture and External Tools

Unlike earlier versions, which relied on separate models for different tasks, Grok4Fast combines its reasoning and quick-response modes in a single architecture and switches between them through system prompts, reflecting the latest trend toward hybrid models.

The model is also strong at using external tools, including web browsing and code execution. On benchmarks such as BrowseComp and X Bench Deepsearch, Grok4Fast outperforms Grok4. On the LMArena-Search benchmark, it even surpasses the previously leading OpenAI o3-websearch model. In the Text Arena ranking, Grok4Fast places eighth, ahead of other models of similar scale.

Availability and Pricing

Grok4Fast comes in two versions: one optimized for reasoning-intensive tasks and the other focused on quick answers. Both versions support a context window of 2 million tokens. The model is available via grok.com, the iOS and Android apps, and the xAI API. Pricing ranges from $0.05 to $1.00 per million tokens, depending on the token type. For now, users can also access Grok4Fast for free through OpenRouter and Vercel.
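
To make per-million-token pricing concrete, here is a minimal Python sketch of how the cost of one request could be estimated. The two rates in it are assumptions picked from inside the quoted $0.05 to $1.00 range, not xAI's published per-type prices, and the sketch further assumes that thinking tokens are billed as output tokens. The function name and the token counts are hypothetical.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the cost of one request in USD from per-million-token prices."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Assumed, illustrative rates (USD per million tokens); not official prices.
ASSUMED_INPUT_PRICE = 0.20
ASSUMED_OUTPUT_PRICE = 0.50

# Hypothetical request: 10,000 input tokens and 5,000 output tokens.
baseline = estimate_cost_usd(10_000, 5_000, ASSUMED_INPUT_PRICE, ASSUMED_OUTPUT_PRICE)

# The same request with 40% fewer output tokens, mirroring the average
# reduction in thinking tokens that xAI reports (integer math avoids rounding).
reduced_output_tokens = 5_000 * 60 // 100
reduced = estimate_cost_usd(10_000, reduced_output_tokens,
                            ASSUMED_INPUT_PRICE, ASSUMED_OUTPUT_PRICE)

print(f"baseline: ${baseline:.4f}")
print(f"with 40% fewer output tokens: ${reduced:.4f}")
```

Under these assumed rates, cutting output tokens by 40% lowers only the output share of the bill, so the overall saving depends on how a task splits between input and output tokens.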