Google Releases New Pricing Strategy for Gemini API, Billing for Inference Services on Demand

AIbase基地

Published in AI News · 4 min read · Apr 3, 2026

Google has updated the billing structure of its Gemini API to better match users' inference needs. The update introduces several service tiers, including Standard, Flexible, Priority, Batch, and Cache, so users can pick the one that fits their latency and cost requirements.

The Standard tier provides basic inference service at regular pricing and is the reference point for the other tiers. The Flexible tier runs on idle computing resources during off-peak hours and is priced 50% below Standard. Its target latency is 1 to 15 minutes, but no fixed latency is guaranteed, so it suits applications without strict timing requirements.

The Batch tier is also priced 50% below Standard, with a maximum latency of up to 24 hours. It suits users who process large volumes of data, such as large-scale information queries, and can significantly reduce their costs.
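To make the discount concrete, here is a minimal Python sketch of the arithmetic. The function name `estimate_cost`, the per-million-token price, and the token count are illustrative assumptions, not figures from Google's price list. Only the 50% discount for Flexible and Batch comes from the announcement as reported here.

```python
# Multipliers relative to Standard pricing. The 0.5 values reflect the
# 50% discount reported for the Flexible and Batch tiers.
TIER_MULTIPLIER = {
    "standard": 1.0,
    "flexible": 0.5,
    "batch": 0.5,
}


def estimate_cost(tokens: int, standard_price_per_million: float, tier: str) -> float:
    """Estimate the cost of processing `tokens` on the given tier.

    `standard_price_per_million` is a hypothetical Standard-tier price per
    one million tokens; substitute the real price for your model.
    """
    if tier not in TIER_MULTIPLIER:
        raise ValueError(f"unknown tier: {tier!r}")
    return tokens / 1_000_000 * standard_price_per_million * TIER_MULTIPLIER[tier]


if __name__ == "__main__":
    # Hypothetical workload: 40 million tokens at an assumed 2.00 per million.
    for tier in ("standard", "flexible", "batch"):
        print(f"{tier:>8}: {estimate_cost(40_000_000, 2.0, tier):.2f}")
```

Because both discounted tiers cost the same in this model, the choice between them comes down to how long the job can wait.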

The Cache tier is billed by the number of cached tokens and the storage duration. It is especially suited to chatbots that repeatedly reuse complex instructions, long video analysis, and queries over large document sets. It helps users manage storage and computing resources more effectively.
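The article says only that Cache billing depends on the cached token count and the storage duration. One simple way to reason about it is as a product of the two, sketched below. The linear formula, the function name, and the rate parameter are all assumptions for illustration and should be checked against Google's published pricing.

```python
def cache_storage_cost(
    cached_tokens: int,
    hours_stored: float,
    rate_per_million_tokens_per_hour: float,
) -> float:
    """Estimate cache storage cost as tokens x duration x rate.

    The rate is a hypothetical price per one million cached tokens per hour.
    The real billing formula may differ from this linear model.
    """
    if cached_tokens < 0 or hours_stored < 0:
        raise ValueError("cached_tokens and hours_stored must be non-negative")
    return cached_tokens / 1_000_000 * hours_stored * rate_per_million_tokens_per_hour


if __name__ == "__main__":
    # Hypothetical case: 500,000 cached tokens kept for 2 hours at an
    # assumed rate of 1.00 per million tokens per hour.
    print(cache_storage_cost(500_000, 2.0, 1.0))
```

Under this model, halving either the cached token count or the storage time halves the storage bill, which is why it pays to expire caches that are no longer being reused.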

The Priority tier is priced 75% to 100% above Standard but keeps latency in the millisecond-to-second range. It is intended for applications that require real-time responses, such as customer service chatbots, real-time fraud detection, and business-critical intelligent assistants. Google recommends that users with such needs choose the Priority tier to get the best response speed and efficiency.
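The latency and price figures above can be folded into a rough decision helper. The sketch below is an assumption-laden illustration, not Google's guidance: the function names, the `latency_critical` flag, and the exact cutoffs (15 minutes for Flexible, 24 hours for Batch) are choices made here, based on the latency figures reported in this article.

```python
FIFTEEN_MINUTES = 15 * 60
TWENTY_FOUR_HOURS = 24 * 60 * 60


def priority_cost_range(standard_cost: float) -> tuple[float, float]:
    """Return the (low, high) Priority cost for a given Standard cost.

    Priority is reported as 75% to 100% above Standard, i.e. 1.75x to 2.0x.
    """
    return (standard_cost * 1.75, standard_cost * 2.0)


def suggest_tier(max_wait_seconds: float, latency_critical: bool = False) -> str:
    """Suggest a tier from how long the caller can wait for a result.

    Flexible only targets 1 to 15 minutes without a guarantee, so it is
    suggested only when the caller can tolerate at least the upper bound.
    """
    if latency_critical:
        return "priority"   # real-time workloads, highest price
    if max_wait_seconds >= TWENTY_FOUR_HOURS:
        return "batch"      # 50% off, results within up to 24 hours
    if max_wait_seconds >= FIFTEEN_MINUTES:
        return "flexible"   # 50% off, target latency 1 to 15 minutes
    return "standard"


if __name__ == "__main__":
    print(priority_cost_range(100.0))
    print(suggest_tier(3600))
    print(suggest_tier(1, latency_critical=True))
```

Returning Batch rather than Flexible for jobs that can wait a full day is arbitrary in this sketch, since both tiers carry the same reported discount.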

Key Points:

🌟 New Gemini API service tiers have been added to meet different user needs.

⏳ Flexible and Batch tiers offer a 50% discount, suitable for large-scale data processing.

⚡ Priority tier keeps latency at the millisecond-to-second level, suitable for real-time application scenarios.


This article is from AIbase Daily
