Million-Level Intelligent Agent Training! MiniMax Collaborates with Tencent Cloud: RL Sandbox Achieves Full and Stable Operation

AIbase

Published in AI News · 4 min read · Mar 18, 2026

As AI agents move from laboratories to large-scale applications, the supporting capabilities of the underlying infrastructure are facing unprecedented challenges.

Recently, MiniMax and Tencent Cloud announced a deep collaboration and completed an important milestone in agent infrastructure. Relying on Tencent Cloud's compute scheduling and cloud-native capabilities, MiniMax has begun deploying an Agent RL (agent reinforcement learning) sandbox built for million-level throughput and tens of thousands of concurrent connections, and the sandbox has achieved full, stable operation in the test environment.

Reinforcement learning (RL) is key to improving the decision-making capabilities of AI agents. However, large-scale agent training often brings high compute costs and heavy pressure to build training environments. The core highlight of this collaboration is that Tencent Cloud has helped Forge, MiniMax's reinforcement learning framework, achieve a qualitative leap:

Extreme efficiency: The training environment supports "second-level activation" (startup within seconds), significantly shortening experiment preparation time.

Resource optimization: Resources are managed dynamically on a "use and then delete" basis, so that computing resources are not wasted.

Cost reduction and efficiency enhancement: The overall cost of large-scale training drops significantly, while the training process becomes more stable and faster.
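The "use and then delete" pattern with bounded concurrency can be illustrated with a minimal Python sketch. This is not Forge or any Tencent Cloud API: the names `ephemeral_sandbox`, `run_episode`, `run_batch`, and `live_sandboxes` are hypothetical, and the sandbox is simulated by an in-memory set rather than a real container or virtual machine.

```python
import asyncio
import contextlib
import itertools

# Hypothetical stand-ins: a real system would create containers or VMs here.
_ids = itertools.count(1)
live_sandboxes = set()  # sandboxes that currently exist


@contextlib.asynccontextmanager
async def ephemeral_sandbox():
    """Create a simulated sandbox, yield its id, and always delete it afterwards."""
    sandbox_id = next(_ids)
    live_sandboxes.add(sandbox_id)  # stands in for a fast environment start
    try:
        yield sandbox_id
    finally:
        live_sandboxes.discard(sandbox_id)  # "use and then delete"


async def run_episode(task, limit):
    """Run one rollout in its own sandbox, bounded by the concurrency limit."""
    async with limit:
        async with ephemeral_sandbox() as sandbox_id:
            await asyncio.sleep(0)  # placeholder for the agent acting in the environment
            return (task, sandbox_id)


async def run_batch(tasks, max_concurrent=4):
    """Run all tasks, keeping at most max_concurrent sandboxes alive at once."""
    limit = asyncio.Semaphore(max_concurrent)
    return await asyncio.gather(*(run_episode(t, limit) for t in tasks))


results = asyncio.run(run_batch(range(10)))
```

Releasing the sandbox in a `finally` clause means it is deleted even when an episode fails, which is the property that keeps idle resources from accumulating; the semaphore caps how many sandboxes exist at any moment.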

As an AI newcomer whose valuation exceeds that of traditional internet giants, MiniMax has recently been active in both the capital markets and the technology field. Its market value has continued to rise, and its overseas market share has exceeded 70%. The collaboration with Tencent Cloud is a technical win-win, and it also gives the industry a reference "standard model" for deploying agent sandboxes at scale.

As an early form of the "operating system" of the AI era begins to emerge, a more efficient underlying sandbox will become an accelerator for agent evolution. As MiniMax continues to deepen its reinforcement learning research, a million-level agent ecosystem capable of self-learning and rapid iteration is moving closer to reality.


This article is from AIbase Daily


