NVIDIA and researchers from the University of Hong Kong have jointly released a new model called "Orchestrator," which has 8 billion parameters and coordinates different tools and large language models (LLMs) to solve complex problems. In the team's experiments, Orchestrator achieved higher accuracy at lower cost on tool-use benchmarks, and it can select tools according to user preferences.

Orchestrator was trained with a new reinforcement learning framework called ToolOrchestra, which is designed to turn small models into intelligent coordinators. The core idea is that a lightweight "coordinator" managing a set of specialized models and tools can solve problems more efficiently than a single large AI system.
Currently, most tool-using LLM systems pair a single powerful model with basic tools such as web search or a calculator. The researchers argue that humans routinely draw on resources beyond their own intelligence when reasoning, so LLMs should likewise be able to interact with a broad range of tools. To that end, they propose moving from a single-model system to a composite system made up of multiple models. The coordinator analyzes a complex task, breaks it into subtasks, and calls the appropriate tool for each one as needed.
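To make the coordinator pattern concrete, here is a minimal Python sketch. It is only an illustration, not the ToolOrchestra implementation: the `plan` function is a rule-based stand-in for the learned policy, and the tool names (`calculator`, `lookup`), the `Subtask` type, and the "tool: input" task format are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subtask:
    tool: str     # name of the tool the coordinator chose
    payload: str  # input handed to that tool


def calculator(expr: str) -> str:
    # Toy arithmetic tool: only handles "a+b" with integers.
    left, right = expr.split("+")
    return str(int(left) + int(right))


def lookup(query: str) -> str:
    # Toy stand-in for web search, backed by a fixed dictionary.
    facts = {"days in a week": "7"}
    return facts.get(query, "unknown")


# Hypothetical tool registry the coordinator can call into.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "lookup": lookup,
}


def plan(task: str) -> List[Subtask]:
    # Placeholder for the learned policy: the task is assumed to be
    # written as "tool: input" steps separated by semicolons.
    steps = []
    for part in task.split(";"):
        tool, payload = part.split(":", 1)
        steps.append(Subtask(tool.strip(), payload.strip()))
    return steps


def run(task: str) -> List[str]:
    # Break the task into subtasks, then dispatch each to its tool.
    return [TOOLS[step.tool](step.payload) for step in plan(task)]


results = run("lookup: days in a week; calculator: 7+5")
print(results)  # ['7', '12']
```

In a real system the planning step would be produced by the trained model rather than by string parsing, and the tools could themselves be other LLMs.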
Using the ToolOrchestra framework, the research team trained the Orchestrator model and evaluated it on three challenging benchmarks. Compared with several large general-purpose models, Orchestrator showed a clear advantage on Humanity's Last Exam (HLE), a benchmark of doctoral-level questions, at a much lower computational cost than the other methods. In particular, Orchestrator schedules its tool calls effectively, staying efficient while relying less on high-cost models.
The researchers stated that Orchestrator, trained through reinforcement learning, demonstrates strong general reasoning capabilities and adapts flexibly to new challenges. For enterprise applications, Orchestrator adapts well to models and pricing structures it did not see during training, offering a more economical and flexible option for companies that rely on multiple AI models.
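The trade-off between cost and capability can be sketched with a simple hand-written heuristic. This is not how Orchestrator decides (its behavior is learned through reinforcement learning); the `ModelOption` type, the `skill` scores, the prices, and the `pick_model` rule below are all hypothetical and only illustrate why a changed price list can change which model gets called.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelOption:
    name: str
    price_per_call: float  # hypothetical price in dollars
    skill: float           # hypothetical capability score in [0, 1]


def pick_model(options: List[ModelOption], difficulty: float) -> ModelOption:
    # Keep only the models assumed capable of handling the subtask.
    capable = [m for m in options if m.skill >= difficulty]
    if not capable:
        # Nothing is clearly good enough: fall back to the strongest model.
        return max(options, key=lambda m: m.skill)
    # Among capable models, choose the cheapest one.
    return min(capable, key=lambda m: m.price_per_call)


catalog = [
    ModelOption("small", 0.001, 0.40),
    ModelOption("medium", 0.010, 0.70),
    ModelOption("large", 0.100, 0.95),
]

easy_choice = pick_model(catalog, difficulty=0.3).name  # "small"
hard_choice = pick_model(catalog, difficulty=0.9).name  # "large"

# A new price list in which the large model is cheaper than the medium one.
repriced = [
    ModelOption("small", 0.001, 0.40),
    ModelOption("medium", 0.010, 0.70),
    ModelOption("large", 0.005, 0.95),
]
repriced_choice = pick_model(repriced, difficulty=0.6).name  # "large"

print(easy_choice, hard_choice, repriced_choice)
```

Under the original prices a subtask of difficulty 0.6 would go to the medium model; after repricing, the same rule routes it to the large one.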
Project: https://research.nvidia.com/labs/lpr/ToolOrchestra/
Key Points:
🌟 Orchestrator is an 8-billion-parameter model that intelligently coordinates multiple tools and models, enhancing AI reasoning capabilities.
💡 The ToolOrchestra framework uses reinforcement learning to train small models to manage complex tasks more efficiently.
🚀 Orchestrator performs strongly on multiple benchmarks while significantly reducing computational cost, and it adapts to varied enterprise needs.


