SiliconCloud has announced a significant upgrade to its DeepSeek-R1 and other reasoning-model APIs, aimed at developers who need long contexts and flexible parameter configuration. As part of the upgrade, the maximum context length of several reasoning models has been raised to 128K, giving the models room to reason more thoroughly and produce more complete output.


With this upgrade, well-known models such as Qwen3, QwQ, and GLM-Z1 support a maximum context length of 128K, while DeepSeek-R1 supports 96K. The added headroom provides strong support for complex reasoning tasks such as code generation, as well as for intelligent applications.

More importantly, SiliconCloud has introduced independent control over the lengths of the "reasoning chain" and the "response content", allowing developers to make more efficient use of a model's reasoning capabilities. The maximum response length (max_tokens) now limits only the content ultimately returned to the user, while the reasoning budget (thinking_budget) specifically controls the number of tokens consumed during the reasoning phase. This design lets developers tune the depth of the model's reasoning and the length of its output to the complexity of the task at hand.
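
With an OpenAI-compatible client, the two limits can be set independently in a single request. The sketch below is illustrative only: the base URL, the model ID, and passing thinking_budget as an extra top-level request field are assumptions that should be checked against the official documentation linked below.

```python
# Minimal sketch: set the response cap and the reasoning budget separately.
# Base URL, model ID, and the thinking_budget placement are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.siliconflow.cn/v1",  # assumed SiliconCloud endpoint
    api_key="YOUR_API_KEY",                    # placeholder
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-14B",                    # illustrative model ID
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=512,                            # caps only the final answer
    extra_body={"thinking_budget": 4096},      # caps the reasoning-phase tokens
)
print(response.choices[0].message.content)
```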

For example, on the SiliconCloud platform, users can cap the reasoning-chain length and the response length of the Qwen3-14B series by setting thinking_budget and max_tokens. If the number of tokens generated during the reasoning phase reaches the thinking_budget, Qwen3-series reasoning models stop the reasoning chain immediately; other reasoning models may continue to produce reasoning content.
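
Continuing the request above, the two parts of the reply can then be inspected separately. This sketch assumes the reasoning text comes back in a DeepSeek-style reasoning_content field alongside the regular content; the exact field name is given in the documentation linked below.

```python
# Continuing from the request above: read the reasoning chain and the
# final answer separately (field name `reasoning_content` is an assumption).
msg = response.choices[0].message
reasoning = getattr(msg, "reasoning_content", None)  # reasoning-phase text
if reasoning is not None:
    print(f"reasoning used ~{len(reasoning)} chars (capped by thinking_budget)")
print("final answer:", msg.content)                  # capped by max_tokens
```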


In addition, if the response length exceeds max_tokens, or the total context exceeds the context_length limit, the model's output will be truncated and the finish_reason field in the response will be set to length, indicating that generation stopped because of a length restriction.
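
Continuing the same example, detecting such truncation is a matter of inspecting finish_reason on the returned choice:

```python
# If either limit cut the output short, finish_reason is reported as
# "length" rather than "stop".
choice = response.choices[0]
if choice.finish_reason == "length":
    # Truncated: raise max_tokens, shorten the prompt, or ask the model
    # to continue from where the output stopped.
    print("Output truncated by a length limit.")
else:
    print("Finished normally:", choice.finish_reason)
```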

Users can consult SiliconCloud's official documentation for further details on API usage. As SiliconCloud continues to iterate, the user experience will keep improving and more features will follow.

https://docs.siliconflow.cn/en/userguide/capabilities/reasoning

Key Points:

🔹 Supports a maximum context length of 128K, enhancing the model's reasoning and output capabilities.  

🔹 Independently controls the length of the reasoning chain and response content, increasing developer flexibility.  

🔹 If length limits are reached, the model's output will be truncated, with reasons clearly marked.