To optimize platform resource scheduling and keep its services stable, Alibaba Cloud announced on April 20 that it will apply new rate limits to the multimodal interaction development kit on its large model service platform, Bailian.
According to the announcement, the change takes effect on April 28, 2026. From that date, the default limit on new connections to the platform's multimodal interaction gateway (that is, the default API call volume) will be set uniformly to 10 QPS (queries per second).
Alibaba Cloud said the adjusted quota was calculated to support 600 new sessions per minute (10 per second × 60 seconds), or 36,000 per hour. According to the company, this is sufficient for most developers' daily debugging and for the stable operation of routine business scenarios.
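The per-minute and per-hour figures follow directly from the 10 QPS limit. A minimal sketch of the arithmetic, where the constant and function names are illustrative and not part of any Bailian SDK:

```python
# Default limit on new connections per second, as stated in the announcement.
QPS_LIMIT = 10

def sessions_supported(qps: int, seconds: int) -> int:
    """Maximum number of new sessions that fit in a window at a steady rate."""
    return qps * seconds

per_minute = sessions_supported(QPS_LIMIT, 60)    # 10 * 60
per_hour = sessions_supported(QPS_LIMIT, 3600)    # 10 * 3600

print(per_minute, per_hour)  # prints "600 36000"
```

These are ceilings under a perfectly even load. Bursty traffic can hit the per-second limit well before the hourly total is reached.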
Notably, the adjustment applies only to the default value. Customers who previously applied through official channels and had their rate-limit quota raised will keep their existing quota and will not be affected by the change.
The move reflects how cloud service providers, facing growing demand for large model calls, are adopting finer-grained traffic management to balance resource allocation between individual developers and enterprise users. Affected developers should assess their call frequency before April 28 to ensure a smooth transition.
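One way to make that assessment is to look at historical connection timestamps and find the busiest second. The sketch below assumes the timestamps are available as non-negative seconds (for example, parsed from application logs) and groups them into fixed one-second buckets. The announcement does not say how the gateway measures its window, so this is an approximation, and all names here are hypothetical:

```python
from collections import Counter

# Default limit from the announcement; customers with an upgraded quota
# would substitute their own value.
DEFAULT_LIMIT_QPS = 10

def peak_new_connections_per_second(timestamps):
    """Return the highest number of new connections in any one-second bucket.

    `timestamps` is an iterable of non-negative times in seconds.
    Fixed buckets are an assumption; the gateway's real window may differ.
    """
    buckets = Counter(int(ts) for ts in timestamps)
    return max(buckets.values(), default=0)

def exceeds_default_limit(timestamps, limit=DEFAULT_LIMIT_QPS):
    """True if the observed peak is above the given per-second limit."""
    return peak_new_connections_per_second(timestamps) > limit

# Illustrative data: 10 connections spread over second 0,
# then a burst of 12 connections inside second 5.
sample = [0.1 * i for i in range(10)] + [5.0 + 0.05 * i for i in range(12)]
print(peak_new_connections_per_second(sample), exceeds_default_limit(sample))
```

A peak above the limit suggests either smoothing out connection bursts on the client side or applying for a higher quota through official channels before the change takes effect.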




