Ant Group's Bai Ling Launches Open-Source Efficient Thinking Models, Significantly Reducing Inference Costs
The Bai Ling team at Ant Group has open-sourced two efficient thinking models, Ring-flash-linear-2.0 and Ring-mini-linear-2.0, designed to improve the efficiency of deep reasoning. Alongside the models, the team released FP8 fused operators and fused linear-attention inference operators, enabling efficient inference under a "large parameters, low activation" design and supporting ultra-long contexts. By pairing architectural optimization with these high-performance operators, the models achieve significant performance gains.
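The "linear attention" mentioned above replaces quadratic softmax attention with a kernelized form in which the key/value history is compressed into a fixed-size running state, so per-token cost stays constant as the context grows; that O(n) scaling is what makes ultra-long contexts affordable. Below is a minimal causal sketch of the general idea; the feature map (elementwise `exp`), scalar values, and all shapes are illustrative assumptions, not the Ring models' actual implementation:

```python
import math

def linear_attention(queries, keys, values, phi=None):
    """Toy causal linear attention.

    queries/keys: lists of d-dimensional vectors (lists of floats);
    values: list of scalars. Instead of recomputing n pairwise scores
    per step (O(n^2) overall), the key/value history is folded into a
    running state (s, z) of fixed size, giving O(n) total work.
    """
    if phi is None:
        # Illustrative positive feature map (an assumption, not the
        # kernel any particular model uses): elementwise exp.
        phi = lambda vec: [math.exp(x) for x in vec]
    d = len(queries[0])
    s = [0.0] * d  # running sum of phi(k_t) * v_t
    z = [0.0] * d  # running normalizer, sum of phi(k_t)
    out = []
    for q, k, v in zip(queries, keys, values):
        fk = phi(k)
        for i in range(d):
            s[i] += fk[i] * v
            z[i] += fk[i]
        fq = phi(q)
        num = sum(fq[i] * s[i] for i in range(d))
        den = sum(fq[i] * z[i] for i in range(d))
        out.append(num / den)  # weighted average of values seen so far
    return out
```

Because the weights are positive and normalized, each output is a convex combination of the values seen so far, and the state `(s, z)` never grows with sequence length.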