New Breakthrough in Domestic Computing Power! Moore Threads × SiliconFlow Achieve Efficient Inference of DeepSeek V3 671B on MTT S5000, with Single-Card Performance Approaching Top International Standards
Domestic AI chips and large models have made significant progress in joint optimization. Moore Threads and SiliconFlow have successfully adapted the 671B-parameter DeepSeek V3 model to the domestic MTT S5000 GPU. Using FP8 low-precision inference, they report a prefill throughput of over 4,000 tokens/s and a decode throughput of over 1,000 tokens/s, approaching the performance of high-end international AI accelerators...