ByteDance's Seed team has recently released its latest open-source large language model, Seed-OSS-36B, on the AI model-sharing platform Hugging Face. The new model focuses on advanced reasoning and developer-friendliness. Its headline feature is support for input contexts of up to 512,000 tokens, longer than the context windows of many models from American tech companies such as OpenAI and Anthropic.
The Seed-OSS-36B series includes three main variants: Seed-OSS-36B-Base (with synthetic data), Seed-OSS-36B-Base (without synthetic data), and Seed-OSS-36B-Instruct. The version trained with synthetic data performs better on standard benchmarks and is suited to general use, while the version without synthetic data gives researchers a cleaner foundation to build on. Seed-OSS-36B-Instruct is fine-tuned for task execution and instruction following.
All models are released under the Apache-2.0 license, which means researchers and developers can freely use, modify, and redistribute them without paying any licensing fees to ByteDance. This marks another important step for Chinese companies in open-source models and opens up more options for international use.
The core design of Seed-OSS-36B includes 36 billion parameters, a 64-layer architecture, and a vocabulary of 155,000 tokens. Alongside its long-context processing, the model offers a reasoning budget setting that lets developers adjust how deeply the model reasons according to the complexity of the task. The model has also performed strongly on multiple benchmarks, achieving industry-leading results in mathematical and programming tasks.
The Seed team also pays special attention to the accessibility of the model. Users can deploy it through Hugging Face Transformers, and it supports 4-bit and 8-bit quantization formats to reduce memory requirements. In addition, the team provides scripts for inference, prompt customization, and tool integration, further lowering the barrier to entry for small teams.
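To give a rough sense of what the quantization options mean for memory, the sketch below estimates the storage needed for the weights alone at 16-bit, 8-bit, and 4-bit precision. It assumes a parameter count of 36 billion, taken from the "36B" in the model name, and it ignores the KV cache, activations, and any per-format overhead, so real requirements will be higher. The function names are illustrative and are not part of any Seed-OSS tooling.

```python
# Back-of-the-envelope estimate of weight storage at different precisions.
# Assumption: 36 billion parameters, inferred from the "36B" in the model name.
# Not included: KV cache, activations, or quantization metadata overhead.
PARAMS = 36_000_000_000

# Bits used to store each weight under each (illustrative) format label.
BITS_PER_WEIGHT = {"bf16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(num_params: int, bits: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


def estimate_all(num_params: int = PARAMS) -> dict:
    """Map each format label to its estimated weight storage in GB."""
    return {
        name: weight_memory_gb(num_params, bits)
        for name, bits in BITS_PER_WEIGHT.items()
    }


if __name__ == "__main__":
    for name, gb in estimate_all().items():
        print(f"{name}: ~{gb:.0f} GB of weights")
```

Under these assumptions, 4-bit weights take about a quarter of the space of 16-bit weights, which is the main reason quantized formats matter to teams with limited GPU memory.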
By offering high-performance open models with flexible deployment options, ByteDance's Seed team gives enterprises, researchers, and developers a new choice.
Hugging Face: https://huggingface.co/collections/ByteDance-Seed/seed-oss-68a609f4201e788db05b5dcd
Key Points:
🌟 The Seed-OSS-36B model supports input of up to 512,000 tokens, a longer context window than many competing models offer.
💡 The model comes in versions with and without synthetic data to meet different user needs.
🔧 All models are available for free and support various deployment and integration options, making them easy for developers to use.