This open-source large language model is an INT4-quantized version of DeepSeek-R1-0528-Qwen3-8B, produced with Intel's AutoRound algorithm. Quantization substantially reduces the model's size and inference resource requirements while aiming to preserve the base model's performance, making it suitable for efficient inference on CPUs, Intel GPUs, and CUDA GPUs.
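To give a rough sense of the size reduction, the sketch below estimates weight storage for a 16-bit baseline versus weight-only INT4. Several values are assumptions rather than facts about this model: the parameter count is taken as a nominal 8 billion (from the "8B" in the name), and the group size of 128, 16-bit scales, and 4-bit zero points are illustrative quantization settings. The helper `weight_bytes` is hypothetical and not part of AutoRound.

```python
from typing import Optional


def weight_bytes(
    num_params: int,
    bits_per_weight: int,
    group_size: Optional[int] = None,
    scale_bits: int = 16,
    zero_point_bits: int = 4,
) -> float:
    """Approximate weight storage in bytes.

    With group-wise quantization, each group of `group_size` weights
    also stores one scale and one zero point.
    """
    total_bits = num_params * bits_per_weight
    if group_size is not None:
        num_groups = num_params / group_size
        total_bits += num_groups * (scale_bits + zero_point_bits)
    return total_bits / 8


# Assumption: nominal parameter count implied by "8B" in the model name.
NUM_PARAMS = 8_000_000_000

bf16 = weight_bytes(NUM_PARAMS, 16)
# Assumption: group size 128 with 16-bit scales and 4-bit zero points.
int4 = weight_bytes(NUM_PARAMS, 4, group_size=128)

print(f"16-bit weights: {bf16 / 1e9:.2f} GB")
print(f"INT4 weights:   {int4 / 1e9:.2f} GB")
print(f"Reduction:      {bf16 / int4:.2f}x")
```

Under these assumptions the reduction comes out somewhat below 4x because of the per-group metadata. Real checkpoints are typically larger than this estimate, since some layers (for example embeddings) are often kept in higher precision.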
Natural Language Processing
Transformers