neuralmagic
INT8-quantized version of DeepSeek-R1-Distill-Qwen-32B, with both weights and activations quantized to reduce GPU memory usage and improve computational efficiency.
INT8-quantized version of DeepSeek-R1-Distill-Qwen-14B, with both weights and activations quantized to reduce GPU memory requirements and improve computational efficiency.
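To make the memory claim concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in NumPy. It is illustrative only (the function names `quantize_int8`/`dequantize_int8` and the per-tensor scheme are assumptions for the example, not Neural Magic's exact recipe, which uses calibrated per-channel/per-token scales), but it shows why INT8 storage is roughly 4x smaller than FP32:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization: map floats into [-127, 127]."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate FP32 values from the INT8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)  # stand-in for a weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

# INT8 stores 1 byte per value vs 4 bytes for FP32: ~4x smaller weights.
assert q.nbytes * 4 == w.nbytes
# Rounding error is bounded by half a quantization step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

The same idea applied to activations (quantizing them on the fly with a calibrated scale) is what lets INT8 matrix-multiply kernels run the compute path at lower precision as well, rather than only shrinking storage.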