This project provides GGUF quantized files for Vikhr-Nemo-12B-Instruct, a 12-billion-parameter large language model optimized for instruction-following tasks. Quantization schemes such as IQ4_XS, Q2_K, Q3_K_M, Q4_K_M, Q5_K_M, Q6_K, and Q8_0 compress the model to different size/quality trade-offs, enabling it to run efficiently on consumer-grade hardware while maintaining good performance.
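To get a feel for how the quant level affects file size, the sketch below estimates the on-disk footprint of a 12B-parameter model from approximate bits-per-weight figures. The bits-per-weight values are rough community estimates, not official numbers for this repo's files; actual GGUF sizes also include embeddings and metadata, so treat the output as a ballpark.

```python
# Rough size estimates for a 12B-parameter model under different GGUF
# quantization schemes. Bits-per-weight (bpw) figures are approximate
# community values, assumed here for illustration only.
PARAMS = 12e9  # 12 billion weights

BITS_PER_WEIGHT = {
    "Q2_K":   2.6,
    "Q3_K_M": 3.9,
    "IQ4_XS": 4.3,
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q6_K":   6.6,
    "Q8_0":   8.5,
}

def estimated_size_gb(params: float, bpw: float) -> float:
    """Approximate file size in GB: weights * bits / 8 bits-per-byte / 1e9."""
    return params * bpw / 8 / 1e9

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name:7s} ~{estimated_size_gb(PARAMS, bpw):5.1f} GB")
```

The spread illustrates the core trade-off: Q2_K fits in far less memory but loses the most quality, while Q8_0 is nearly lossless at roughly three times the size.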
Natural Language Processing
GGUF