This is a quantized version of NVIDIA's Nemotron-Nano-9B-v2 model, produced with llama.cpp release b6317. Several quantization variants are provided, including bf16, Q8_0, and Q6_K_L, so users can choose the one that best fits their hardware and deployment needs.
Natural Language Processing
GGUF
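
As a rough usage sketch, one way to load a GGUF quant like the ones listed above is through the llama-cpp-python bindings; the file name, context size, and sampling settings below are illustrative assumptions, not part of this repository:

```python
# Minimal sketch: load a quantized GGUF file with the llama-cpp-python bindings.
# The file name below is hypothetical; point model_path at the variant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="NVIDIA-Nemotron-Nano-9B-v2-Q8_0.gguf",  # illustrative local file name
    n_ctx=4096,        # context window; adjust to available memory
    n_gpu_layers=-1,   # offload all layers to GPU if a GPU-enabled build is installed
)

output = llm(
    "Explain in one sentence what GGUF quantization is.",
    max_tokens=128,
)
print(output["choices"][0]["text"])
```

Lower-bit variants such as Q6_K_L trade some accuracy for a smaller memory footprint, while bf16 keeps full precision at the cost of size; the same loading code applies to any of the files.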