This is a GGUF-quantized release of the ai-sage/GigaChat3-10B-A1.8B-bf16 model, offered in a range of quantization levels, from high-precision Q8_0 down to the heavily compressed smol-IQ1_KT, to suit different hardware and deployment constraints. The model supports a 32K context length, uses the MLA (Multi-head Latent Attention) architecture, and is optimized for dialogue.
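GGUF files in this form are typically served with llama.cpp. A minimal usage sketch follows; the repository id and the `.gguf` filename are placeholders, not taken from this card, so substitute the actual quant file you want (e.g. the Q8_0 or smol-IQ1_KT variant) from the repository's file list:

```shell
# Download one quantized file from the hub (repo id and filename are placeholders).
huggingface-cli download <repo-id> <quant-file>.gguf --local-dir ./models

# Start an interactive chat session with the full 32K context window.
llama-cli -m ./models/<quant-file>.gguf --ctx-size 32768 -cnv
```

Smaller quants (e.g. smol-IQ1_KT) trade answer quality for a much lower memory footprint, so pick the largest quant that fits your RAM/VRAM budget.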