GigaChat3-10B-A1.8B is a dialogue model in the GigaChat series built on a Mixture-of-Experts (MoE) architecture, with 10 billion total parameters, of which 1.8 billion are active per token. The model uses Multi-head Latent Attention (MLA) and Multi-Token Prediction (MTP), supports a long context of up to 256,000 tokens, and performs strongly on multilingual dialogue and reasoning tasks.
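The "10B total / 1.8B active" split comes from MoE routing: each token is processed by only a few experts, so most parameters sit idle on any given forward pass. The toy sketch below illustrates the principle with top-k routing; the expert count, top-k value, and layer sizes are made-up illustrative numbers, not GigaChat3's actual configuration.

```python
# Toy sketch of Mixture-of-Experts top-k routing (illustrative only;
# expert count, top_k, and sizes are hypothetical, not GigaChat3's config).
import numpy as np

rng = np.random.default_rng(0)

n_experts = 8   # hypothetical number of experts
top_k = 2       # experts activated per token
d_model = 16    # toy hidden size
d_ff = 32       # toy expert feed-forward size

# Each expert is a tiny two-layer MLP: d_model -> d_ff -> d_model.
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(n_experts)
]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route one token vector x through its top-k experts only."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]              # indices of top-k experts
    w = np.exp(logits[chosen])
    w /= w.sum()                                      # softmax over chosen experts
    out = np.zeros_like(x)
    for weight, i in zip(w, chosen):
        w1, w2 = experts[i]
        out += weight * (np.maximum(x @ w1, 0.0) @ w2)  # ReLU MLP expert
    return out, chosen

x = rng.standard_normal(d_model)
y, chosen = moe_forward(x)

# Only top_k of n_experts run per token, so active expert parameters are a
# fraction of the total -- the same principle behind 1.8B active of 10B total.
params_per_expert = d_model * d_ff + d_ff * d_model
total_params = n_experts * params_per_expert
active_params = top_k * params_per_expert
print(f"active/total expert params: {active_params}/{total_params}")
```

Per-token compute therefore scales with the number of activated experts rather than the full parameter count, which is why a 10B-parameter MoE model can have inference cost closer to a 1.8B dense model.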
Tags: Natural Language Processing · Safetensors · Multiple Languages