This is the GGUF quantized version of Mistral-Large-3-675B-Instruct-2512, a large language model developed by Mistral AI. The original model has 675 billion parameters and is tuned for instruction following. Using llama.cpp together with an imatrix calibration dataset, this project produces more than 20 quantized model files at precisions ranging from Q8_0 down to IQ1_S, balancing model quality, inference speed, and storage/memory footprint so the model can run on a wider range of hardware.
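The quantization workflow described above can be sketched with llama.cpp's command-line tools. This is a minimal, hedged example: the file names (`calibration.txt`, `model-f16.gguf`, output paths) are placeholders, and the exact binary names may differ by llama.cpp version (older releases used `imatrix` and `quantize` without the `llama-` prefix).

```shell
# 1. Compute an importance matrix from a calibration dataset.
#    calibration.txt and model-f16.gguf are assumed placeholder paths.
llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# 2. Quantize with the imatrix to improve low-bit quality.
#    Repeat with different type arguments (Q8_0, Q4_K_M, IQ2_XS, IQ1_S, ...)
#    to produce the full range of precision variants.
llama-quantize --imatrix imatrix.dat model-f16.gguf model-Q4_K_M.gguf Q4_K_M
llama-quantize --imatrix imatrix.dat model-f16.gguf model-IQ1_S.gguf IQ1_S
```

Imatrix-guided quantization weights the rounding error by how important each tensor element is on the calibration data, which matters most for the aggressive low-bit types such as IQ2 and IQ1.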
Tags: Natural Language Processing · GGUF · Multiple Languages