This is a quantized version of Qwen3-Coder-REAP-25B-A3B, optimized for Mac devices. It uses the Deckard (qx) quantization formula: 6-bit quantization for the embedding layer, heads, and selected attention paths, and 5-bit quantization for all remaining weights, with a group size of 32. The result runs more efficiently while keeping quality close to that of q8 quantization.
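To make the mixed 6-bit/5-bit layout concrete, here is a minimal sketch of how a per-layer bit width could be chosen and what it costs in storage. It is an illustration only, not the actual Deckard (qx) recipe: the name patterns in `HIGH_PRECISION_HINTS` are hypothetical, and the storage estimate assumes each group of 32 weights carries one 16-bit scale and one 16-bit bias.

```python
# Hypothetical substrings marking layers kept at 6 bits. The real Deckard (qx)
# selection of "attention paths" is not specified here, so these are assumptions.
HIGH_PRECISION_HINTS = ("embed", "lm_head", "v_proj")

GROUP_SIZE = 32  # group size stated in the description


def choose_bits(layer_name: str) -> int:
    """Return 6 for layers matching a high-precision hint, otherwise 5."""
    return 6 if any(hint in layer_name for hint in HIGH_PRECISION_HINTS) else 5


def bits_per_weight(bits: int, group_size: int = GROUP_SIZE, meta_bits: int = 32) -> float:
    """Effective storage per weight: payload bits plus the per-group metadata share.

    meta_bits is the assumed metadata per group (16-bit scale + 16-bit bias).
    """
    return bits + meta_bits / group_size


if __name__ == "__main__":
    # Example layer names are illustrative, not taken from the model.
    for name in (
        "model.embed_tokens",
        "model.layers.0.self_attn.v_proj",
        "model.layers.0.mlp.down_proj",
        "lm_head",
    ):
        bits = choose_bits(name)
        print(f"{name}: {bits}-bit, ~{bits_per_weight(bits):.1f} bits/weight stored")
```

Under these assumptions, the group metadata adds one bit per weight at a group size of 32, so a smaller group size trades some of the size savings for finer-grained scaling.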
Natural Language Processing
MLX, English