ArtQuantization
ArtQuantization is a project for quantizing Large Language Models, with a focus on reducing memory usage while preserving performance. The repository provides experimental results from quantizing models such as Qwen2.5 with algorithms like AWQ and GPTQ, and documents the memory requirements under various graphics card configurations.
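As a rough guide to the memory figures discussed here, weight-only footprint can be estimated from parameter count and bit-width. The helper below is a hypothetical sketch (not from this repository); it ignores KV cache, activations, and quantization metadata such as scales and zero-points, so real usage will be somewhat higher.

```python
def estimate_weight_memory_gib(num_params_billion: float, bits_per_weight: int) -> float:
    """Rough weight-only memory estimate in GiB.

    Ignores KV cache, activations, and quantization metadata
    (scales/zero-points), so actual VRAM usage will be higher.
    """
    total_bytes = num_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / (1024 ** 3)

# Example: a 7B model (e.g. Qwen2.5-7B) in FP16 vs 4-bit (AWQ/GPTQ-style)
fp16 = estimate_weight_memory_gib(7, 16)  # ~13.0 GiB
int4 = estimate_weight_memory_gib(7, 4)   # ~3.3 GiB
print(f"FP16: {fp16:.1f} GiB, INT4: {int4:.1f} GiB")
```

By this estimate, a 4-bit quantized 7B model fits comfortably on a single consumer GPU, whereas the FP16 weights alone nearly exhaust a 16 GB card.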
Created: 2025-04-28T16:11:53
Updated: 2025-05-17T20:43:20
Stars: 1