My AdaRound Code
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Lossy PNG compressor — pngquant command based on libimagequant library
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Fast inference engine for Transformer models
Vector (and Scalar) Quantization, in PyTorch
Sparsity-aware deep learning inference runtime for CPUs
Pretrained language models and related optimization techniques developed by Huawei Noah's Ark Lab.
Accelerate inference and training of Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks