
Survey-of-Quantization-Formats


A survey of modern quantization formats (e.g., MXFP8, NVFP4) and inference optimization tools (e.g., TorchAO, GemLite), illustrated through the example of Llama-3.1 inference.

Created: 2025-08-19T01:51:09
Updated: 2025-10-28T06:51:57
Stars: 8
Stars increase: 0
