This article introduces the approach proposed by Tianqi Chen's TVM team for running large language model inference on AMD GPUs. With their optimizations, the AMD Radeon RX 7900 XTX reaches roughly 80% of the performance of NVIDIA's RTX 4090. The author also introduces MLC-LLM, a tool for high-performance universal deployment that makes AMD GPUs competitive for large language model inference. The article argues that hardware shortages can be mitigated through software improvements and optimization, and it provides a concrete implementation scheme along with performance benchmark results.
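
To make the MLC-LLM claim more concrete, the sketch below shows roughly how inference on an AMD card could be invoked through MLC-LLM's Python interface. This is a minimal sketch, not the article's own code: the `mlc_chat.ChatModule` API, the quantized model name, and the `device="rocm"` setting are assumptions drawn from MLC-LLM's public documentation.

```python
# Minimal sketch of LLM inference on an AMD GPU via MLC-LLM (assumed API).
# Assumes the `mlc_chat` package is installed with ROCm support and that a
# pre-quantized model such as "Llama-2-7b-chat-hf-q4f16_1" is available locally;
# the model name and device string are illustrative, not taken from the article.
from mlc_chat import ChatModule

# Target the ROCm backend so the model runs on an AMD GPU (e.g. RX 7900 XTX).
cm = ChatModule(model="Llama-2-7b-chat-hf-q4f16_1", device="rocm")

# Generate a completion; MLC-LLM handles model loading and kernel dispatch.
print(cm.generate(prompt="What makes AMD GPUs viable for LLM inference?"))
```

The key point the example illustrates is that switching hardware vendors is reduced to a device selection, which is what the article means by "universal deployment."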