Recently, MLX-LM was directly integrated into the Hugging Face platform. This milestone gives users of Apple Silicon devices (including M1, M2, M3, and M4 chips) unprecedented convenience: they can now run over 4,400 large language models (LLMs) locally at full speed, without relying on cloud services or waiting for model conversion.


This integration further promotes the popularization of localized AI development, providing developers and researchers with more efficient and flexible tools.

Behind this deep integration of MLX-LM with Hugging Face is MLX, a machine learning framework developed by Apple's machine learning research team and optimized specifically for Apple Silicon. It aims to fully exploit the performance of the Neural Engine (ANE) and Metal GPU in M-series chips.

As a sub-package of MLX, MLX-LM focuses on the training and inference of large language models. In recent years, it has gained significant attention due to its efficiency and ease of use. Through integration with Hugging Face, MLX-LM can now load models directly from the Hugging Face Hub without additional conversion steps, greatly simplifying the workflow.
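As a rough illustration of that workflow, the sketch below loads a model straight from the Hugging Face Hub with the `mlx_lm` Python package and runs a prompt through it. It assumes Apple Silicon hardware with `mlx-lm` installed (`pip install mlx-lm`); the model ID shown is just one example from the mlx-community organization, not a recommendation.

```python
# Sketch: running a Hugging Face Hub model locally with MLX-LM.
# Assumes Apple Silicon and `pip install mlx-lm`; the model ID is an
# illustrative example from the mlx-community organization.
from mlx_lm import load, generate

# load() pulls the weights directly from the Hugging Face Hub --
# no separate conversion step is needed for MLX-tagged repositories.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

prompt = "Summarize what unified memory means for local LLM inference."
text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(text)
```

The same package also ships a command-line entry point (`mlx_lm.generate`) for quick one-off runs without writing any Python.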

Model listing: https://huggingface.co/models?library=mlx