MobileLLM-125M

An efficient small language model optimized for on-device applications.

Tags: Common Product, Programming, Language Model, Device-side Applications
MobileLLM-125M is an autoregressive language model developed by Meta. It uses an optimized transformer architecture designed for resource-constrained, on-device applications. The model combines several key techniques: the SwiGLU activation function, a deep-and-thin architecture, embedding sharing, and grouped query attention. On zero-shot commonsense reasoning tasks, MobileLLM-125M and MobileLLM-350M improve accuracy by 2.7% and 4.3%, respectively, over the previous state-of-the-art 125M and 350M models. The same design principles scale to larger models: MobileLLM-600M, 1B, and 1.5B all achieve state-of-the-art results.
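Three of the techniques named above can be illustrated in a few lines of plain Python. This is a minimal sketch for intuition only, not Meta's implementation: the function names, the list-based matrices, and the sizes in the usage note are all assumptions made for illustration.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a row-major matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward layer: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


def kv_head_for_query_head(query_head, n_query_heads, n_kv_heads):
    """Grouped query attention: return the key/value head a query head shares."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be a multiple of n_kv_heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


def embedding_params(vocab_size, dim, shared):
    """Parameter count of the input embedding plus the output projection.

    With sharing, one vocab_size x dim table serves both roles.
    """
    table = vocab_size * dim
    return table if shared else 2 * table
```

With illustrative (hypothetical) sizes of a 32,000-token vocabulary and a 512-wide embedding, `embedding_params(32000, 512, shared=False)` counts two tables while `shared=True` counts one, which is why sharing matters most in small models where the embedding table is a large fraction of the total. Likewise, `kv_head_for_query_head(5, 8, 2)` shows how eight query heads would be split into two groups that each reuse one key/value head.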

MobileLLM-125M Visit Over Time

Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57
