Samba is a simple yet powerful hybrid model with unlimited context length. Its architecture is straightforward: Samba = Mamba + MLP + Sliding Window Attention + MLP, stacked at the layer level. The Samba-3.8B model was trained on 3.2 trillion tokens from the Phi3 dataset and significantly outperforms Phi3-mini on major benchmarks (e.g., MMLU, GSM8K, and HumanEval). Samba can also achieve perfect long-context retrieval ability with minimal instruction tuning while maintaining linear complexity with respect to sequence length. This enables Samba-3.8B-instruct to excel in downstream tasks such as long-context summarization.
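
The layer-wise stacking pattern can be illustrated with a minimal PyTorch sketch. This is not the repository's implementation: the Mamba block is replaced by a simplified gated causal-convolution stand-in (the real block is a selective state-space model), and names such as `SambaLayer` and `MambaStandIn` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MLP(nn.Module):
    """Standard feed-forward block."""
    def __init__(self, d_model: int, expand: int = 4):
        super().__init__()
        self.up = nn.Linear(d_model, expand * d_model)
        self.down = nn.Linear(expand * d_model, d_model)

    def forward(self, x):
        return self.down(F.silu(self.up(x)))


class SlidingWindowAttention(nn.Module):
    """Causal self-attention restricted to a fixed-size local window."""
    def __init__(self, d_model: int, n_heads: int, window: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.window = window

    def forward(self, x):
        T = x.size(1)
        i = torch.arange(T, device=x.device)
        dist = i[:, None] - i[None, :]
        # Mask future tokens and tokens farther back than the window.
        mask = (dist < 0) | (dist >= self.window)
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out


class MambaStandIn(nn.Module):
    """Simplified stand-in for a Mamba block: gated causal depthwise conv.
    (The actual Mamba block is a selective state-space model.)"""
    def __init__(self, d_model: int, kernel: int = 4):
        super().__init__()
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel, groups=d_model,
                              padding=kernel - 1)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        z, gate = self.in_proj(x).chunk(2, dim=-1)
        T = x.size(1)
        # Depthwise convolution, truncated to keep causality.
        z = self.conv(z.transpose(1, 2))[..., :T].transpose(1, 2)
        return self.out_proj(F.silu(z) * torch.sigmoid(gate))


class SambaLayer(nn.Module):
    """One Samba layer group: Mamba -> MLP -> Sliding Window Attention -> MLP,
    each sub-block pre-normalized with a residual connection."""
    def __init__(self, d_model: int, n_heads: int, window: int):
        super().__init__()
        self.blocks = nn.ModuleList([
            MambaStandIn(d_model),
            MLP(d_model),
            SlidingWindowAttention(d_model, n_heads, window),
            MLP(d_model),
        ])
        self.norms = nn.ModuleList([nn.LayerNorm(d_model) for _ in self.blocks])

    def forward(self, x):
        for norm, block in zip(self.norms, self.blocks):
            x = x + block(norm(x))
        return x


if __name__ == "__main__":
    model = nn.Sequential(*[SambaLayer(d_model=64, n_heads=4, window=8)
                            for _ in range(2)])
    tokens = torch.randn(1, 32, 64)  # (batch, seq_len, d_model)
    print(model(tokens).shape)       # torch.Size([1, 32, 64])
```

Because attention is confined to a fixed window while the recurrent-style block carries information beyond it, the per-token cost stays constant and the overall cost grows linearly with sequence length, which is what allows the context length to be extended without retraining on longer sequences.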