Jet - Nemotron - 4B is an efficient hybrid architecture language model launched by NVIDIA. It is built based on two core innovations: post - neural architecture search and the JetBlock linear attention module. In terms of performance, it surpasses open - source models such as Qwen3, Qwen2.5, Gemma3, and Llama3.2. At the same time, it achieves a maximum of 53.6 times acceleration in generation throughput on the H100 GPU.
Natural Language Processing
TransformersEnglish