Qwen3-VL-4B-Instruct is a 4-billion-parameter, instruction-tuned multimodal large language model released by the Alibaba Cloud Qwen team. Optimized for Qualcomm NPUs, it combines strong vision-language understanding with dialogue fine-tuning and is suited to practical applications such as chat-based reasoning, document analysis, and visual dialogue.
Multimodal
GGUF