InternVL3-8B is a multimodal large language model with strong perception and reasoning capabilities across modalities. It performs well in areas such as tool use, GUI agents, and industrial image analysis.
Multimodal
Transformers
Other