Researchers from the University of Wisconsin-Madison, Microsoft Research, and Columbia University have jointly developed LLaVA-1.5, a new multimodal AI system that sets new records across 11 benchmarks and rivals GPT-4V in multimodal understanding, establishing itself as a serious contender. LLaVA-1.5 achieves these results with a simple system architecture and publicly available datasets, demonstrating that an appropriately designed open-source model can be formidable and offering a useful reference point for future AI development. Its open-source release fills a gap in multimodal AI, and the model is widely regarded in the industry as a strong contender to "challenge GPT-4."