Microsoft Open-Sources Multimodal Model LLaVA-1.5, with Performance Comparable to GPT-4V
Microsoft has open-sourced LLaVA-1.5, a multimodal model that builds on the LLaVA architecture and adds new features. Researchers evaluated it on visual question answering, natural language processing, image generation, and other tasks, and report that LLaVA-1.5 achieves the best results among open-source models.