
AI News


LLaVA-OneVision-1.5, a Fully Open-Source Multimodal Model That Exceeds Qwen2.5-VL

LLaVA-OneVision-1.5 is a multimodal model that evolved over two years from basic image-text alignment to handling both images and videos. It offers an open, efficient training framework for building high-quality vision-language models through three-stage training…


Models


LLaVA OneVision 1.5 8B Instruct

lmms-lab


LLaVA-OneVision-1.5 is a series of fully open-source large multimodal models that achieve strong performance at lower cost by training on native-resolution images. The model performs well across multiple multimodal benchmarks, surpassing competitors such as Qwen2.5-VL.

Tags: Multimodal, Transformers
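For context, the sketch below shows one way such a checkpoint could be queried through the Hugging Face transformers "image-text-to-text" pipeline. The repository id, the need for trust_remote_code, and the chat-style message format are assumptions inferred from the model card above, not confirmed details of this release.

```python
# Minimal usage sketch (assumptions: repo id, custom modeling code, chat format).
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="lmms-lab/LLaVA-OneVision-1.5-8B-Instruct",  # assumed repo id
    trust_remote_code=True,  # the checkpoint may ship custom modeling code
)

# Chat-style input: one user turn containing an image URL and a text prompt.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/sample.jpg"},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

out = pipe(text=messages, max_new_tokens=64)
print(out[0]["generated_text"])
```

If the repository does not expose a pipeline-compatible processor, the same idea applies with AutoProcessor and the model class named in the model card.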