ByteDance and university partners launch Sa2VA, which integrates LLaVA for video understanding with SAM-2 for precise object segmentation, combining the two models' complementary capabilities for richer video analysis.
LLaVA-OneVision-1.5 is the latest step in a series that has evolved over two years from basic image-text alignment to unified image and video understanding. It provides an open, efficient three-stage training framework for building high-quality vision-language models.
AI highlights: 1. Alibaba's Qwen-Image excels at Chinese text rendering. 2. ChatGPT reaches 700M users and OpenAI's revenue hits $12B. 3. Anthropic tests Claude 4.1. 4. Zhipu launches Zread.ai. 5. xAI's Grok Imagine generates text-to-video. 6. Character.AI adds social features. 7. Alibaba and Nankai University release LLaVA-Scissor. 8. Beijing sets out its humanoid robot vision. 9. Eight AI models face off in Kaggle chess. 10. OpenMind launches the OM1 OS.
LLaVA-Mini is an efficient large multimodal model for image and video understanding.
A visual language model capable of step-by-step reasoning.
A vision-language model that processes both image and text inputs.
Research on video instruction tuning and synthetic data.
[Model API pricing table: Input tokens/M, Output tokens/M, and Context Length for models from Baidu, Alibaba, Google, Tencent, Deepseek, 01-ai, Baichuan, and Bytedance]
lmms-lab
LLaVA-OneVision-1.5 is a series of fully open-source large multimodal models that achieve state-of-the-art performance at lower cost by training on native-resolution images. The series performs strongly across multiple multimodal benchmarks, surpassing competitors such as Qwen2.5-VL.
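As a rough illustration of how such a checkpoint might be loaded, the sketch below goes through the generic transformers image-text-to-text interface; the repo id, the choice of Auto classes, and the chat-template call are assumptions rather than the official loading recipe.

```python
# Hedged sketch: loading a LLaVA-OneVision-1.5 checkpoint via generic
# transformers entry points. Repo id and Auto classes are assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "lmms-lab/LLaVA-OneVision-1.5-8B-Instruct"  # hypothetical repo id

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
)

# One image plus one question in a single-turn chat.
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image in one sentence."},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

image = Image.open("example.jpg")  # any local test image
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```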
rp-yu
Dimple is the first discrete diffusion multimodal large language model (DMLLM) that combines autoregressive and diffusion training paradigms. After training on the same dataset as LLaVA-NEXT, it outperforms LLaVA-NEXT-7B by 3.9%.
Marwan02
This model is a GGUF format conversion of llava-hf/llava-1.5-7b-hf, supporting image-to-text generation tasks.
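A minimal sketch of running such a GGUF conversion locally with llama-cpp-python; the model and mmproj file names below are placeholders, since LLaVA GGUF repositories normally ship a language-model GGUF plus a separate CLIP projector (mmproj) file, and the exact files in this upload may differ.

```python
# Hedged sketch: local inference on a LLaVA-1.5 GGUF with llama-cpp-python.
# File names are placeholders for whatever the repository actually ships.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

chat_handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")
llm = Llama(
    model_path="llava-1.5-7b.Q4_K_M.gguf",
    chat_handler=chat_handler,
    n_ctx=4096,  # leave room for the image tokens
)

response = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "file:///path/to/image.jpg"}},
            {"type": "text", "text": "What is shown in this image?"},
        ],
    }],
)
print(response["choices"][0]["message"]["content"])
```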
rogerxi
Spatial-LLaVA-7B is a multimodal model fine-tuned from LLaVA that focuses on improving spatial-relationship reasoning, suitable for multimodal research and chatbot development.
SpursgoZmy
Table LLaVA 7B is an open-source multimodal chatbot specifically designed to understand table images and can perform various table-related tasks such as table question answering, table cell description, and structure understanding. This model is based on the LLaVA-v1.5 architecture, using CLIP-ViT-L-336px as the visual encoder and Vicuna-v1.5-7B as the base large language model.
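For readers unfamiliar with that design, the schematic sketch below shows the LLaVA-v1.5-style wiring the entry describes (vision encoder, MLP projector, LLM); the projector shape follows the published LLaVA-1.5 design, but this is an illustration, not Table LLaVA's actual code.

```python
# Schematic sketch of a LLaVA-v1.5-style model: CLIP vision encoder -> two-layer
# MLP projector -> LLM. The encoder and LLM are passed in as placeholders.
import torch
import torch.nn as nn

class LlavaStyleModel(nn.Module):
    def __init__(self, vision_encoder, language_model, vision_dim=1024, llm_dim=4096):
        super().__init__()
        self.vision_encoder = vision_encoder  # e.g. CLIP-ViT-L/14-336px
        self.language_model = language_model  # e.g. Vicuna-v1.5-7B
        # LLaVA-1.5 projects visual patch features into the LLM embedding
        # space with a two-layer MLP.
        self.projector = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, pixel_values, text_embeds):
        # (batch, num_patches, vision_dim) patch features from the image
        visual_feats = self.vision_encoder(pixel_values)
        visual_tokens = self.projector(visual_feats)  # -> LLM embedding space
        # Prepend projected visual tokens to the text token embeddings so the
        # LLM attends over both modalities jointly.
        inputs_embeds = torch.cat([visual_tokens, text_embeds], dim=1)
        return self.language_model(inputs_embeds=inputs_embeds)
```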
mradermacher
This project provides weighted/imatrix quantized versions of the llava-1.5-13b-hf model, in a range of quantization types to suit different usage scenarios.
tsunghanwu
SESAME is an open-source multimodal model built on LLaVA and fine-tuned on a variety of instruction-based image localization (segmentation) datasets.
This is a static quantized version of the llava-hf/llava-1.5-13b-hf model, offering multiple quantization type options to help users use this vision-language model more efficiently. The model supports image understanding and text generation tasks.
nkkbr
ViCA-7B is a vision-language model fine-tuned specifically for visual-spatial reasoning in indoor video environments. Built on the LLaVA-Video-7B-Qwen2 architecture and trained using the ViCA-322K dataset, it emphasizes structured spatial annotation and instruction-based complex reasoning tasks.
aiden200
A fine-tuned version of the lmms-lab/llava-onevision-qwen2-7b-ov model, supporting video-text-to-text tasks.
nezahatkorkmaz
Based on the Microsoft LLaVA-Med v1.5 (Mistral 7B) architecture, customized for Turkish-language medical visual question answering (VQA).
MLAdaptiveIntelligence
LLaVAction is a multimodal large language model evaluation and training framework for action recognition, based on the Qwen2 language model architecture, supporting first-person perspective video understanding.
LLaVAction is a multimodal large language model for action recognition, based on the Qwen2 language model, trained on the EPIC-KITCHENS-100-MQA dataset.
FriendliAI
LLaVA-NeXT-Video-7B-hf is a video-based multimodal model capable of processing video and text inputs to generate text outputs.
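A minimal sketch of that video-text-to-text flow through the llava-hf integration in transformers; the random frame array is only a stand-in so the example stays self-contained, and preprocessing details may differ from the model card.

```python
# Hedged sketch: video + text -> text with LLaVA-NeXT-Video via transformers.
# The dummy frames stand in for real decoded video (e.g. from PyAV or decord).
import numpy as np
import torch
from transformers import (
    LlavaNextVideoForConditionalGeneration,
    LlavaNextVideoProcessor,
)

model_id = "llava-hf/LLaVA-NeXT-Video-7B-hf"
processor = LlavaNextVideoProcessor.from_pretrained(model_id)
model = LlavaNextVideoForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Eight dummy RGB frames; replace with frames sampled from a real clip.
frames = np.random.randint(0, 255, (8, 336, 336, 3), dtype=np.uint8)

messages = [{
    "role": "user",
    "content": [
        {"type": "video"},
        {"type": "text", "text": "What is happening in this video?"},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, videos=frames, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(out[0], skip_special_tokens=True))
```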
Isotr0py
This model is based on the Transformers library; its specific purpose and functionality have not yet been documented.
YuchengShi
A multimodal foundation model fine-tuned from LLaVA-Med v1.5 (Mistral-7B), optimized for analyzing chest X-ray images and detecting pneumonia.
A multimodal foundation model fine-tuned from LLaVA-1.5-7B, optimized for detecting and interpreting plant leaf diseases.
X-iZhang
LLaVA-Med is an open-source large vision-language model optimized for biomedical applications, built on the LLaVA framework, enhanced through curriculum learning, and fine-tuned for open-ended biomedical question answering tasks.
zhibinlan
LLaVE-7B is a 7-billion-parameter multimodal embedding model based on LLaVA-OneVision-7B, capable of embedding representations for text, images, multiple images, and videos.
LLaVE-0.5B is a 0.5-billion-parameter multimodal embedding model based on LLaVA-OneVision-0.5B, likewise capable of embedding text, images, multiple images, and videos.
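As a rough sketch of how embedding models like these are typically used for retrieval, the snippet below ranks candidates by cosine similarity in the shared embedding space; the placeholder embeddings and the embedding dimension are hypothetical, not LLaVE's actual API.

```python
# Hedged sketch: ranking image embeddings against a text-query embedding by
# cosine similarity. Random tensors stand in for the outputs of a multimodal
# embedding model such as LLaVE.
import torch
import torch.nn.functional as F

def rank_by_similarity(query_emb: torch.Tensor, candidate_embs: torch.Tensor):
    """Return candidate indices sorted by cosine similarity to the query."""
    query_emb = F.normalize(query_emb, dim=-1)            # (d,)
    candidate_embs = F.normalize(candidate_embs, dim=-1)  # (n, d)
    scores = candidate_embs @ query_emb                   # (n,)
    return torch.argsort(scores, descending=True), scores

# Hypothetical usage (dimension is illustrative):
query_emb = torch.randn(3584)        # e.g. the embedding of "a dog playing in snow"
image_embs = torch.randn(100, 3584)  # e.g. embeddings of 100 candidate images
order, scores = rank_by_similarity(query_emb, image_embs)
print(order[:5], scores[order[:5]])
```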