Qwen3.5-Omni Launches: 215 SOTA Results Mark the Beginning of the All-Senses AI Era
Tongyi Lab has released the multimodal large model Qwen3.5-Omni, which achieves breakthroughs in understanding, interaction, and task execution, advancing AI from a 'screen assistant' toward an intelligent agent that understands the physical world. The model adopts a 'native multimodal' architecture that processes text, image, audio, and video inputs seamlessly, and it performs strongly on audio-video analysis, reasoning, dialogue, and translation benchmarks.