GigaChat3-10B-A1.8B is a dialogue model in the GigaChat series built on a Mixture-of-Experts (MoE) architecture, with 10 billion total parameters, of which 1.8 billion are active per token. The model uses Multi-head Latent Attention (MLA) and Multi-Token Prediction (MTP), supports a long context of up to 256,000 tokens, and performs strongly on multilingual dialogue and reasoning tasks.
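The "10B total / 1.8B active" split comes from MoE routing: each token is processed by only a few experts, so most parameters sit idle on any given forward pass. The toy sketch below illustrates the principle with top-k routing; the expert count, top-k value, and layer sizes are made-up illustrative numbers, not GigaChat3's actual configuration.

```python
# Toy sketch of Mixture-of-Experts top-k routing (illustrative only;
# expert count, top_k, and sizes are hypothetical, not GigaChat3's config).
import numpy as np

rng = np.random.default_rng(0)

n_experts = 8   # hypothetical number of experts
top_k = 2       # experts activated per token
d_model = 16    # toy hidden size
d_ff = 32       # toy expert feed-forward size

# Each expert is a tiny two-layer MLP: d_model -> d_ff -> d_model.
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(n_experts)
]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route one token vector x through its top-k experts only."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]              # indices of top-k experts
    w = np.exp(logits[chosen])
    w /= w.sum()                                      # softmax over chosen experts
    out = np.zeros_like(x)
    for weight, i in zip(w, chosen):
        w1, w2 = experts[i]
        out += weight * (np.maximum(x @ w1, 0.0) @ w2)  # ReLU MLP expert
    return out, chosen

x = rng.standard_normal(d_model)
y, chosen = moe_forward(x)

# Only top_k of n_experts run per token, so active expert parameters are a
# fraction of the total -- the same principle behind 1.8B active of 10B total.
params_per_expert = d_model * d_ff + d_ff * d_model
total_params = n_experts * params_per_expert
active_params = top_k * params_per_expert
print(f"active/total expert params: {active_params}/{total_params}")
```

Per-token compute therefore scales with the number of activated experts rather than the full parameter count, which is why a 10B-parameter MoE model can have inference cost closer to a 1.8B dense model.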
Tags: Natural Language Processing · Safetensors · Multiple Languages