On January 29, 2026, SenseTime officially announced the open-source release of its multimodal autonomous reasoning model SenseNova-MARS in two versions, 8B and 32B. The launch marks a critical step forward for multimodal large models capable of autonomous reasoning.
Technical Breakthrough: The First Agentic VLM
SenseNova-MARS introduces a significant architectural innovation: it is, per the announcement, the industry's first Agentic VLM (agentic vision-language model), integrating dynamic visual reasoning with text and image search.
Autonomous Reasoning: Beyond understanding image content, the model can plan and reason on its own, much like an agent.
Deep Integration: By weaving real-time search into the visual understanding process, the model can handle complex visual tasks that require external knowledge; a minimal sketch of such a tool-use loop follows this list.
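To make the idea concrete, here is a minimal Python sketch of an agentic tool-use loop of this general shape. Everything in it, including the tool names, the Step structure, and the hard-coded decision policy, is an illustrative assumption rather than SenseNova-MARS's actual interface: the model proposes an action (search or answer), observes the result, and keeps reasoning until it can answer.

```python
# Minimal sketch of an agentic VLM tool-use loop (illustrative only;
# tool names, message format, and policy are assumptions, not the
# SenseNova-MARS API).

from dataclasses import dataclass


@dataclass
class Step:
    thought: str   # the model's intermediate reasoning
    action: str    # "search_text", "search_image", or "answer"
    argument: str  # search query, or the final answer text


def search_text(query: str) -> str:
    """Hypothetical text-search tool; a real agent would call a web API."""
    return f"[text results for: {query}]"


def search_image(query: str) -> str:
    """Hypothetical image-search tool returning candidate captions."""
    return f"[image results for: {query}]"


def vlm_policy(image: str, question: str, history: list[str]) -> Step:
    """Stand-in for the model: picks the next action from context.
    Here a two-step trajectory is hard-coded for demonstration."""
    if not history:
        return Step("I need external facts about this landmark.",
                    "search_text", f"history of landmark in {image}")
    return Step("The retrieved facts answer the question.",
                "answer", "It was built in 1889; it is the Eiffel Tower.")


def run_agent(image: str, question: str, max_steps: int = 5) -> str:
    """Interleave model decisions with tool calls until an answer."""
    tools = {"search_text": search_text, "search_image": search_image}
    history: list[str] = []
    for _ in range(max_steps):
        step = vlm_policy(image, question, history)
        if step.action == "answer":
            return step.argument
        observation = tools[step.action](step.argument)
        history.append(f"{step.action}({step.argument}) -> {observation}")
    return "No answer within step budget."


if __name__ == "__main__":
    print(run_agent("photo_of_tower.jpg", "When was this built?"))
```

The key design point the sketch illustrates is that search results are fed back into the model's context as observations, so retrieval happens inside the reasoning loop rather than as a one-shot preprocessing step.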
Industry Impact and Significance
By open-sourcing both versions, SenseTime aims to give developers worldwide more flexible research tools:
8B Version: Balances performance and efficiency, suitable for deployment on edge devices or in environments with limited computing power.
32B Version: Offers stronger logical reasoning capabilities, meeting the needs of complex industry applications. A hedged sketch of loading either checkpoint follows this list.
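Since the announcement does not specify the distribution channel, the snippet below is only a plausible sketch: the Hugging Face repo id SenseTime/SenseNova-MARS-8B and the trust_remote_code loading path are assumptions, modeled on how comparable open-source VLM releases are typically shipped.

```python
# Hypothetical loading snippet: the repo id and loading path below are
# assumptions for illustration; check the official release for the
# actual ids and inference API.

import torch
from transformers import AutoModel, AutoTokenizer

model_id = "SenseTime/SenseNova-MARS-8B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half precision to fit smaller GPUs
    device_map="auto",            # shard layers across available devices
    trust_remote_code=True,       # custom VLM code would ship with the repo
).eval()
```

Under these assumptions, swapping the repo id for the 32B checkpoint would be the only change for the larger model; device_map="auto" lets transformers spread it across multiple GPUs when a single card is not enough.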