Breaking the Bottleneck of 3D Reconstruction: SuperDec Empowers Robotics and Content Generation

AIbase

Published in AI News · 5 min read · Jun 25, 2025

A research team from ETH Zurich, Stanford University, and Microsoft recently introduced SuperDec, a method that builds compact yet expressive 3D scene representations out of superquadric primitives. The approach decomposes the individual objects in a 3D scene into a small set of parametric shapes, and the resulting representation can be used in robotics and controllable visual content generation.

How SuperDec Works

The core idea of SuperDec is to use superquadrics as geometric primitives and to fit them locally, one object at a time. Instance segmentation splits a scene into objects, which lets the method scale to entire 3D scenes. The research team designed a new architecture that decomposes the point cloud of an arbitrary object into a compact set of superquadrics. The model was trained on the ShapeNet dataset, and its generalization was validated on the ScanNet++ dataset and on full Replica scenes.
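The primitives SuperDec works with are superquadrics. As a minimal sketch of what such a primitive is, the standard superquadric inside-outside function can be written as below. The function name, the argument layout, and the choice to ignore rotation and translation (the point is assumed to be already expressed in the primitive's own frame) are illustrative assumptions, not code from the SuperDec project.

```python
def superquadric_inside_outside(point, scale, exponents):
    """Standard superquadric inside-outside function in the primitive's frame.

    Returns a value below 1 for points inside the shape, exactly 1 on the
    surface, and above 1 outside.
    """
    x, y, z = point
    a1, a2, a3 = scale        # half-extents along the three axes
    e1, e2 = exponents        # shape exponents: 1 gives an ellipsoid, near 0 a box
    xy = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    return xy + abs(z / a3) ** (2.0 / e1)


unit = (1.0, 1.0, 1.0)
corner = (0.9, 0.9, 0.9)

# A point at distance 1 along x lies on the unit sphere (both exponents equal 1).
on_sphere = superquadric_inside_outside((1.0, 0.0, 0.0), unit, (1.0, 1.0))
# The same corner point is outside the sphere but inside a box-like shape.
sphere_value = superquadric_inside_outside(corner, unit, (1.0, 1.0))
box_value = superquadric_inside_outside(corner, unit, (0.1, 0.1))
```

Two exponents and three scales are enough to move between spheres, ellipsoids, cylinders, and rounded boxes, which is why a handful of these primitives can describe an object compactly.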

Given an object point cloud with N points, a transformer-based neural network predicts the parameters of P superquadrics together with a soft segmentation matrix that assigns each point to the superquadrics. These predictions serve as the initialization for a subsequent Levenberg-Marquardt optimization, which further refines the superquadric parameters.
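The two stages described above can be sketched on toy data. Everything here is an illustrative assumption rather than the authors' implementation: the network is replaced by a softmax over distances to known centers, only the three scale parameters are refined (exponents and pose are held fixed), and the Levenberg-Marquardt loop is a small hand-written version that optimizes log-scales so the scales stay positive.

```python
import numpy as np

def residuals(scale, points, e1=1.0, e2=1.0):
    """Inside-outside value minus 1 per point; zero for points on the surface."""
    a1, a2, a3 = scale
    x, y, z = np.abs(points).T
    xy = ((x / a1) ** (2 / e2) + (y / a2) ** (2 / e2)) ** (e2 / e1)
    return xy + (z / a3) ** (2 / e1) - 1.0

def levenberg_marquardt(f, params, iters=100, lam=1e-2, eps=1e-6):
    """Damped Gauss-Newton with a forward-difference Jacobian."""
    params = np.asarray(params, dtype=float)
    cost = np.sum(f(params) ** 2)
    for _ in range(iters):
        r = f(params)
        J = np.stack([(f(params + eps * d) - r) / eps
                      for d in np.eye(params.size)], axis=1)
        A = J.T @ J
        step = np.linalg.solve(A + lam * np.diag(np.diag(A)), -J.T @ r)
        new_cost = np.sum(f(params + step) ** 2)
        if new_cost < cost:      # accept the step and reduce the damping
            params, cost, lam = params + step, new_cost, lam * 0.5
        else:                    # reject the step and damp harder
            lam *= 4.0
    return params

rng = np.random.default_rng(0)
true_scales = np.array([[2.0, 1.0, 0.5], [1.0, 1.5, 0.8]])
centers = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])  # assumed known here
P, n_per = len(centers), 200

# Toy object: surface points of P ellipsoids (unit directions stretched by scale).
dirs = rng.normal(size=(P, n_per, 3))
dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
points = (dirs * true_scales[:, None, :] + centers[:, None, :]).reshape(-1, 3)

# Stand-in for the network output: an N x P soft segmentation matrix whose
# rows sum to 1, built here from a softmax over negative center distances.
logits = -np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
W = np.exp(logits - logits.max(axis=1, keepdims=True))
W /= W.sum(axis=1, keepdims=True)
labels = W.argmax(axis=1)        # hard assignment of each point to a primitive

# Refine each primitive's scale from a rough initial guess on its own points.
init_log_scale = np.log([1.2, 1.2, 1.2])
refined = np.array([
    np.exp(levenberg_marquardt(
        lambda t, pts=points[labels == j] - centers[j]: residuals(np.exp(t), pts),
        init_log_scale))
    for j in range(P)
])
```

On this noise-free toy data `refined` should end up close to `true_scales`. The point of the two-stage design is that the optimizer only has to polish a prediction that is already roughly right, instead of searching from scratch.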

Experimental Results and Performance Evaluation

The research team evaluated SuperDec at both the object level and the scene level. At the object level, SuperDec showed strong decomposition performance on the ShapeNet dataset. In-category and out-of-category experiments measured the model's accuracy and its generalization ability, and the results indicate that SuperDec decomposes objects well across different categories.

At the scene level, SuperDec scales to full 3D scenes without any additional fine-tuning. Using object instance masks extracted by Mask3D, the team computed and visualized superquadric representations for multiple scenes of the Replica dataset, demonstrating that the method applies to real environments.
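The scene-level step amounts to splitting the scene by instance mask and decomposing each object independently. A minimal sketch follows, assuming the masks are available as one integer instance ID per point. The function names are hypothetical, and `bounding_box` is only a placeholder standing in for the per-object superquadric decomposition.

```python
from collections import defaultdict

def split_by_instance(points, instance_ids):
    """Group scene points by the instance ID assigned to each point."""
    objects = defaultdict(list)
    for point, instance_id in zip(points, instance_ids):
        objects[instance_id].append(point)
    return dict(objects)

def bounding_box(points):
    """Placeholder per-object decomposition: axis-aligned min and max corners."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

def decompose_scene(points, instance_ids, decompose_object=bounding_box):
    """Run the per-object decomposition on every instance in the scene."""
    groups = split_by_instance(points, instance_ids)
    return {i: decompose_object(pts) for i, pts in groups.items()}

scene_points = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.2), (5.0, 5.0, 0.0), (6.0, 5.5, 1.0)]
instance_ids = [0, 0, 1, 1]   # e.g. labels derived from instance masks
scene = decompose_scene(scene_points, instance_ids)
```

Because each object is handled on its own, a model trained on single objects can be applied to a whole scene without retraining, which matches the no-fine-tuning claim above.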

Broad Application Prospects

SuperDec has a wide range of potential applications, especially in robotics and controllable content generation. The research team verified its use in path planning and object grasping through real-world experiments. After a real 3D scene is scanned, SuperDec computes the superquadric representation of its objects, and that representation is used to plan grasping paths for the robot.

SuperDec can also be combined with text-to-image diffusion models to control both spatial layout and semantics. The research team showed how ControlNet can generate images conditioned on specific depth information, producing diverse room styles while keeping the geometric and semantic structure unchanged.

SuperDec marks an important step forward in 3D scene decomposition. Its compact superquadric representation makes 3D reconstruction more efficient and opens new paths for robotic applications and content generation. With further research, SuperDec is expected to play an important role in multiple fields.

Project page: https://super-dec.github.io/
