Tsinghua IDEA Team Unveils GUAVA! Generate 3D Avatars from a Single Photo in 0.1 Seconds, Technological Breakthrough Shocking the World

AIbase基地

Published in AI News · 8 min read · Aug 22, 2025

A major breakthrough in 3D avatar generation is unfolding before our eyes. While much of the industry is still struggling with complex multi-view modeling and long training times, a joint research team from Tsinghua University and the Guangdong-Hong Kong-Macao Greater Bay Area Digital Economy Research Institute (IDEA) has quietly rewritten the rules. Their latest framework, GUAVA, pushes 3D avatar generation to a new level of speed and accuracy, and has been accepted to ICCV 2025, a top conference in computer vision.

The revolutionary nature of this technology lies in its efficiency. Using only a single ordinary photo, the GUAVA framework can generate a complete 3D upper-body avatar in just 0.1 seconds. The number may seem unremarkable at first glance, but for professionals familiar with traditional 3D modeling pipelines it is a major advance. Traditional 3D avatar generation methods require video footage captured from multiple angles as well as a separate model trained for each individual, a process that often takes several hours or even days.

GUAVA removes these barriers. Users need only provide a clear photo, and the system generates a detailed, realistic 3D avatar model in real time. This gain in convenience is not just a quantitative change but a qualitative leap, turning 3D avatar technology from a tool reserved for professional studios into an everyday application that ordinary users can pick up easily.

The core of the innovation is the new 3D Gaussian model introduced in the GUAVA framework, which changes how 3D avatars are generated and gives the resulting virtual characters a high level of expressiveness and detail. By combining this model with EHM (Expressive Human Model), GUAVA captures subtle facial expressions and reproduces complex hand gestures while keeping reconstruction fast.

Extensive comparative experiments conducted by the research team demonstrate GUAVA's advantages. In both final rendering quality and processing efficiency, GUAVA significantly surpasses the mainstream 2D and 3D avatar generation methods it was compared against. More strikingly, the framework renders at approximately 50 frames per second, far above the few frames per second typical of comparable methods, laying a solid technical foundation for real-time interactive applications.
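To put the two reported figures side by side, here is a minimal arithmetic sketch in Python. The 50 frames-per-second and 0.1-second numbers come from the article; the 30 fps "real-time" threshold and all function and constant names are illustrative assumptions, not part of GUAVA or its published code.

```python
def frame_time_ms(fps: float) -> float:
    """Return the time budget per frame, in milliseconds, for a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Figures reported in the article (both approximate).
GUAVA_RENDER_FPS = 50.0      # rendering speed once the avatar exists
GUAVA_RECON_SECONDS = 0.1    # one-off reconstruction from a single photo

# Assumption for illustration only: a common rule-of-thumb bar for "real time".
REALTIME_FPS_THRESHOLD = 30.0

guava_ms = frame_time_ms(GUAVA_RENDER_FPS)
threshold_ms = frame_time_ms(REALTIME_FPS_THRESHOLD)

# How many rendered frames fit into the time one reconstruction takes.
frames_per_reconstruction = GUAVA_RECON_SECONDS * GUAVA_RENDER_FPS

print(f"Per-frame time at {GUAVA_RENDER_FPS:.0f} fps: {guava_ms:.1f} ms")
print(f"Per-frame budget at {REALTIME_FPS_THRESHOLD:.0f} fps: {threshold_ms:.1f} ms")
print(f"Within the assumed real-time budget: {guava_ms <= threshold_ms}")
print(f"Reconstruction time expressed in rendered frames: {frames_per_reconstruction:.0f}")
```

At roughly 50 frames per second, each frame takes about 20 milliseconds, so the one-off 0.1-second reconstruction costs about as much time as five rendered frames.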

GUAVA has broad application prospects, covering almost any digital scenario that calls for a virtual character. In film production, directors can quickly create digital doubles for actors, greatly shortening post-production time. Game developers can offer players more personalized character customization, giving them a unique avatar in the virtual world from just a selfie. In increasingly common remote work settings, meeting participants can communicate through more vivid and realistic 3D avatars, making virtual meetings more effective and engaging.

Even more commendable, the research team has chosen to open-source the complete GUAVA code for the global developer community. This reflects the open attitude of academic research and gives developers and researchers around the world a valuable foundation to build on. Technology enthusiasts can now develop their own extensions and applications on top of the framework, which is likely to lead to further technical advances and commercial uses.

The success of the GUAVA project is not just a technological breakthrough but also a demonstration of Tsinghua University's research depth in artificial intelligence and computer graphics. The project integrates recent advances from several cutting-edge areas, including deep learning, computer vision, and 3D modeling, and reflects a high level of interdisciplinary collaboration.

As the digital economy matures, virtual character technology has evolved from a science fiction concept into a practical need. From virtual anchors on social media to AI customer service on e-commerce platforms, and from virtual teachers in online education to personalized characters in games, the application scenarios for 3D avatar technology are growing explosively. GUAVA arrives at the right time, providing technical support for these scenarios and setting a new benchmark for the industry in performance and ease of use.

The emergence of GUAVA marks the beginning of a new stage in 3D avatar generation technology. It gives researchers a powerful tool, offers developers a wide range of possibilities, and opens a convenient door for ordinary users to enter the virtual world. At this critical moment of technological change, GUAVA's performance is showing the world that the future has already arrived.

Project Address: https://eastbeanzhang.github.io/GUAVA/

