MoMask: A Text-Driven 3D Human Motion Generation Model

站长之家

Published inAI News · 1 min read · Dec 6, 2023

363

MoMask is an innovative 3D human motion generation model that utilizes a hierarchical quantization scheme to represent actions, consisting of a base layer and progressively added residual markers. The model architecture introduces Masked Transformer and Residual Transformer to predict the mask action markers for the base layer and gradually predict higher-level markers. In text-to-motion generation tasks, MoMask demonstrates exceptional performance, achieving an FID of 0.045 on the HumanML3D dataset, significantly surpassing other models.

MoMask 3D Human Motion Generation Text-Driven

This article is from AIbase Daily

Welcome to the [AI Daily] column! This is your daily guide to exploring the world of artificial intelligence. Every day, we present you with hot topics in the AI field, focusing on developers, helping you understand technical trends, and learning about innovative AI product applications.

—— Created by the AIbase Daily Team

AI News Recommendations

AI Daily: Tencent Yuanbao Upgrades for One-Phrase Image and Video Search; WeChat Pay MCP Launches; Google Unveils Veo 3 Globally

Welcome to the [AI Daily] column! This is your guide to exploring the world of artificial intelligence every day. Each day, we present you with the latest content in the AI field, focusing on developers to help you understand technical trends and innovative AI product applications. Click to learn more about new AI products: https://top.aibase.com/1. Tencent Yuanbao upgrades again: one phrase search, images and videos appear instantly, making information retrieval more intuitive! The upgraded features of Tencent Yuanbao make information retrieval more intuitive and efficient. Users just need to ask a question in one phrase to get text and image results.

Jul 4, 2025

Google Launches New Veo 3 Video Generation Model Globally

Google announced the global launch of its latest video generation model, Veo3. This long-anticipated release has generated great excitement among users, as Veo3 is now available to Gemini users in over 159 countries, offering a new video creation experience. The key feature of the Veo3 video generation model is its ability to generate videos up to eight seconds long based on simple text prompts. According to Google, this technology is designed for creative users, especially those on social media who increasingly demand short-form content.

Jul 4, 2025

260

DeepMind introduces Crome: Enhancing the Alignment of Large Language Models with Human Feedback

In the field of artificial intelligence, reward models are a critical component for aligning large language models (LLMs) with human feedback, but existing models face the issue of "reward hacking." These models often focus on superficial features, such as the length or format of responses, rather than identifying genuine quality metrics, such as factual accuracy and relevance. The root cause lies in standard training objectives failing to distinguish between spurious associations and true causal drivers present in the training data. This failure leads to fragile reward models (RMs), which generate misaligned policies.

Jul 4, 2025

220

Google Veo 3 Video Generation Model Now Available to Pro/Ultra Subscribers, Will Add Photo-to-Video Function

Jul 4, 2025

250

Chip Design Company Ambiq Micro Applies for U.S. IPO, Benefiting from Market Demand Driven by Generative AI

Jul 4, 2025

190

Shortcut Makes Its Debut! AI Excel Assistant Surpasses Human Champions by 10 Times, Task Automation Efficiency Soars

Recently, an AI Excel assistant called Shortcut has sparked heated discussions on social media. It enables users to effortlessly complete Excel tasks without writing complex formulas or VBA code through natural language processing (NLP) technology. The AIbase editorial team has compiled the latest information from social media to provide an in-depth analysis of Shortcut's powerful features and its potential impact on the fields of data processing and financial modeling. Shortcut: An Excel Revolution Driven by Natural Language

Jul 3, 2025

8.6k

A Daily: Bilibili Upgrades Anime Video Generation Model AniSora V3; ByteDance Open Sources 4D Video Generation Framework EX-4D; DeepSWE Open Sources AI Agent System Rises to the Top

Jul 3, 2025

190

ByteDance Open Sources New Model VINCIE-3B: 300 Million Parameters Support Continuous Image Editing with Context

Jul 3, 2025

670

Bilibili Open-Sourced Anime Video Generation Model AniSora V3 Version - One-Click Generation of Various Style Anime Video Shots

Jul 3, 2025

530

Byte EX-4D Technology Achieves Monocular Video 4D Conversion, Unlocking High-Quality Content Generation Under Extreme Perspectives

The EX-4D (Extreme Viewpoint 4D Video Generation) technology, developed by the research team tau-yihouxiang, is a groundbreaking innovation in video generation that is gaining widespread attention globally. This technology aims to transform monocular videos into controllable 4D experiences, particularly demonstrating excellent performance under extreme camera angles. The core of the EX-4D technology lies in its unique 'depth watertight mesh' construction method. This novel geometric representation

Jul 3, 2025

140

AI News

AI Daily

AI Timeline

Al Hardware

Latest Cases

Image Collection

Video Collection

Audio Collection

Content Collection

Latest Tutorials

AI Product Ranking

AI Traffic Growth Ranking

AI Traffic Decline Ranking

AI Weekly Ranking

United States

China

India

Brazil

Image Generation

Personal Assistant

Character Generation

Video Generation

AI Project Ranking

AI Project Growth Ranking

AI Developer Ranking

AI Organization Ranking

Deepseek

TTS

LLM

ChatGPT

Overview

MoMask: A Text-Driven 3D Human Motion Generation Model

站长之家

This article is from AIbase Daily

AI News Recommendations

AI Daily: Tencent Yuanbao Upgrades for One-Phrase Image and Video Search; WeChat Pay MCP Launches; Google Unveils Veo 3 Globally

Google Launches New Veo 3 Video Generation Model Globally

DeepMind introduces Crome: Enhancing the Alignment of Large Language Models with Human Feedback

Google Veo 3 Video Generation Model Now Available to Pro/Ultra Subscribers, Will Add Photo-to-Video Function

Chip Design Company Ambiq Micro Applies for U.S. IPO, Benefiting from Market Demand Driven by Generative AI

Shortcut Makes Its Debut! AI Excel Assistant Surpasses Human Champions by 10 Times, Task Automation Efficiency Soars

A Daily: Bilibili Upgrades Anime Video Generation Model AniSora V3; ByteDance Open Sources 4D Video Generation Framework EX-4D; DeepSWE Open Sources AI Agent System Rises to the Top

ByteDance Open Sources New Model VINCIE-3B: 300 Million Parameters Support Continuous Image Editing with Context

Bilibili Open-Sourced Anime Video Generation Model AniSora V3 Version - One-Click Generation of Various Style Anime Video Shots

Byte EX-4D Technology Achieves Monocular Video 4D Conversion, Unlocking High-Quality Content Generation Under Extreme Perspectives