AIBase


Wuhan University Collaborates with China Mobile's Jiutian AI Team and Duke Kunshan University to Release Open-Source Audio-Visual Speaker Recognition Dataset VoxBlink2

Wuhan University, together with China Mobile's Jiutian AI team and Duke Kunshan University, has released VoxBlink2, an open-source audio-visual speaker recognition dataset built from YouTube data. It contains more than 110,000 hours of recordings: 9,904,382 high-quality audio clips with their corresponding video segments, drawn from 111,284 YouTube users. This makes it the largest publicly available audio-visual speaker recognition dataset to date. The release aims to enrich open-source speech corpora and support the training of large voiceprint models.

© 2025 AIBase