Kuaishou's Kling 2.6 introduces two major features, voice control and motion control, bringing native audio generation and more precise handling of complex motion. Voice control can generate sound effects, vocal tracks, and music that match the video, and it supports personalized voice customization.
Welcome to the [AI Daily] section! This is your daily guide to the world of artificial intelligence. Every day we bring you the latest developments in the AI field, with a focus on developers, to help you follow technical trends and innovative AI product applications. Learn more about new AI products: https://app.aibase.com/zh
1. Douyin launches Seedance 1.5 Pro, which can directly generate videos with audio. ByteDance's new audio-video creation model, Seedance 1.5 Pro, has officially launched on Douyin, providing ordinary users
QQ Music has launched an "AI Song Composition" feature, achieving fully local AI music creation for the first time. No internet connection is needed: on an AI PC equipped with a Core Ultra processor, users can generate a complete original song in just a few minutes. The process is simple: enter keywords to create music, such as the recent hit song "Da Dongbei".
OpenAI upgrades ChatGPT with an "App Directory" that integrates third-party tools, letting users access services directly within the chat interface. It is also opening up a Developer SDK so that external teams can build deeply integrated experiences, a step towards a mature AI platform.
Use advanced AI to generate royalty-free music, songs, and vocals without the need for musical skills.
A comprehensive AI music and music video generator that quickly turns creative ideas into professional works.
Create Christmas videos with AI in just a few minutes. Select a template and click generate to get a video clip with music.
Free AI music generator that can convert text into 8-minute professional music without copyright risks
nvidia
Audio Flamingo 3 is an advanced, fully open-source large audio language model that can enhance the reasoning and understanding abilities of speech, sound, and music.
ACE-Step
A hybrid rap vocal model focused on improving the generation quality of Chinese rap/hip-hop music
calcuis
ACE-Step-v1-3.5B is a text-to-audio model that supports high-quality audio generation, suitable for music and sound effects creation.
walterheart
Bark is a Transformer-based text-to-audio model created by Suno, capable of generating highly realistic multilingual speech, music, background noise, and sound effects.
sicto
The SICTO Vocal Separator is a high-quality vocal separation model built with PyTorch and designed to extract clean vocal parts from music audio. It is trained on the musdb18hq dataset and aims to deliver professional-level vocal separation for music production and audio editing.
dallinmackay
Stable Diffusion model fine-tuned with screenshots from the movie 'Cats (2019)', capable of transforming characters into the style of the musical 'Cats'
HKUSTAudio
AudioX is a unified diffusion transformer model capable of generating audio and music from arbitrary content. It produces high-quality general audio and musical compositions, offers flexible natural language control, and seamlessly handles multimodal inputs.
m-a-p
YuE is a series of open-source foundational models designed for music generation, particularly for converting lyrics into complete songs (lyrics2song).
awsaf49
Advanced model for end-to-end synthetic song detection, capable of identifying AI-generated complete songs (including vocals, music, lyrics, and style)
facebook
Unified automatic quality assessment model for speech, music, and sound
Gyaneshere
An audio classification model based on DistilHuBERT, fine-tuned on the GTZAN music genre classification dataset, with an accuracy of 84%.
Alissonerdx
YuE is a groundbreaking open-source foundational model series specifically designed for music generation, particularly for converting lyrics into complete songs (lyrics2song).
sugarblock
Music genre classification model fine-tuned on the GTZAN dataset with 93% accuracy
wkCircle
An audio classification model based on the Audio Spectrogram Transformer (AST) architecture. It was pre-trained on the AudioSet dataset and then fine-tuned on the GTZAN music genre classification dataset.
duysal
An audio classification model based on the DistilHuBERT architecture, fine-tuned on the GTZAN dataset for music genre classification tasks.
Doctor-Shotgun
An ExLlamaV2 quantization of the m-a-p/YuE-s1-7B-anneal-en-cot model, suitable for text generation tasks, particularly in music-related fields.
Felguk
This model is used to classify audio clips as either 'Suno' music or 'People' music.
YuE is a series of open-source foundational models specifically designed for music generation, particularly for converting lyrics into complete songs (lyrics2song).
FunAudioLLM
InspireMusic is a unified framework focused on music generation, song generation, and audio generation, combining audio tokenization with autoregressive transformers and flow-matching models to support high-quality long-form audio generation.
InspireMusic is a unified framework focused on music generation, song generation, and audio generation, integrating autoregressive transformers with flow-matching models through audio tokenization technology, supporting high-quality long audio generation.
AbletonMCP is an integration tool that connects Ableton Live and Claude AI. It enables two-way communication via the Model Context Protocol (MCP), allowing AI to directly control and operate Ableton Live for music creation and production.
An open-source short-video automatic generation tool that integrates text-to-speech, automatic subtitles, background videos, and music to create professional short videos from simple text input.
An MCP server for controlling YouTube music playback
The MCP service of Bangumi TV provides access to the BangumiTV API, supporting queries for information on entries such as anime, manga, music, games, and related character and personnel data.
ArtistLens is a powerful MCP server that provides access to the Spotify Web API, supporting functions such as music search, artist information retrieval, and playlist management.
An Apple Music API interaction server based on the MCP protocol, providing song search and playback link generation functions.
An MCP server that integrates the Spotify API, supporting playlist management, music search, and recommendation retrieval through Claude
TIDAL MCP is a personalized music recommendation system that uses an LLM together with user-defined criteria to filter the TIDAL music library, supporting intelligent recommendations and playlist management based on playback history and playlists.
An integration project that enables Claude Desktop to interact with Spotify via the Model Context Protocol (MCP), providing functions such as music search, playback control, and playlist management.
The MCP service of Bangumi TV provides access to the BangumiTV API, supporting queries for information on entries such as anime, manga, music, and games, including entry details, characters, personnel, and related data retrieval functions.
An MCP server based on ableton-js for real-time interaction and control of Ableton Live, assisting music producers in music creation.
A production-ready MCP server that enables AI-driven music generation through Strudel.cc, providing complete browser automation control, real-time audio analysis, and pattern generation functions.
A Python-based MCP server project that can collaborate with AI assistants such as Claude to generate local music playlists in .m3u format based on the user's mood or theme and save them to the specified directory.
The official MCP server of MusicMCP.AI lets AI assistants (such as Claude) call an advanced AI music generation platform through natural language instructions. It supports song generation in inspiration mode and custom mode, and provides balance query and health check functions.
An audio analysis tool based on MCP and librosa, supporting the analysis of local files, YouTube links, and audio links.
A QQ Music search test server based on the MCP protocol
An MCP server based on YouTube Music that allows you to search for and play music through an AI assistant.
A service based on the Model Context Protocol (MCP) that allows large language models to search, download, and play YouTube music.
The Navidrome MCP server is an AI music assistant that enables intelligent playlist creation, music discovery, and library management through natural language interaction. It supports integration with AI assistants like Claude and ChatGPT.
The Mureka Model Context Protocol (MCP) server is an official server for interacting with Mureka's APIs for generating lyrics, songs, and background music. It allows MCP clients (such as Claude Desktop and OpenAI Agents) to generate lyrics, songs, and instrumental background music.
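One entry above describes a Python-based MCP server that saves mood- or theme-based playlists as local .m3u files. As a rough illustration of what that output step might look like, here is a minimal sketch that filters a small in-memory library by mood and writes an extended M3U file. It is not that project's code: the `LIBRARY` data, the mood tags, the file paths, and the function names `build_m3u` and `save_mood_playlist` are all hypothetical, and the MCP layer itself is left out.

```python
from pathlib import Path

# Hypothetical mood-tagged library; titles, paths, and tags are made up.
LIBRARY = [
    {"path": "music/rainy_day.mp3", "title": "Rainy Day",
     "artist": "Example Artist", "seconds": 215, "moods": {"calm", "sad"}},
    {"path": "music/sunrise_run.mp3", "title": "Sunrise Run",
     "artist": "Example Band", "seconds": 187, "moods": {"energetic"}},
    {"path": "music/late_tea.mp3", "title": "Late Tea",
     "artist": "Example Artist", "seconds": 242, "moods": {"calm"}},
]


def build_m3u(tracks):
    """Return extended M3U text: a header, then an #EXTINF line and a path per track."""
    lines = ["#EXTM3U"]
    for track in tracks:
        lines.append(
            f"#EXTINF:{track['seconds']},{track['artist']} - {track['title']}"
        )
        lines.append(track["path"])
    return "\n".join(lines) + "\n"


def save_mood_playlist(mood, out_dir, library=LIBRARY):
    """Write <mood>.m3u into out_dir with every track tagged with that mood."""
    tracks = [t for t in library if mood in t["moods"]]
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{mood}.m3u"
    target.write_text(build_m3u(tracks), encoding="utf-8")
    return target
```

Calling `save_mood_playlist("calm", "playlists")` would create `playlists/calm.m3u` listing the two tracks tagged "calm". In a real server the mood-to-track mapping would presumably come from the AI assistant's request rather than from hard-coded tags.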