Volcano Engine has launched Doubao Audio Generation Model 1.0, which accepts text or audio input and generates complete audio works end-to-end. Its core breakthrough is that a single prompt can generate dialogue, sound effects, and background music together, removing the need for traditional multi-track editing. Users can work like an 'audio director' and produce professional-quality audio without the post-production work of manually aligning and mixing tracks.
Volcano Engine launched Doubao Audio Generation Model 1.0, built on two core technologies, "Multimodal Reference Generation" and "Long-term Voice Consistency," that simplify traditional audio post-production. It offers one-stop generation of dialogue, sound effects, and background music, improving creative efficiency.
NetEase Cloud Music's "Miaoshi" AI emotional companion app will shut down on July 14. The company calls it a routine business adjustment. New registrations and top-ups have stopped. Users can request refunds for remaining tokens and memberships, and export AI chat records by August 14.
AI music video platform 'Liko MV' has released version 1.1, launching web and iPhone versions. The key upgrade is an AI video generation module that produces dynamic video directly, replacing the traditional 'image slideshow' approach. This makes videos more expressive and flexible and lowers the barrier to MV creation.
Create amazing AI videos, images, and music, free to use right away!
Free AI music generator. Create songs by inputting text. Get 50 free credits; no subscription required.
Free AI music generator that instantly transforms text prompts into royalty-free songs with vocals.
Free tools designed for independent musicians, covering the entire music release cycle, no registration required.
nvidia
Audio Flamingo 3 is an advanced, fully open-source large audio language model that improves reasoning and understanding across speech, sound, and music.
ACE-Step
A hybrid rap vocal model focused on improving the generation quality of Chinese rap/hip-hop music
calcuis
ACE-Step-v1-3.5B is a text-to-audio model that supports high-quality audio generation, suitable for music and sound effects creation.
walterheart
Bark is a Transformer-based text-to-audio model created by Suno, capable of generating highly realistic multilingual speech, music, background noise, and sound effects.
sicto
The SICTO Vocal Separator is a high-quality vocal separation model developed based on the PyTorch framework, specifically designed to extract clear vocal parts from music audio. This model is trained on the musdb18hq dataset and can provide professional-level vocal separation effects for music production and audio editing.
dallinmackay
Stable Diffusion model fine-tuned with screenshots from the movie 'Cats (2019)', capable of transforming characters into the style of the musical 'Cats'
HKUSTAudio
AudioX is a unified diffusion transformer model capable of generating audio and music from arbitrary content. It produces high-quality general audio and musical compositions, offers flexible natural language control, and seamlessly handles multimodal inputs.
m-a-p
YuE is a series of open-source foundational models designed for music generation, particularly for converting lyrics into complete songs (lyrics2song).
awsaf49
Advanced model for end-to-end synthetic song detection, capable of identifying AI-generated complete songs (including vocals, music, lyrics, and style)
facebook
Unified automatic quality assessment model for speech, music, and sound
Gyaneshere
An audio classification model based on DistilHuBERT and fine-tuned on the GTZAN music genre classification dataset, with an accuracy of 84%.
Alissonerdx
YuE is a groundbreaking open-source foundational model series specifically designed for music generation, particularly for converting lyrics into complete songs (lyrics2song).
sugarblock
Music genre classification model fine-tuned on the GTZAN dataset with 93% accuracy
wkCircle
This model is an audio classification model based on the Audio Spectrogram Transformer (AST) architecture. After pre-training on the Audioset dataset, it was fine-tuned on the GTZAN music genre classification dataset.
duysal
An audio classification model based on the DistilHuBERT architecture, fine-tuned on the GTZAN dataset for music genre classification tasks.
Doctor-Shotgun
An ExLlamaV2 quantization of the m-a-p/YuE-s1-7B-anneal-en-cot model, suitable for text generation tasks and particularly strong in music-related fields.
Felguk
This model is used to classify audio clips as either 'Suno' music or 'People' music.
YuE is a series of open-source foundational models specifically designed for music generation, particularly for converting lyrics into complete songs (lyrics2song).
FunAudioLLM
InspireMusic is a unified framework focused on music generation, song generation, and audio generation, combining audio tokenization with autoregressive transformers and flow-matching models to support high-quality long-form audio generation.
InspireMusic is a unified framework focused on music generation, song generation, and audio generation, integrating autoregressive transformers with flow-matching models through audio tokenization technology, supporting high-quality long audio generation.
AbletonMCP is an integration tool that connects Ableton Live and Claude AI. It enables two-way communication via the Model Context Protocol (MCP), allowing AI to directly control and operate Ableton Live for music creation and production.
An open-source automatic short-video generation tool that integrates text-to-speech, automatic subtitles, background videos, and music to create professional short videos from simple text input.
An MCP server for controlling YouTube music playback
The MCP service of Bangumi TV provides access to the BangumiTV API, supporting queries for information on entries such as anime, manga, music, games, and related character and personnel data.
An integration project that enables Claude Desktop to interact with Spotify via the Model Context Protocol (MCP), providing functions such as music search, playback control, and playlist management.
TIDAL MCP is a personalized music recommendation system that uses an LLM combined with user-defined conditions to filter the TIDAL music library, supporting intelligent recommendations and playlist management based on playback history and playlists.
An MCP server that integrates the Spotify API, supporting playlist management, music search, and recommendation retrieval through Claude
ArtistLens is a powerful MCP server that provides access to the Spotify Web API, supporting functions such as music search, artist information retrieval, and playlist management.
An Apple Music API interaction server based on the MCP protocol, providing song search and playback link generation functions.
An MCP server based on ableton-js for real-time interaction and control of Ableton Live, assisting music producers in music creation.
The MCP service of Bangumi TV provides access to the BangumiTV API, supporting queries for information on entries such as anime, manga, music, and games, including entry details, characters, personnel, and related data retrieval functions.
A production-ready MCP server that enables AI-driven music generation through Strudel.cc, providing complete browser automation control, real-time audio analysis, and pattern generation functions.
An audio analysis tool based on MCP and librosa, supporting the analysis of local files, YouTube links, and audio links.
The official MCP server of MusicMCP.AI allows AI assistants (such as Claude) to call an advanced AI music generation platform through natural language instructions, supports song generation in inspiration mode and custom mode, and provides balance query and health check functions.
A QQ Music search test server based on the MCP protocol
A Python-based MCP server project that can collaborate with AI assistants such as Claude to generate local music playlists in .m3u format based on the user's mood or theme and save them to the specified directory.
An MCP server based on YouTube Music that allows you to search for and play music through an AI assistant.
A service based on the Model Context Protocol (MCP) that allows large language models to search, download, and play YouTube music.
Mureka MCP is an official Model Context Protocol (MCP) server for interacting with Mureka's APIs. It allows MCP clients (such as Claude Desktop and OpenAI Agents) to generate lyrics, songs, and instrumental background music.
The Navidrome MCP server is an AI music assistant that enables intelligent playlist creation, music discovery, and library management through natural language interaction. It supports integration with AI assistants like Claude and ChatGPT.