Meta has launched SAM Audio, which it describes as the world's first unified multimodal audio separation model and a major breakthrough in audio processing. The model lets users "hear with their eyes," extracting any target sound from mixed video or audio in a single action: click on the guitarist in a video to instantly isolate a clean guitar track; type "dog bark" to filter dog noises out of an entire podcast; or select a time segment to precisely remove an interfering sound. Meta claims it is the first system to translate the ways humans naturally single out sounds (seeing, speaking, pointing, and selecting) into an AI interface.

At the core of SAM Audio is the Perception Encoder Audio-Visual (PE-AV), which Meta calls the model's "ear." This engine builds on the open-source Meta Perception Encoder, a computer vision model released this April, and is the first to integrate advanced visual understanding with audio signals, enabling cross-modal sound localization and separation.
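Meta's announcement does not detail PE-AV's internals. Purely as an illustration of the general idea behind cross-modal alignment (every class name, dimension, and architecture choice below is an assumption, not Meta's design), visual and audio features can be projected into a shared embedding space so that image regions and audio frames are matched by similarity:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalAligner(nn.Module):
    """Illustrative sketch only: projects visual and audio features into a
    shared space so a clicked region can be matched to the sound it emits.
    PE-AV's actual architecture is not described in the announcement."""

    def __init__(self, vis_dim=1024, aud_dim=512, embed_dim=256):
        super().__init__()
        self.vis_proj = nn.Linear(vis_dim, embed_dim)  # visual features -> shared space
        self.aud_proj = nn.Linear(aud_dim, embed_dim)  # audio features  -> shared space

    def forward(self, vis_feats, aud_feats):
        # vis_feats: (regions, vis_dim), aud_feats: (time_steps, aud_dim)
        v = F.normalize(self.vis_proj(vis_feats), dim=-1)
        a = F.normalize(self.aud_proj(aud_feats), dim=-1)
        # Cosine similarity between every image region and every audio frame;
        # a high score suggests "this region is making this sound."
        return v @ a.T  # (regions, time_steps)

# Toy usage: 16 candidate regions, 100 audio frames.
aligner = CrossModalAligner()
scores = aligner(torch.randn(16, 1024), torch.randn(100, 512))
print(scores.shape)  # torch.Size([16, 100])
```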

Specifically, SAM Audio supports three intuitive interaction methods, which can be used individually or in combination (a hypothetical usage sketch follows the list):

- Text prompts: Enter semantic descriptions such as "vocal singing" or "car horn," and the system automatically extracts the corresponding sound source;

- Visual prompts: Click on the sound source in the video (such as a person speaking or hands drumming), and the system separates that source's audio;

- Time segment prompts (an industry first): Mark the time interval in which the target sound appears (e.g., "3 minutes 12 seconds to 3 minutes 18 seconds"), and the model automatically processes similar sounds across the entire recording; Meta compares this feature to the "braindance" technology in Cyberpunk 2077.
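Meta's announcement does not include a public API, so the following is a purely hypothetical sketch of how these three prompt types might be expressed in code; every name here (TextPrompt, VisualPrompt, TimeSpanPrompt, separate) is invented for illustration and does not come from Meta's release:

```python
from dataclasses import dataclass

@dataclass
class TextPrompt:
    description: str   # e.g. "vocal singing", "car horn"

@dataclass
class VisualPrompt:
    frame_index: int   # frame the user clicked in
    point: tuple       # (x, y) pixel coordinates of the click

@dataclass
class TimeSpanPrompt:
    start_s: float     # segment where the target sound occurs
    end_s: float

def separate(media_path, prompts):
    """Stand-in for a real separation call: a real implementation would
    return the isolated target track plus the residual (everything else)."""
    raise NotImplementedError("illustrative sketch only")

# The three prompt types can be combined, per Meta's description:
prompts = [
    TextPrompt("guitar"),
    VisualPrompt(frame_index=240, point=(512, 300)),
    TimeSpanPrompt(start_s=192.0, end_s=198.0),  # 3:12 to 3:18
]
# target, residual = separate("concert.mp4", prompts)
```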

To help standardize evaluation in the field, Meta has also open-sourced two key tools:

- SAM Audio-Bench: The first audio separation evaluation benchmark based on real-world scenarios;

- SAM Audio Judge: The world's first automatic evaluation model specifically for audio separation quality, capable of quantitatively assessing the purity and completeness of the separation results.
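The announcement does not say which metrics SAM Audio Judge reports. For context on what "quantitatively assessing" separation quality can mean, below is the classical scale-invariant signal-to-distortion ratio (SI-SDR), a standard reference-based measure of separation quality; a learned judge presumably aims to produce this kind of score without needing a clean reference track:

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB (higher is better). This is a classical
    reference-based metric, shown for context; SAM Audio Judge's actual
    scoring method is not described in the announcement."""
    # Project the estimate onto the reference to remove scale differences.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference   # part of the estimate matching the reference
    noise = estimate - target    # everything else: distortion and leakage
    return 10 * np.log10((np.sum(target**2) + eps) / (np.sum(noise**2) + eps))

# Toy check: a clean copy scores very high, a noisy copy scores lower.
t = np.linspace(0, 1, 16000)
ref = np.sin(2 * np.pi * 440 * t)                        # 440 Hz reference tone
print(si_sdr(ref.copy(), ref))                           # essentially perfect
print(si_sdr(ref + 0.1 * np.random.randn(16000), ref))   # degraded
```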

The newly released PE-AV is not only the engine underlying SAM Audio; it will also power other Meta AI products, including subtitle generation, video understanding, and intelligent editing systems. Because it is open source, developers can build their own "synesthetic" AI applications, from automatic noise reduction in meeting recordings to immersive AR audio interactions and assistive hearing devices for accessibility.

Amid today's explosive growth in video content, the release of SAM Audio marks the start of a new chapter in audio processing: sound that is interactive, editable, and understandable. In the past we could only receive sound passively; now Meta offers the superpower of "selective listening," and this may be just the first step in multimodal AI reshaping our sensory experience.