AIBase
Home
AI NEWS
AI Tools
AI Models
MCP
AI Services
AI Compute
AI Tutorial
Datasets
EN

AI News

View More

SALMONN Framework: Expanding General Auditory Capabilities of Large Language Models

SALMONN is an audio-text multimodal large language model framework designed to expand the understanding and processing capabilities of large language models in the general auditory domain. The framework integrates components such as non-speech BEATs audio encoders, the OpenAI Whisper framework's speech encoders, and window-level Q-Former, achieving high levels of temporal resolution for audio-text alignment. After the activation adjustment phase, SALMONN has achieved competitive performance in tasks such as audio captioning and speech translation, demonstrating general auditory capabilities.

6.4k 12 minutes ago
SALMONN Framework: Expanding General Auditory Capabilities of Large Language Models

AI Products

View More
SALMONN

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

AI voice recognition
11.7k
AIBase
Empowering the future, your artificial intelligence solution think tank
English简体中文繁體中文にほんご
FirendLinks:
AI Newsletters AI ToolsMCP ServersAI NewsAIBaseLLM LeaderboardAI Ranking
© 2025AIBase
Business CooperationSite Map