AIBase

AI News


SALMONN Framework: Expanding General Auditory Capabilities of Large Language Models

SALMONN is an audio-text multimodal large language model framework designed to extend large language models to general auditory understanding. It combines a BEATs audio encoder for non-speech audio, the speech encoder from OpenAI's Whisper model, and a window-level Q-Former, which gives the audio-text alignment a high temporal resolution. After an activation tuning stage, SALMONN achieves competitive performance on tasks such as audio captioning and speech translation, demonstrating general auditory capabilities.
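
To make the window-level idea concrete, here is a minimal NumPy sketch. It is not SALMONN's implementation. It assumes the two encoders emit frame-aligned features that are fused by concatenation, and it replaces the Q-Former with a single-head cross-attention over fixed-size windows. The function name `window_level_pool`, the shapes, and the window size are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def window_level_pool(speech_feats, audio_feats, queries, window):
    """Pool fused encoder frames into a few tokens per window.

    speech_feats: (T, Ds) frames from a speech encoder (assumed shape)
    audio_feats:  (T, Da) frames from a non-speech audio encoder (assumed shape)
    queries:      (N, Ds + Da) learned query vectors (random in this sketch)
    window:       number of frames per window (hypothetical value)
    """
    # Assumption: the two streams are frame-aligned and fused by concatenation.
    fused = np.concatenate([speech_feats, audio_feats], axis=-1)  # (T, D)
    T, D = fused.shape
    # Zero-pad so the sequence splits evenly; padded frames are not masked here.
    pad = (-T) % window
    fused = np.pad(fused, ((0, pad), (0, 0)))
    windows = fused.reshape(-1, window, D)  # (W, window, D)
    # Each query attends only to the frames inside its own window.
    scores = np.einsum("nd,wld->wnl", queries, windows) / np.sqrt(D)
    weights = softmax(scores, axis=-1)
    pooled = np.einsum("wnl,wld->wnd", weights, windows)  # (W, N, D)
    # Flatten to one token sequence whose length grows with the audio length.
    return pooled.reshape(-1, D)

rng = np.random.default_rng(0)
speech = rng.normal(size=(50, 8))
audio = rng.normal(size=(50, 4))
q = rng.normal(size=(2, 12))
tokens = window_level_pool(speech, audio, q, window=17)
print(tokens.shape)
```

In this sketch the number of output tokens scales with the number of windows rather than being fixed for the whole clip, which is the property that keeps temporal resolution available to the language model.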

7.5k 4 days ago

AI Products

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

AI voice recognition
13.1k