AI News

SALMONN Framework: Expanding General Auditory Capabilities of Large Language Models

SALMONN is an audio-text multimodal large language model framework that extends the understanding and processing capabilities of large language models to the general auditory domain. It combines a BEATs audio encoder for non-speech audio, the speech encoder from OpenAI's Whisper model, and a window-level Q-Former that aligns audio with text at high temporal resolution. After an activation tuning stage, SALMONN achieves competitive performance on tasks such as audio captioning and speech translation, demonstrating general auditory capabilities.
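To make the window-level idea concrete, here is a minimal sketch in plain Python, not SALMONN's actual implementation. It assumes both encoders emit the same number of frames, fuses them along the feature axis, and then emits one output vector per fixed-size window. The names `concat_features` and `window_pool` are hypothetical, and mean pooling is only a stand-in for the Q-Former's learned queries attending to each window.

```python
from typing import List

Frame = List[float]


def concat_features(speech: List[Frame], audio: List[Frame]) -> List[Frame]:
    """Fuse two encoders' per-frame features along the feature axis.

    Assumption: both encoders produce one feature vector per frame at the
    same frame rate, so the sequences have equal length.
    """
    if len(speech) != len(audio):
        raise ValueError("both encoders must yield the same number of frames")
    return [s + a for s, a in zip(speech, audio)]


def window_pool(frames: List[Frame], window: int) -> List[Frame]:
    """Split frames into consecutive windows and emit one vector per window.

    Mean pooling is an illustrative stand-in (assumption) for a Q-Former,
    which would use learned queries and cross-attention inside each window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    tokens: List[Frame] = []
    for start in range(0, len(frames), window):
        chunk = frames[start:start + window]  # last window may be shorter
        dim = len(chunk[0])
        tokens.append([sum(f[d] for f in chunk) / len(chunk) for d in range(dim)])
    return tokens


# Toy input: 5 frames, 1 feature per encoder (values are made up).
speech_frames = [[1.0], [2.0], [3.0], [4.0], [5.0]]
audio_frames = [[10.0], [20.0], [30.0], [40.0], [50.0]]

fused = concat_features(speech_frames, audio_frames)
tokens = window_pool(fused, window=2)
print(len(fused), len(tokens))
```

Because each window yields its own output, the number of vectors passed on to the language model grows with the length of the audio rather than being fixed, which is what preserves temporal resolution in this sketch.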

9.9k 15 hours ago

AI Products

SALMONN


SALMONN: Speech Audio Language Music Open Neural Network

AI voice recognition
13.2k