AIbase

video-SALMONN-2

Public

video-SALMONN 2 is an audio-visual large language model (LLM) that generates high-quality captions for videos, covering both their audio and visual content. It was developed by the Department of Electronic Engineering at Tsinghua University and ByteDance.

Created: 2025-06-18T15:05:36
Updated: 2025-07-28T21:43:36
Stars: 25
Stars increase: 1
