This is a multimodal model that supports multilingual processing, covering multiple fields such as natural language processing, code processing, and audio processing. It can perform various tasks such as automatic speech recognition, speech summarization, speech translation, and visual question answering.
Multimodal
SafetensorsMultiple Languages