Ovis
Public
A novel Multimodal Large Language Model (MLLM) architecture designed to structurally align visual and textual embeddings.
chatbot, llama3, multimodal, multimodal-large-language-models, multimodality, qwen, vision-language-learning, vision-language-model
Created: 2024-06-13T14:30:56
Updated: 2025-03-26T19:46:21
https://huggingface.co/AIDC-AI/Ovis1.6-Gemma2-9B
Stars: 1.0K
Stars Increase: 2