Groma
Public
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
Created: 2024-04-21T16:08:59
Updated: 2025-03-25T16:56:36
https://groma-mllm.github.io/
Stars: 577
Stars Increase: 1