Inkscope-Captions-2B-0526
PublicThe Inkscope-Captions-2B-0526 model is a fine-tuned version of Qwen2-VL-2B-Instruct, optimized for image captioning, vision-language understanding, and English-language caption generation. This model was fine-tuned on the conceptual-captions-cc12m-llavanext dataset (first 30k entries) to generate detailed, high-quality captions for images.
Creat:2025-05-29T15:33:54
Update:2025-05-30T14:15:48
https://huggingface.co/prithivMLmods/Inkscope-Captions-2B-0526
1
Stars
0
Stars Increase