Persian-VLM
Persian-VLM: CLIP & Image Captioning for Persian | A Persian CLIP implementation built on ParsBERT and trained with contrastive learning, plus an RNN-based Persian image captioning model. Supports zero-shot object detection, cross-modal retrieval, and more.
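
To illustrate the contrastive learning objective mentioned above, here is a minimal sketch of a CLIP-style symmetric loss and a similarity-based retrieval step. It is not taken from this repository: the function names (`clip_contrastive_loss`, `retrieve`), the temperature value of 0.07, and the toy two-dimensional embeddings are all assumptions chosen for illustration. In practice the image embeddings would come from a vision encoder and the text embeddings from a Persian text encoder such as ParsBERT.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over a batch where pair i is (image i, text i).

    The temperature is an assumed illustrative value, not the repository's setting.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j]: cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Image-to-text direction: row i should put its mass on column i
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: column j should put its mass on row j
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2i = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return (loss_i2t + loss_t2i) / 2


def retrieve(query_emb, candidate_embs):
    """Return the index of the candidate with the highest cosine similarity to the query."""
    q = l2_normalize(query_emb)
    scores = [sum(a * b for a, b in zip(q, l2_normalize(c))) for c in candidate_embs]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print("loss (matched pairs):", clip_contrastive_loss(images, matched_texts))
    print("loss (swapped pairs):", clip_contrastive_loss(images, swapped_texts))
    print("best caption index:", retrieve([1.0, 0.1], swapped_texts))
```

The same cosine-similarity ranking used in `retrieve` is what makes cross-modal retrieval and zero-shot use possible once both encoders share an embedding space: a query in one modality is compared against candidates from the other, and the closest candidate wins.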