awesome-unimodal-training
Text-only or language-free training for multimodal tasks (image/audio/video captioning, retrieval, text-to-image).