AIbase

Image-Captioning-with-ViT-and-BERT

Public

A concise image-captioning pipeline that fine-tunes a ViT encoder with a BERT decoder on Flickr8K, plus a standalone script that loads the trained model and generates captions for new images.
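The listing does not say how the standalone script decodes captions. As a rough illustration only, the sketch below shows greedy decoding, one common way an encoder-decoder captioner turns per-step token scores into a sentence. Everything in it is hypothetical: `greedy_caption`, `toy_scores`, and the tiny vocabulary are stand-ins, and the `[CLS]`/`[SEP]` markers are assumed only because BERT tokenizers conventionally use them. In a real pipeline the scoring function would be the BERT decoder attending to ViT image features.

```python
from typing import Callable, Dict, List

# Assumed start/end markers (BERT tokenizers conventionally use these).
BOS, EOS = "[CLS]", "[SEP]"


def greedy_caption(
    next_token_scores: Callable[[List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Greedy decoding: at each step, append the highest-scoring next token.

    `next_token_scores` is a hypothetical stand-in for the decoder; it maps
    the tokens generated so far to a score for every candidate next token.
    """
    tokens = [BOS]
    for _ in range(max_len):
        scores = next_token_scores(tokens)
        best = max(scores, key=scores.get)
        if best == EOS:
            break  # the decoder signalled the end of the caption
        tokens.append(best)
    return " ".join(tokens[1:])  # drop the start marker


def toy_scores(prefix: List[str]) -> Dict[str, float]:
    """Toy scorer that always prefers the next word of a fixed caption."""
    script = ["a", "dog", "runs", EOS]
    target = script[min(len(prefix) - 1, len(script) - 1)]
    return {tok: (1.0 if tok == target else 0.0) for tok in script}


caption = greedy_caption(toy_scores)
print(caption)  # a dog runs
```

Greedy decoding is the simplest choice; the actual repository may use beam search or sampling instead.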

Created: 2025-05-25T20:30:33
Updated: 2025-05-25T21:57:58
Stars: 1
Stars increase: 0