VisionGPT2

Public

Combining ViT and GPT-2 for image captioning. Trained on MS-COCO. The model was implemented mostly from scratch.

Created: 2023-09-29T03:36:07
Updated: 2025-03-21T05:08:05
https://www.kaggle.com/code/shreydan/visiongpt2-image-captioning-pytorch
Stars: 44
Stars increase: 0