VisionTransformer
TensorFlow and PyTorch implementations of the Vision Transformer (ViT), as described in "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" by Alexey Dosovitskiy et al.