Best ViTPose AI Tools & Models - Premium ViTPose News

AI News

Open-Source Pose Estimation Model ViTPose: It Can Estimate Poses in Every Frame and Label Them

ViTPose is an open-source human pose estimation model that recognizes body postures well enough that it can seem to understand the action being performed. Its standout feature is simplicity and efficiency: rather than a complex, purpose-built network, it uses a plain Vision Transformer as its backbone (the "skeleton" of the model) to extract key features from images. Unlike many other pose estimation models, it does not depend on elaborate architectural components.
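To make the "plain Vision Transformer backbone" idea more concrete, here is a minimal, library-free sketch of two steps such a model is commonly described as performing: cutting the image into fixed-size patches (the tokens the transformer sees) and turning per-keypoint score maps into coordinates. The 256x192 input size, the 16-pixel patch size, the helper names `patch_grid` and `decode_heatmaps`, and the simple argmax decoding are illustrative assumptions, not the project's actual code.

```python
def patch_grid(height, width, patch):
    """Return (rows, cols) of non-overlapping patches a ViT would tokenize."""
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    return height // patch, width // patch


def decode_heatmaps(heatmaps, image_height, image_width):
    """Turn one 2D score map per keypoint into (x, y, score) in image pixels.

    Simplified assumption: take the highest-scoring cell of each map and
    map its centre back to image coordinates.
    """
    keypoints = []
    for hm in heatmaps:
        rows, cols = len(hm), len(hm[0])
        # Find the cell with the highest score.
        best_r, best_c = max(
            ((r, c) for r in range(rows) for c in range(cols)),
            key=lambda rc: hm[rc[0]][rc[1]],
        )
        # Map the cell centre back to image coordinates.
        x = (best_c + 0.5) * image_width / cols
        y = (best_r + 0.5) * image_height / rows
        keypoints.append((x, y, hm[best_r][best_c]))
    return keypoints


# Assumed input size and patch size, for illustration only.
rows, cols = patch_grid(256, 192, 16)
print(rows * cols)  # number of patch tokens: 192

# A toy 4x3 score map for a single keypoint.
toy = [
    [0.0, 0.1, 0.0],
    [0.0, 0.0, 0.9],
    [0.0, 0.2, 0.0],
    [0.0, 0.0, 0.0],
]
print(decode_heatmaps([toy], image_height=256, image_width=192))
```

In practice the released checkpoints ship with their own pre- and post-processing, so this sketch is only meant to show the shape of the computation.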


AI Products

ViTPose

A collection of ViTPose models implemented based on the Transformer architecture.

AI model

Models

Vitpose Plus Huge

usyd-community

ViTPose++ is a Vision Transformer-based foundation model for human pose estimation. The ViTPose family reports up to 81.1 AP on the MS COCO keypoint test set.

Vitpose Plus Large

usyd-community

ViTPose++ is a Vision Transformer-based foundation model for human pose estimation. The ViTPose family reports up to 81.1 AP on the MS COCO keypoint test set.

Vitpose Plus Small

usyd-community

ViTPose++ is a Vision Transformer-based human pose estimation model. The ViTPose family reports up to 81.1 AP on the MS COCO keypoint detection benchmark.

Synthpose Vitpose Huge Hf

stanfordmimi

SynthPose is a keypoint detection model built on the ViTPose Huge backbone and fine-tuned on synthetic data to predict 52 human keypoints, making it suitable for kinematic analysis.

Synthpose Vitpose Base Hf

stanfordmimi

SynthPose is a 2D human pose estimation model based on ViTPose Base, fine-tuned on synthetic data and capable of predicting 52 anatomical keypoints.

Vitpose Plus Base

usyd-community

ViTPose is a Vision Transformer-based human pose estimation model with a simple design. The ViTPose family reports up to 81.1 AP on the MS COCO keypoint detection benchmark.

Vitpose Base Coco Aic Mpii

usyd-community

ViTPose is a human pose estimation model based on the Vision Transformer that achieves strong performance on benchmarks such as MS COCO with a simple architectural design.

Computer Vision

Transformers

English

Vitpose Base

usyd-community

A Vision Transformer-based human pose estimation model. The ViTPose family reports up to 81.1 AP on the MS COCO keypoint test set.

Vitpose Base Simple

usyd-community

ViTPose is a human pose estimation model based on the Vision Transformer, with advantages such as model simplicity, scalable size, and flexible training. The ViTPose family reports up to 81.1 AP on the MS COCO keypoint test set.

Vitpose Base Simple

danelcsb

ViTPose is a baseline model for human pose estimation based on plain Vision Transformers, achieving high-performance keypoint detection with a simple architecture.

Computer Vision

Transformers

English