Recently, the National and Local Center jointly released the world's first, and currently largest and most complete, cross-embodiment visual-tactile multimodal robotic manipulation dataset: Baihu-VTouch. Its release marks a step change in robotic visual-tactile perception, from single-modality sensing to genuine cross-embodiment interaction.


Baihu-VTouch establishes a new paradigm for collecting cross-embodiment visual-tactile multimodal data from real-world interaction. Beyond high-precision visual-tactile sensor data, it integrates key signals such as RGB-D depth vision and joint poses, with a total duration exceeding 60,000 minutes. To ensure generality, data were collected across multiple embodiments, including the full-size humanoid robot Qinglong, wheeled humanoid robots, and handheld smart terminals.
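To make the multimodal structure concrete, the sketch below shows how one synchronized sample from such a dataset might be organized. The field names, array shapes, and the toy contact heuristic are illustrative assumptions, not the dataset's published schema or interface.

```python
from dataclasses import dataclass
import numpy as np

# A minimal sketch of one synchronized visual-tactile frame. All field names
# and shapes are illustrative assumptions, not the published Baihu-VTouch schema.
@dataclass
class VTouchFrame:
    timestamp_s: float           # capture time of the synchronized frame
    rgb: np.ndarray              # RGB image, e.g. (H, W, 3) uint8
    depth: np.ndarray            # aligned depth map from the RGB-D sensor, (H, W) float32, meters
    tactile: np.ndarray          # visuotactile image, e.g. (480, 640, 3) uint8 per sensor
    joint_positions: np.ndarray  # robot joint angles, shape (num_joints,)
    embodiment: str              # e.g. "qinglong_humanoid", "wheeled_humanoid", "handheld"
    task_id: str                 # identifier of the contact-intensive task being recorded

def is_in_contact(frame: VTouchFrame, threshold: float = 8.0) -> bool:
    """Toy heuristic: flag contact when the tactile image deviates strongly
    from its mean brightness (a stand-in for a real contact detector)."""
    deviation = np.abs(frame.tactile.astype(np.float32) - frame.tactile.mean())
    return float(deviation.mean()) > threshold
```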

In terms of fine detail, the dataset uses high-performance sensors with a resolution of 640×480 and a refresh rate of 120 Hz, which can accurately capture subtle changes at the moment objects make contact. To date, the dataset has accumulated approximately 90.72 million pairs of real object-contact samples, focusing on more than 260 contact-intensive tasks across four major scenarios: home, catering, industry, and special operations.
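For a sense of the raw throughput these specifications imply, the snippet below estimates the uncompressed data rate of a single tactile sensor. The 3-channel, 8-bit-per-channel assumption is ours; the release does not state the sensor's pixel format.

```python
# Back-of-the-envelope data rate implied by the stated tactile specs
# (640x480 at 120 Hz). Channel count and bit depth are assumptions:
# 3 channels at 8 bits each, uncompressed.
width, height, channels, bytes_per_channel = 640, 480, 3, 1
fps = 120

bytes_per_frame = width * height * channels * bytes_per_channel  # ~0.92 MB per frame
raw_rate_mb_s = bytes_per_frame * fps / 1e6                       # ~110.6 MB/s per sensor
print(f"{raw_rate_mb_s:.1f} MB/s of raw tactile data per sensor")
```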

Research shows that, after introducing visual-tactile perception, nearly 70% of tasks obtain more continuous descriptions of contact states. This provides crucial low-level data support for fine manipulation, precise force control, and self-recovery after task failure.