MAVNet-Multimodal-Audio-Visual-Network-for-Cross-Modal-Understanding

Public

MAVNet is a deep learning framework that integrates audio and visual modalities for perception. By analyzing synchronized sound and video together, it targets tasks such as event recognition, autonomous surveillance, and wildlife detection.
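The listing does not describe how MAVNet combines its two modalities, so the following is only a generic sketch of one common approach, late fusion by concatenation, and not MAVNet's actual implementation. All names (`fuse`, `classify`), the toy feature vectors, and the weights are hypothetical and chosen purely for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(audio_feat, visual_feat):
    """Late fusion: normalize each modality, then concatenate the two vectors."""
    return l2_normalize(audio_feat) + l2_normalize(visual_feat)


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """Apply a linear layer to the fused vector and return class probabilities."""
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Hypothetical toy inputs: 2-D audio and visual embeddings for one time window.
audio_feat = [3.0, 4.0]
visual_feat = [0.0, 2.0]
fused = fuse(audio_feat, visual_feat)

# Hypothetical weights for two event classes (one row per class).
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
biases = [0.0, 0.0]
probs = classify(fused, weights, biases)
```

Normalizing each modality before concatenation keeps one stream from dominating the other purely because of feature scale. A real system would replace the toy vectors with learned audio and video embeddings extracted from time-aligned windows.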

Created: 2025-10-21T00:03:52
Updated: 2025-10-21T04:09:08
Stars: 0
Stars Increase: 0