Tsinghua University and Tencent Jointly Launch Oryx, a Fully Open-Source Multi-Modal Architecture Supporting Ultra-Long Video Input
In the rapidly advancing field of artificial intelligence, a multi-modal large language model named ORYX is quietly changing how we think about AI's ability to perceive the visual world. Developed jointly by researchers from Tsinghua University, Tencent, and Nanyang Technological University, the system has been described as the 'Transformers' of visual processing. ORYX, short for Oryx Multi-Modal Large Language Models, is designed for spatio-temporal understanding of images, videos, and 3D scenes.