GLUE-X

Public

We use 14 datasets as out-of-distribution (OOD) test data and evaluate 21 widely used models on 8 NLU tasks. Our findings confirm that OOD accuracy in NLP tasks deserves more attention: in every setting, performance decays significantly relative to in-distribution (ID) accuracy.
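
As a rough illustration of the ID versus OOD comparison described above, the sketch below computes accuracy on an in-distribution set and an out-of-distribution set and reports the relative drop. The function names, the relative-drop definition of "performance decay", and the toy predictions are all assumptions made for illustration; they are not taken from the GLUE-X code or results.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def performance_decay(id_acc, ood_acc):
    """Relative drop from ID to OOD accuracy (an assumed definition)."""
    if id_acc <= 0:
        raise ValueError("id_acc must be positive")
    return (id_acc - ood_acc) / id_acc


# Toy, made-up labels and predictions purely for illustration;
# these are NOT GLUE-X data or results.
id_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
id_preds = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]    # 9 of 10 correct
ood_labels = [1, 0, 0, 1, 1, 0, 1, 0, 1, 0]
ood_preds = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]   # 7 of 10 correct

id_acc = accuracy(id_preds, id_labels)
ood_acc = accuracy(ood_preds, ood_labels)
decay = performance_decay(id_acc, ood_acc)

print(f"ID accuracy:  {id_acc:.2f}")
print(f"OOD accuracy: {ood_acc:.2f}")
print(f"Relative performance decay: {decay:.1%}")
```

A real evaluation would replace the toy lists with model predictions on each task's ID test set and on its paired OOD dataset, then repeat the comparison per task and per model.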

Created: 2022-11-19T21:55:14
Updated: 2025-03-12T10:47:15
http://gluexbenchmark.com/
Stars: 93
Stars increase: 0

Related projects