Sys2Bench
Sys2Bench is a benchmarking suite designed to evaluate the reasoning and planning capabilities of large language models across algorithmic, logical, arithmetic, and common-sense reasoning tasks.
Created: 2025-02-17T08:04:12
Updated: 2025-03-24T21:20:35
https://arxiv.org/abs/2502.12521
Stars: 26
Stars Increase: 0