4onebench
Public
A minimalist benchmarking tool designed to test the routine-generation capabilities of LLMs.
Created: 2024-11-04T07:15:19
Updated: 2024-12-15T21:24:57
Stars: 25
Stars Increase: 0