qwen-humaneval
Public. Automated LLM coding benchmarks with Ollama: a HumanEval and MBPP evaluation suite with safe execution, comprehensive logging, and detailed analysis tools
Created: 2025-08-01T10:41:43
Updated: 2025-08-01T10:41:56
Stars: 0
Stars Increase: 0