ToolQA
ToolQA is a new dataset for evaluating the capabilities of LLMs in answering challenging questions with external tools. It offers two difficulty levels (easy/hard) across eight real-life scenarios.
large-language-models, natural-language-understanding, natural-lauguage-processing, question-answering, tools
Created: 2023-06-06T15:09:04
Updated: 2025-03-06T13:37:36
https://arxiv.org/pdf/2306.13304.pdf
Stars: 274
Stars Increase: 0