parrot-python

Public

PARROT (Performance Assessment of Reasoning and Responses On Trivia) is a novel benchmarking framework designed to evaluate Large Language Models (LLMs) on real-world, complex, and ambiguous QA tasks.

benchmarking-framework llm-benchmarking llm-datasets llm-inference llm-qa-document

Created: 2024-09-27T04:30:11

Updated: 2024-10-22T08:22:52

https://huggingface.co/datasets/RedBlock/parrot

