Reinforcement-Learning-for-Human-Feedback-RLHF
This repository contains the implementation of a Reinforcement Learning from Human Feedback (RLHF) system using custom datasets. The project uses the trlX library to train a preference model that integrates human feedback directly into the optimization of language models.
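A minimal sketch of how such a setup might look with trlX is shown below, assuming its `trlx.train` entry point. The reward function here is a hypothetical placeholder standing in for a trained preference model; it is not this repository's actual implementation.

```python
# Minimal RLHF-style training sketch with trlX (assumes the trlx.train entry point).
# The reward function is a hypothetical stand-in for a preference model trained
# on human-feedback data; in this repo it would be replaced by the real scorer.
import trlx

def reward_fn(samples, **kwargs):
    # Placeholder reward: scores each completion by its length.
    # A real RLHF pipeline would score samples with the trained preference model.
    return [float(len(s)) for s in samples]

prompts = [
    "Explain reinforcement learning in one sentence.",
    "Summarize the benefits of human feedback for language models.",
]

trainer = trlx.train(
    "gpt2",               # base language model to fine-tune (illustrative choice)
    reward_fn=reward_fn,  # reward signal derived from (placeholder) preferences
    prompts=prompts,      # prompts sampled during rollouts
)
```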
Created: 2024-08-17T15:27:37
Updated: 2025-02-15T21:02:32
Stars: 5