RLHF-CustomData

Public

Builds an LLM with RLHF by fine-tuning on human-labeled preferences. Based on "Learning to Summarize from Human Feedback", it combines supervised fine-tuning, reward modeling, and PPO to improve response quality and alignment.
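As a rough illustration of the reward-modeling stage mentioned above, the sketch below computes the pairwise preference loss used in "Learning to Summarize from Human Feedback", namely -log(sigmoid(r_chosen - r_rejected)). This is not taken from the repository: the function names and the plain-float rewards are assumptions made for illustration, and a real reward model would produce these scores from a transformer.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)).

    Hypothetical helper: rewards are plain floats standing in for the
    scalar scores a reward model would assign to two responses.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(diff)) equals softplus(-diff); this form avoids overflow.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def mean_preference_loss(pairs):
    """Average the pairwise loss over (chosen, rejected) reward pairs."""
    losses = [pairwise_preference_loss(c, r) for c, r in pairs]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # The loss is small when the human-preferred response scores higher,
    # and large when the ranking is inverted.
    print(pairwise_preference_loss(2.0, 0.0))
    print(pairwise_preference_loss(0.0, 2.0))
```

Minimizing this loss pushes the reward of the human-preferred response above that of the rejected one; the trained reward model then supplies the scalar signal that PPO optimizes.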

fine-tuning flan-t5 policy-optimization-algorithms ppo-agent reinforcement-learning-algorithms reward-modeling supervised-finetuning transformer

Created: 2025-03-19T21:50:53

Updated: 2025-03-24T21:42:11

Related projects

LLaMA Factory

Hot

agent

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

47790

1 month ago

+169 today

Llama_index

Hot

agents

LlamaIndex is the leading framework for building LLM-powered agents over your data.

41266

1 month ago

+60 today

Unsloth

Hot

deepseek

Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory!

37618

1 month ago

+97 today

OpenLLM

bentoml

Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.

11193

1 month ago

+14 today

H2o Llmstudio

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/

4285

1 month ago

+1 today

Cognita

agent

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry

4031

1 month ago

+11 today

ChatGLM Efficient Tuning

alpaca

Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT

3699

1 month ago

Kiln

The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.

3423

2 years ago

+11 today

Hands On Llms

3-pipeline-design

Learn about LLMs, LLMOps, and vector DBs for free by designing, training, and deploying a real-time financial advisor LLM system ~ source code + video & reading materials

3245

1 month ago

+3 today

UER Py

albert

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

3064

1 month ago

+2 today
