AIbase

alpaca-rlhf

Public

Fine-tuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat

Created: 2023-04-12T16:19:46
Updated: 2025-03-24T17:19:53
https://88aeeb3aef5040507e.gradio.live/
Stars: 114
Stars Increase: 0
