robot-rlhf
PublicRobot Learning from Human Feedback. Inspired by advancements in NLP, we train a robot policy via reinforcement learning using a reward function learned exclusively from human preferences.
作成時間:2023-04-16T10:39:45
更新時間:2024-10-25T10:44:16
6
Stars
0
Stars Increase