RewardModelingBeyondBradleyTerry
Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives
inverse-reinforcement-learning, large-language-models, largelanguagemodels, llm-aligment, llmalignment, reward, reward-modeling, reward-models, rlhf
Created: 2024-09-19T05:11:40
Updated: 2025-03-25T21:26:07
https://sites.google.com/view/rewardmodels
Stars: 65
Stars Increase: 0