SPPO
Public
The official implementation of Self-Play Preference Optimization (SPPO)
Created: 2024-06-13T14:13:29
Updated: 2025-03-27T00:46:19
https://uclaml.github.io/SPPO/
Stars: 575
Stars Increase: 0