unsloth-rlhf

Public

RLHF experiments with Unsloth: DPO, ORPO, SimPO, KTO. Training scripts, notebooks, and quick evaluation utilities.

Created: 2025-08-18T22:32:29
Updated: 2025-11-10T20:06:32
Stars: 0
Stars increase: 0