
mlx-grpo

Public

Train your own DeepSeek-R1-style reasoning model on a Mac. The first MLX implementation of GRPO, the technique behind R1's o1-matching performance. Build mathematical reasoning AI without expensive RLHF. Optimized for Apple Silicon.

Created: 2025-05-29T05:15:26
Updated: 2025-06-17T10:02:35
Stars: 25
Stars increase: 0

Related projects