
cap-rlvr

Public

CAP RLVR: Reinforcement Learning with Verifiable Rewards for legal reasoning, using Caselaw Access Project data. A complete GRPO training pipeline with OpenAI Gym environments, deterministic reward functions, and multi-stage curriculum learning for legal LLM development.

Created: 2025-07-22T20:39:09
Updated: 2025-07-24T07:12:06
Stars: 3
Stars increase: 0