quantile-reward-policy-optimization

Public

Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok et al. 2025).

Created: 2025-07-14T22:48:30
Updated: 2025-07-15T05:49:26
https://claire-labo.github.io/quantile-reward-policy-optimization/
Stars: 27
Stars increase: 0