Nano-R1

Public

This project demonstrates fine-tuning the Qwen2.5-3B-Instruct model with GRPO (Group Relative Policy Optimization) on the GSM8K dataset.
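As a rough illustration of what GRPO training on GSM8K involves, the sketch below scores a group of completions for one prompt and turns the rewards into group-relative advantages. It is not taken from the project. The function names, the exact-match reward, the sample texts, and the use of the population standard deviation are all assumptions made for illustration. The only dataset detail relied on is that GSM8K reference answers end with a `#### <number>` line.

```python
import re
from statistics import mean, pstdev


def extract_gsm8k_answer(text):
    """Return the number following a '####' marker, or None if there is none."""
    match = re.search(r"####\s*(-?[\d,]*\.?\d+)", text)
    if match is None:
        return None
    # GSM8K sometimes writes thousands separators, e.g. "1,200".
    return match.group(1).replace(",", "")


def correctness_reward(completion, reference):
    """Hypothetical reward: 1.0 if the extracted answers match, else 0.0."""
    predicted = extract_gsm8k_answer(completion)
    target = extract_gsm8k_answer(reference)
    return 1.0 if predicted is not None and predicted == target else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for the same prompt.

    Assumption: population standard deviation; implementations may differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative data, not drawn from the dataset.
reference = "She sells 9 eggs at $2 each. 9 * 2 = 18\n#### 18"
completions = [
    "9 * 2 = 18\n#### 18",
    "9 + 2 = 11\n#### 11",
    "The answer is eighteen.",
    "2 * 9 = 18\n#### 18",
]

rewards = [correctness_reward(c, reference) for c in completions]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Completions that beat the group average receive a positive advantage and the rest a negative one, so no separate value model is needed to provide a baseline.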

Created: 2025-04-04T14:00:58
Updated: 2025-04-04T15:28:25
https://huggingface.co/Akshint47/Nano_R1_Model
Stars: 5
Stars increase: 0