fine-tuning-and-reinforcement-learning-on-llms
Supervised fine-tuning and RLAIF on DeepSeek-math-7b-base using LoRA adapters and the GRPO training objective
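
To make the GRPO objective named in the description more concrete, here is a minimal sketch of its group-relative advantage step: several completions are sampled for the same prompt, and each completion's reward is normalized by the mean and standard deviation of its group. This is not taken from the repository. The function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the sample rewards are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for a single prompt.

    Hypothetical helper: each advantage is (reward - group mean) divided by
    (group standard deviation + eps). The eps term is an assumption that
    avoids division by zero when every reward in the group is identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use the sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative rewards for four sampled answers to one math problem
# (1.0 = judged correct, 0.0 = judged incorrect); made-up values.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's mean receive a positive advantage and those below receive a negative one, so no separate value network is needed to provide a baseline.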