Thrillcrazyer
Qwen-1.5B_THIP is a mathematical reasoning model built on DeepSeek-R1-Distill-Qwen-1.5B and fine-tuned with GRPO (Group Relative Policy Optimization) on the DeepMath-103k dataset using the TRL framework. It is optimized specifically for solving mathematical problems.
prithivMLmods
A compact multilingual reasoning model fine-tuned from Qwen-1.5B that excels at mathematical problem solving, logical reasoning, code generation, and general-purpose tasks.