Guru-7B is a reasoning model built on Qwen2.5-7B and trained with reinforcement learning, introduced in the paper 'Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective'. It shows strong reasoning ability across several domains, including mathematics, code, science, and logic, and performs well on multiple benchmarks.
Natural Language Processing
Transformers