24-Game-Reasoning

Public

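A very simple reproduction of DeepSeek-R1-Zero and DeepSeek-R1, using the 24 Game as an example. It applies zero-RL, SFT, and SFT+RL to elicit an LLM's ability to verify and reflect on its own reasoning.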
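RL on a task like the 24 Game typically relies on a rule-based check of the model's final answer. The sketch below shows one way such a check could look. It is an illustration only and is not taken from this repository: the function name `reward_24`, the 0/1 reward values, and the expected answer format (a plain arithmetic expression using `+`, `-`, `*`, `/` and parentheses) are all assumptions.

```python
import ast
from collections import Counter
from fractions import Fraction

# Hypothetical helper, not the repository's code: scores a candidate
# 24 Game answer with a simple rule-based check.
_OPS = {
    ast.Add: lambda a, b: a + b,
    ast.Sub: lambda a, b: a - b,
    ast.Mult: lambda a, b: a * b,
    ast.Div: lambda a, b: a / b,
}


def _evaluate(node, used):
    """Evaluate a parsed expression exactly, recording each integer literal in `used`."""
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return Fraction(node.value)  # exact arithmetic avoids float rounding
    raise ValueError("unsupported syntax")


def reward_24(expression, numbers, target=24):
    """Return 1.0 if `expression` uses exactly `numbers` and equals `target`, else 0.0."""
    try:
        tree = ast.parse(expression, mode="eval")
        used = []
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0  # malformed answer, disallowed syntax, or division by zero
    if Counter(used) != Counter(numbers):
        return 0.0  # each given number must be used exactly once
    return 1.0 if value == target else 0.0


print(reward_24("(8 - 4) * (7 - 1)", [4, 7, 8, 1]))
print(reward_24("8 / (3 - 8 / 3)", [3, 3, 8, 8]))
print(reward_24("6 * 4", [1, 2, 3, 4]))
```

Parsing with `ast` rather than calling `eval` means only the four arithmetic operators and integer literals are accepted, and `Fraction` keeps intermediate results such as 8/3 exact.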

24game alignment cot deepseek llm long-cot o1 post-training r1 r1-zero

Created: 2025-02-26T15:46:13

Updated: 2025-03-24T20:54:55
