Online RL with Simple Reward Enables Training VLA Models with Only One Trajectory
Chinese-language reinforcement learning tutorial (the "Mushroom Book"); read online at https://datawhalechina.github.io/easy-rl/
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
Tutorials, assignments, and competitions for MIT Deep Learning related courses.
An elegant PyTorch deep reinforcement learning library.
A visual playground for agentic workflows: Iterate over your agents 10x faster
Massively Parallel Deep Reinforcement Learning.
TypeDB: the power of programming, in your database
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1