omega
PublicA number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environments.
flaxjaxmctsminihackmodel-based-reinforcement-learningmodel-based-rlmuzeronethackreinforcement-learningrlax
Creat:2021-06-18T05:12:21
Update:2025-01-30T21:40:43
41
Stars
0
Stars Increase