proximal-policy-optimization
Public
An implementation of Proximal Policy Optimization (PPO), one of a state-of-the-art family of reinforcement learning algorithms, using normalized Generalized Advantage Estimation (GAE) and optional batch-mode training. The loss function includes an entropy bonus.
Topics: actor-critic, deep-learning, entropy, gae, generalized-advantage-estimation, machine-learning, neural-network, open-ai, open-ai-gym, optimization
Created: 2022-12-26T20:25:02
Updated: 2022-01-13T14:08:52
Stars: 3
Stars Increase: 0
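
The description above names three ingredients: Generalized Advantage Estimation, advantage normalization, and an entropy bonus in the loss. The following is a minimal pure-Python sketch of how these pieces commonly fit together in PPO. It is not taken from this repository: the function names (`gae_advantages`, `normalize`, `ppo_loss`) and the default hyperparameters (`gamma`, `lam`, `clip_eps`, `entropy_coef`) are illustrative assumptions, the trajectory is assumed to contain no episode boundary, and the value-function loss is omitted for brevity.

```python
import math


def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """GAE for a single trajectory (assumes no terminal step inside it)."""
    advantages = [0.0] * len(rewards)
    next_value = last_value  # bootstrap value for the state after the last step
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of TD errors
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def normalize(xs, eps=1e-8):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / (std + eps) for x in xs]


def ppo_loss(new_logp, old_logp, advantages, entropies,
             clip_eps=0.2, entropy_coef=0.01):
    """Negated clipped surrogate objective minus a scaled mean entropy."""
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice between unclipped and clipped terms
        total += min(ratio * adv, clipped * adv)
    policy_loss = -total / len(advantages)
    mean_entropy = sum(entropies) / len(entropies)
    # Subtracting the entropy term rewards more exploratory policies
    return policy_loss - entropy_coef * mean_entropy


# Example: three steps of reward 1 with an all-zero value estimate
adv = normalize(gae_advantages([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 0.0))
loss = ppo_loss([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], adv, [0.5, 0.5, 0.5])
```

In a real training loop the log-probabilities and entropies would come from a policy network and the loss would be minimized with an automatic-differentiation library. The plain-list version here is only meant to show the arithmetic.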