jax-policy-gradient
Public
JAX implementation of REINFORCE policy gradient with baseline subtraction, entropy regularization, and gradient clipping for stable training.
Created: 2025-07-08T11:26:20
Updated: 2025-07-09T02:44:46
Stars: 4
Stars Increase: 0
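The description names three ingredients: baseline subtraction, entropy regularization, and gradient clipping. The sketch below shows one way they might fit together in plain JAX. It is not taken from the repository. The linear softmax policy, the episode-mean baseline, the hand-written global-norm clipping, and every function name and hyperparameter value (`discounted_returns`, `reinforce_loss`, `clip_by_global_norm`, `update`, `gamma`, `entropy_coef`, `max_norm`, `lr`) are assumptions made for illustration.

```python
import jax
import jax.numpy as jnp


def discounted_returns(rewards, gamma=0.99):
    """Reward-to-go G_t = r_t + gamma * G_{t+1}, via a reverse scan."""
    def step(carry, r):
        g = r + gamma * carry
        return g, g
    init = jnp.zeros((), dtype=rewards.dtype)
    _, returns = jax.lax.scan(step, init, rewards, reverse=True)
    return returns


def policy_logits(params, obs):
    # Hypothetical linear policy; the actual repository may use a network.
    return obs @ params["w"] + params["b"]


def reinforce_loss(params, obs, actions, returns, baseline, entropy_coef=0.01):
    log_probs = jax.nn.log_softmax(policy_logits(params, obs))
    # Log-probability of the action actually taken at each step.
    chosen = jnp.take_along_axis(log_probs, actions[:, None], axis=1)[:, 0]
    # Baseline subtraction; no gradient flows through the advantage.
    advantages = jax.lax.stop_gradient(returns - baseline)
    pg_loss = -jnp.mean(chosen * advantages)
    # Mean policy entropy; subtracting it from the loss rewards exploration.
    entropy = -jnp.mean(jnp.sum(jnp.exp(log_probs) * log_probs, axis=1))
    return pg_loss - entropy_coef * entropy


def clip_by_global_norm(grads, max_norm):
    """Rescale the whole gradient tree so its L2 norm is at most max_norm."""
    leaves = jax.tree_util.tree_leaves(grads)
    norm = jnp.sqrt(sum(jnp.sum(g ** 2) for g in leaves))
    scale = jnp.minimum(1.0, max_norm / (norm + 1e-8))
    return jax.tree_util.tree_map(lambda g: g * scale, grads)


def update(params, obs, actions, rewards, lr=0.01, max_norm=1.0):
    """One plain SGD step on a single episode."""
    returns = discounted_returns(rewards)
    baseline = jnp.mean(returns)  # assumed constant baseline: episode mean
    grads = jax.grad(reinforce_loss)(params, obs, actions, returns, baseline)
    grads = clip_by_global_norm(grads, max_norm)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
```

With `params = {"w": jnp.zeros((obs_dim, n_actions)), "b": jnp.zeros(n_actions)}`, calling `update(params, obs, actions, rewards)` on one episode's arrays returns a new parameter tree of the same shapes. A learned value function or a running average would be a common replacement for the episode-mean baseline assumed here.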