BAC Reinforcement Learning algorithm for continuous stochastic control.
Tested on MountainCarContinuous-v0 in Gym environment Python.
Learnt policy demo after 500 BAC gradient estimation and policy updates.
See my GitHub repo: https://github.com/SSubhnil/BAC-DAC-gym