Reinforcement Learning

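We have implemented three learning examples: policy optimization with augmented random search, a random-sampling gait search for a quadruped, and a cartpole environment driven by Dojo's dynamics.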


Static linear policies for locomotion are optimized with augmented random search (ARS), a reinforcement-learning algorithm. The policy drives an insect-like robot that is rewarded for forward velocity and survival and penalized for control usage and contact forces.
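
The ARS update for a static linear policy can be sketched as follows. This is a minimal plain-Python illustration, not Dojo's implementation: `rollout` is a hypothetical one-dimensional stand-in for the robot simulation (it rewards forward velocity and penalizes control effort only), and every name and hyperparameter here is an assumption. State normalization, which some ARS variants use, is omitted.

```python
import random
import statistics


def rollout(theta, horizon=50, dt=0.1):
    """Return the total reward of a static linear policy on a toy task.

    Hypothetical stand-in for a locomotion simulation: a damped 1-D body
    whose velocity is driven by the control u = theta . obs.
    """
    vel, total = 0.0, 0.0
    for _ in range(horizon):
        obs = (1.0 - vel, 1.0)                      # velocity error and a bias term
        u = theta[0] * obs[0] + theta[1] * obs[1]   # static linear policy
        u = max(-5.0, min(5.0, u))                  # actuator limits
        vel += dt * (u - 0.5 * vel)                 # damped first-order dynamics
        total += vel - 0.01 * u * u                 # velocity reward, control cost
    return total


def ars(n_iters=50, n_dirs=8, n_top=4, step=0.05, noise=0.1, seed=0):
    """Basic augmented random search over the policy parameters."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(n_iters):
        trials = []
        for _ in range(n_dirs):
            # Evaluate the policy perturbed in both directions along delta.
            delta = [rng.gauss(0.0, 1.0) for _ in theta]
            r_plus = rollout([t + noise * d for t, d in zip(theta, delta)])
            r_minus = rollout([t - noise * d for t, d in zip(theta, delta)])
            trials.append((r_plus, r_minus, delta))
        # Keep only the directions with the best reward in either direction.
        trials.sort(key=lambda t: max(t[0], t[1]), reverse=True)
        top = trials[:n_top]
        # Scale the step by the spread of the rewards that were kept.
        sigma = statistics.pstdev([r for rp, rm, _ in top for r in (rp, rm)])
        if sigma < 1e-8:
            continue  # all rewards equal, so this batch carries no information
        for i in range(len(theta)):
            direction = sum((rp - rm) * d[i] for rp, rm, d in top)
            theta[i] += step / (n_top * sigma) * direction
    return theta


if __name__ == "__main__":
    trained = ars()
    print("reward before:", rollout([0.0, 0.0]))
    print("reward after: ", rollout(trained))
```

Swapping `rollout` for a function that runs the real simulation and sums its rewards and costs leaves the search loop itself unchanged.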


A basic random-sampling algorithm is used to find the parameters of a periodic gait for a quadruped.
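
A random-sampling search of this kind can be sketched in a few lines. The example below is a plain-Python illustration under stated assumptions: the gait is parameterized by an amplitude, a frequency, and a phase offset, and `gait_score` is a hypothetical closed-form objective standing in for a simulated rollout of the quadruped. Neither the parameterization nor the bounds come from the original example.

```python
import math
import random


def gait_score(params):
    """Toy objective standing in for a simulated gait rollout (hypothetical)."""
    amplitude, frequency, phase = params
    stride_rate = amplitude * frequency
    # Coordination term: 1 when legs are half a cycle apart, 0 when in phase.
    coordination = 0.5 * (1.0 - math.cos(phase))
    # Reward forward progress, penalize effort quadratically.
    return stride_rate * coordination - 2.0 * stride_rate ** 2


def random_search(score, bounds, n_samples=200, seed=0):
    """Sample parameters uniformly within bounds and keep the best-scoring set."""
    rng = random.Random(seed)
    best_params, best_score = None, -math.inf
    for _ in range(n_samples):
        params = [rng.uniform(lo, hi) for lo, hi in bounds]
        value = score(params)
        if value > best_score:
            best_params, best_score = params, value
    return best_params, best_score


if __name__ == "__main__":
    # Assumed bounds: amplitude, frequency, phase offset in radians.
    bounds = [(0.0, 0.3), (0.5, 4.0), (0.0, 2.0 * math.pi)]
    params, value = random_search(gait_score, bounds)
    print("best parameters:", params)
    print("best score:", value)
```

Because the search only keeps the best sample seen so far, it needs no gradients and works with any objective that can be evaluated by running the gait.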


We have modified the cartpole example in the ReinforcementLearning package to use Dojo's dynamics. This allows us to combine advanced learning algorithms with accurate dynamics simulation.
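
The general pattern of replacing an environment's dynamics can be sketched as below. This is a plain-Python illustration, not the ReinforcementLearning package's or Dojo's API: the `CartPoleEnv` class, its `dynamics_step` argument, and the constants (taken from the classic cartpole benchmark) are all assumptions. The environment only handles actions, rewards, and termination, while the state update is delegated to whatever step function is passed in.

```python
import math


def euler_cartpole_step(state, force, dt):
    """Default dynamics: classic cartpole equations with explicit Euler."""
    gravity, mass_cart, mass_pole, half_length = 9.8, 1.0, 0.1, 0.5
    total_mass = mass_cart + mass_pole
    x, x_dot, theta, theta_dot = state
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + mass_pole * half_length * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (gravity * sin_t - cos_t * temp) / (
        half_length * (4.0 / 3.0 - mass_pole * cos_t ** 2 / total_mass)
    )
    x_acc = temp - mass_pole * half_length * theta_acc * cos_t / total_mass
    return (
        x + dt * x_dot,
        x_dot + dt * x_acc,
        theta + dt * theta_dot,
        theta_dot + dt * theta_acc,
    )


class CartPoleEnv:
    """Cartpole task whose dynamics are supplied by the caller (hypothetical)."""

    def __init__(self, dynamics_step, dt=0.02):
        self.dynamics_step = dynamics_step  # any (state, force, dt) -> state
        self.dt = dt
        self.state = None

    def reset(self):
        self.state = (0.0, 0.0, 0.05, 0.0)  # small initial pole angle
        return self.state

    def step(self, action):
        force = 10.0 if action == 1 else -10.0  # push right or left
        self.state = self.dynamics_step(self.state, force, self.dt)
        x, _, theta, _ = self.state
        # The episode ends when the cart leaves the track or the pole falls.
        done = abs(x) > 2.4 or abs(theta) > 0.2095
        return self.state, 1.0, done


if __name__ == "__main__":
    env = CartPoleEnv(euler_cartpole_step)
    env.reset()
    print(env.step(1))
```

A learning algorithm written against `reset` and `step` does not need to change when `euler_cartpole_step` is replaced by a wrapper around a more accurate simulator.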