Using OpenAI Gym to train an open-source 3D printed robot

Category: IT Technology · Published: 4 years ago


Rex: an open-source domestic robot

The goal of this project is to train an open-source 3D printed quadruped robot, exploring Reinforcement Learning and OpenAI Gym. The aim is to let the robot learn domestic and generic tasks in simulation and then successfully transfer the knowledge (Control Policies) to the real robot without any further manual tuning.

This project is mostly inspired by the incredible work done by Boston Dynamics.

Rex-gym: OpenAI Gym environments and tools

This repository contains the different OpenAI Gym Environments used to train Rex, the Rex URDF model, the learning agent, and some scripts to start a training session and visualise the learned Control Policies.

Installation

Create a Python 3.7 virtual environment, e.g. using Anaconda:

conda create -n rex python=3.7 anaconda
conda activate rex

PyPI package

Install the public rex-gym package:

pip install rex_gym

Install from source

Alternatively, clone this repository and run from the root of the project:

pip install .

Run a pre-trained agent

To start a pre-trained agent:

python -m rex_gym.playground.policy_player --env ENV-NAME-HERE

Environment	Flag
Run	rex_galloping
Walk	rex_walk
Turn	rex_turn
Stand up	rex_standup

Run single training simulation

To start a training simulation test (agents=1, render=True):

python -m rex_gym.playground.training_player --config rex_reactive --logdir YOUR_LOG_DIR_PATH

Where YOUR_LOG_DIR_PATH is the output path.

Set the Environment with the --config flag:

Environment	Flag
Run	rex_galloping
Walk	rex_walk
Turn	rex_turn
Stand up	rex_standup

Start a new batch training simulation

To start a new batch training session:

python -m rex_gym.agents.scripts.train --config rex_reactive --logdir YOUR_LOG_DIR_PATH

Where YOUR_LOG_DIR_PATH is the output path.

Set the Environment with the --config flag:

Environment	Flag
Run	rex_galloping
Walk	rex_walk
Turn	rex_turn
Stand up	rex_standup

PPO Agent configuration

You may want to edit the PPO agent's default configuration, especially the number of parallel agents launched during the simulation.

Edit the num_agents variable in the agents/scripts/configs.py script:

def default():
    """Default configuration for PPO."""
    # General
    ...
    num_agents = 20

Editing this file requires installing rex_gym from source. With this configuration, 20 agents (threads) are launched in parallel to train your model.
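As a rough illustration of the idea, the sketch below shows how a configuration function of this shape could be overridden for a single-agent test run. It is a minimal sketch, not the project's code: the `return locals()` line is an assumption about how the configuration exposes its values, and `with_overrides` is a hypothetical helper.

```python
def default():
    """Stand-in for the default PPO configuration (only one field shown)."""
    num_agents = 20
    # Assumption: the config function exposes its local variables as a dict.
    return locals()


def with_overrides(base_config, **overrides):
    """Hypothetical helper: copy a config and replace selected fields."""
    config = dict(base_config())
    unknown = set(overrides) - set(config)
    if unknown:
        # Refuse silently misspelled field names.
        raise KeyError(f"unknown config fields: {sorted(unknown)}")
    config.update(overrides)
    return config


# A single-agent variant, e.g. for a quick rendered test run.
single_agent = with_overrides(default, num_agents=1)
print(single_agent["num_agents"])
```

Rejecting unknown field names keeps a typo such as `num_agent` from being ignored without notice.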

Robot platform

The robot used for this experiment is the Spotmicro made by Deok-yeon Kim.


I've printed the components using a Creality Ender3 3D printer, with PLA and TPU+.

The idea is to extend the robot by adding components such as a robotic arm on top of the rack and a LiDAR sensor.

Simulation model

Rex is a 12-joint robot with 3 motors (Shoulder, Leg and Foot) for each leg. The pose signals (see /model/rex.py) set the 12 motor angles and allow Rex to stand up.
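The 4 x 3 joint layout can be pictured with a small sketch. The leg names, the motor ordering, and the pose angles below are illustrative assumptions and are not taken from /model/rex.py.

```python
# Hypothetical names and ordering, for illustration only.
LEGS = ("front_left", "front_right", "rear_left", "rear_right")
MOTORS = ("shoulder", "leg", "foot")


def motor_names():
    """Return the 12 motor names, grouped leg by leg."""
    return [f"{leg}_{motor}" for leg in LEGS for motor in MOTORS]


def pose_to_angles(pose):
    """Expand one per-leg pose {motor: angle} into 12 motor angles."""
    return [pose[motor] for _ in LEGS for motor in MOTORS]


# Placeholder angles in radians, not the values used by the project.
STAND_POSE = {"shoulder": 0.0, "leg": -0.6, "foot": 1.2}

angles = pose_to_angles(STAND_POSE)
for name, angle in zip(motor_names(), angles):
    print(f"{name}: {angle}")
```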

The robot model is imported into PyBullet using a URDF file.


Tasks

This is the list of tasks this experiment will cover:

Basic controls:
    1. Run/Walk straight on - forward/backward
    2. Turn left/right on the spot
    3. Stand up/Sit down
    4. Steer - Run/Walk
    5. Side swipe
Fall recovery
Reach a specific point in a map
Grab an object

Basic Controls: Run

Goal: how to run straight on.

Gym Environment

There are a good number of papers on quadruped locomotion, some of them with sample code. Probably the most complete collection of examples is the Minitaur folder in the Bullet3 repository. For this task, the Minitaur Reactive Environment explained in the paper Sim-to-Real: Learning Agile Locomotion For Quadruped Robots is a great example.

Galloping gait - from scratch

In this very first experiment, I let the system learn from scratch, giving the feedback component large output bounds of [-0.6, 0.6] radians. The leg model (see galloping_env.py) forces leg and foot movements (in the positive or negative direction, depending on the leg), influencing the learning score and time. In this first version, the leg model holds the Shoulder motors in the start position (0 degrees).
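The combination of a leg model and a bounded feedback component can be sketched as follows for one leg. This is a minimal sketch and not the code in galloping_env.py: the sinusoidal `open_loop_signal`, its amplitude, and the function names are assumptions. Only the [-0.6, 0.6] bound and the fixed Shoulder come from the description above.

```python
import math

FEEDBACK_BOUND = 0.6  # radians, the bound given in the text


def clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def open_loop_signal(t):
    """Hypothetical leg model: [shoulder, leg, foot] angles at time t."""
    leg = 0.3 * math.sin(2 * math.pi * t)
    foot = 0.3 * math.sin(2 * math.pi * t + math.pi / 2)
    return [0.0, leg, foot]


def motor_targets(t, feedback):
    """Add the bounded policy feedback to the leg model signal."""
    bounded = [clip(f, -FEEDBACK_BOUND, FEEDBACK_BOUND) for f in feedback]
    targets = [s + f for s, f in zip(open_loop_signal(t), bounded)]
    targets[0] = 0.0  # Shoulder is held in the start position
    return targets


# Feedback outside the bounds is clipped before being applied.
print(motor_targets(0.0, [1.0, 2.0, -2.0]))
```

Clipping the policy output before adding it to the signal is what keeps the learned correction inside the stated bounds, however large the raw action is.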

As in the Minitaur example, I'm using Proximal Policy Optimization (PPO).


The emerged galloping gait shows the chassis tilted up and some unusual positions and movements during locomotion, especially when starting from the initial pose. The leg model needs improvements.

Galloping gait - bounded feedback

To improve the gait, in this second simulation I worked on the leg model.

I set bounds for both Leg and Foot angles, keeping the Shoulder in the initial position.


The emerged gait now looks cleaner.

Galloping gait - balanced feedback

Another test was made using a balanced feedback.

The Action Space dimension equals 4: the same angle is assigned to both front legs and a different one to the rear legs. The very same was done for the foot angles.

The simulation score improves massively (about 10x), as does the learning time, while the emerged gait is very similar to that of the bounded feedback model. With this model, the TensorFlow score after ~500k attempts matches the score reached after ~4M attempts with any of the other models.
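The balanced feedback described above maps a 4-dimensional action onto the 12 motors. A minimal sketch of such a mapping follows. The action ordering, the leg ordering, and the Shoulder being fixed at 0 are assumptions made for illustration, and `expand_balanced_action` is a hypothetical name.

```python
def expand_balanced_action(action):
    """Expand [front_leg, rear_leg, front_foot, rear_foot] to 12 angles.

    Assumed motor order per leg: shoulder, leg, foot.
    Assumed leg order: front left, front right, rear left, rear right.
    """
    if len(action) != 4:
        raise ValueError("balanced feedback expects exactly 4 values")
    front_leg, rear_leg, front_foot, rear_foot = action
    front = [0.0, front_leg, front_foot]  # shared by both front legs
    rear = [0.0, rear_leg, rear_foot]     # shared by both rear legs
    return front + front + rear + rear


print(expand_balanced_action([0.1, 0.2, 0.3, 0.4]))
```

Sharing angles between the paired legs shrinks the space the policy has to explore, which is consistent with the faster learning reported above.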

Basic Controls: Walk

Goal: how to walk straight on.

Gym Environment

Starting from the Minitaur Alternating Leg environment, I've used a sinusoidal signal as the leg_model, alternating Rex's legs during locomotion. The feedback component has small bounds of [-0.1, 0.1], as in the original script.
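A sinusoidal signal that alternates the legs, combined with a small bounded feedback, might look like the sketch below. The amplitude, the period, and the pairing of diagonal legs are assumptions. Only the sinusoidal shape and the [-0.1, 0.1] bounds come from the text.

```python
import math

FEEDBACK_BOUND = 0.1  # bound given in the text


def alternating_leg_signal(t, amplitude=0.3, period=1.0):
    """Hypothetical leg_model: two leg groups moving in anti-phase."""
    phase = 2 * math.pi * t / period
    group_a = amplitude * math.sin(phase)
    group_b = amplitude * math.sin(phase + math.pi)  # half a period later
    # Assumption: diagonal legs move together.
    return {
        "front_left": group_a,
        "rear_right": group_a,
        "front_right": group_b,
        "rear_left": group_b,
    }


def walk_targets(t, feedback):
    """Add per-leg feedback, clipped to the bounds, to the signal."""
    targets = {}
    for leg, angle in alternating_leg_signal(t).items():
        correction = feedback.get(leg, 0.0)
        correction = max(-FEEDBACK_BOUND, min(FEEDBACK_BOUND, correction))
        targets[leg] = angle + correction
    return targets


print(walk_targets(0.25, {"front_left": 0.5}))
```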


Basic Controls: Turn left/right

Goal: How to reach a certain orientation turning on the spot.

Gym Environment

In this environment the leg_model applies a 'steer-on-the-spot' gait, allowing Rex to move towards a specific orientation. The reward function takes the chassis position/orientation and compares it with a fixed target position/orientation. When this difference is less than 0.1 radians, the leg_model is set to the stand-up position. To make the learning more robust, Rex's starting orientation is chosen randomly at every 'Reset' step.
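The threshold logic can be sketched for the orientation part alone. This is a simplified sketch under assumptions: it considers only the yaw angle, ignores the position term, and every function name and returned label is hypothetical. The 0.1 radian tolerance and the random starting orientation come from the description above.

```python
import math
import random

TOLERANCE = 0.1  # radians, from the text


def angle_difference(current, target):
    """Signed difference target - current, wrapped to [-pi, pi)."""
    return (target - current + math.pi) % (2 * math.pi) - math.pi


def select_leg_model(current_yaw, target_yaw):
    """Pick the gait: keep steering until the target is reached."""
    if abs(angle_difference(current_yaw, target_yaw)) < TOLERANCE:
        return "stand"
    return "steer_on_the_spot"


def random_start_yaw(rng=random):
    """Draw a random starting orientation, as done at every reset."""
    return rng.uniform(-math.pi, math.pi)


print(select_leg_model(0.0, 1.0))
print(select_leg_model(3.1, -3.1))
```

Wrapping the difference matters near the +pi/-pi boundary: yaw values of 3.1 and -3.1 are only about 0.08 radians apart, so the second call already counts as on target.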


Basic Controls: Stand up

Goal: Reach the base standing position starting from the rest position

Gym Environment

This environment introduces the rest_postion, ideally the position Rex assumes when in stand-by. The leg_model is the stand_low position, while the signal function applies a 'brake', forcing Rex to assume a halfway position before completing the movement.


Credits

Sim-to-Real: Learning Agile Locomotion For Quadruped Robots and all the related papers. Google Brain, Google X, Google DeepMind - Minitaur Ghost Robotics.

Deok-yeon Kim, creator of SpotMini.

The great work in rendering the robot platform done by the SpotMicroAI community.
