Unity-ML Agents: The Mayan Adventure



Unity-ML Agents Course

Train an agent to get the golden statue in this dangerous environment.

This article is the third chapter of a new free course on Deep Reinforcement Learning with Unity, where we'll create agents that learn to play video games using the Unity game engine. Check the syllabus here.

If you have never studied Deep Reinforcement Learning before, you should first check the free course Deep Reinforcement Learning with Tensorflow.

In the last two articles, you learned to use ML-Agents and trained two agents. The first was able to jump over walls , and the second learned to destroy a pyramid to get the golden brick . It’s time to do something harder.

When I was thinking about creating a custom environment, I remembered the famous scene in Indiana Jones, where Indy needs to get the golden statue and avoid a lot of traps to survive.

I was thinking: could my agent be as good as him? Spoiler alert: it can, as you can see in this replay video!

That's why, during the last two weeks, I developed an open-source reinforcement learning environment in Unity called the Mayan Adventure: a dangerous, modular environment full of traps. And, in order to train the agent, I used a learning strategy called curriculum learning.

So today, you'll learn about curriculum learning, and you'll train your agent to reach the golden statue and beat the game.

So let’s get started!

Introducing the Mayan Adventure

Hello Indie

As I said, the Mayan Adventure is an open-source reinforcement learning environment for ML-Agents.

In this environment, you train your agent (Indie) to beat every dangerous trap and reach the golden statue.

During the learning process, our agent starts by learning to avoid falling off the platforms and to reach the golden statue. As it gets better, we add difficulty: thanks to two pressure buttons, your agent can transform itself into rock or wood.

It will need to transform itself into wood to cross the wooden bridge; otherwise, the bridge will collapse.

Then into rock to cross the fire level.

The reward system is:

[Figure: the reward system]

In terms of observation, our agent does not use computer vision but ray perception sensors. Think of them as lasers that detect whether they pass through an object.

We used two of them: the first detects safe platforms, the rock button, the wood button, fire, the wooden bridge, the goal, and the gem. The second ray also detects the void (in order to train our agent to avoid falling off the environment).

Moreover, we added 3 vector observations: a boolean indicating whether the agent is rock, and the agent's x and z velocities.

The action space is discrete with:

[Figure: the discrete action space]

Under the hood: how was it made?

The Mayan Adventure started with this prototype version.

[Figure: the prototype]

But it was not really appealing, so to create the final environment I used several free packages:

  • 3D Game Kit: A fantastic environment created by Unity; I used its rock platforms, buttons, vegetation, and pillar elements.
  • Unity Particle Pack : I used it for the fire system.
  • Creator kit: puzzle : I used it only for the win particle fireworks.
  • The other elements (the wooden bridge and its animation, the rock heads, the golden statue, the fedora, etc.) were made in Blender.

I designed the project to be as modular as possible. This means you will be able to create new levels and new obstacles. Moreover, this is still a work in progress: corrections and improvements will be made, and new levels will be added.

For the code side of the environment, I followed the very good course made by Immersive Limit on Unity Learn.

The question you may ask now is: how are we going to train this agent?

Let’s train Indie

To train Indie, we’re going to use PPO with a Curriculum Learning strategy. If you don’t know what PPO is, you can check my article .

What is Curriculum Learning?

Curriculum learning is a reinforcement learning technique to better train an agent when it needs to learn a complicated task.

Suppose you're six years old and I tell you that we're going to start learning multivariable calculus. You'll be overwhelmed and unable to do it: it's far too hard a starting point, and you'll fail.

A better strategy is to learn simple mathematics first and add complexity as you master the basics, so that in the end you're able to complete the advanced course.

So you start with an arithmetic lesson; when you're good at it, you continue with algebra, then complex algebra, then calculus, and finally multivariable calculus. Thanks to this strategy, you'll be able to succeed in multivariable calculus.

This is the same strategy we're going to use to train our agent. Instead of giving it the whole environment at once, we train it by adding a level of difficulty each time it gets better.
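To make the mechanism concrete, here is a minimal sketch in Python (not ML-Agents' actual implementation) of the switching rule: the lesson advances when the average cumulative reward clears a threshold and a minimum number of episodes has been played. All numbers are illustrative.

```python
# Minimal sketch of curriculum switching (not ML-Agents' real code):
# advance to the next lesson once the average cumulative reward clears the
# current threshold AND a minimum number of episodes has been played.
class Curriculum:
    def __init__(self, thresholds, min_lesson_length):
        self.thresholds = thresholds          # reward needed to leave each lesson
        self.min_lesson_length = min_lesson_length
        self.lesson = 0                       # current level
        self.episodes_in_lesson = 0

    def report_episode(self, mean_cumulative_reward):
        self.episodes_in_lesson += 1
        finished = self.lesson >= len(self.thresholds)
        if (not finished
                and self.episodes_in_lesson >= self.min_lesson_length
                and mean_cumulative_reward >= self.thresholds[self.lesson]):
            self.lesson += 1                  # unlock the next level
            self.episodes_in_lesson = 0       # restart the episode counter
        return self.lesson

# Illustrative values: 0.1 to leave level 0, then assumed thresholds.
curriculum = Curriculum(thresholds=[0.1, 0.3, 0.5], min_lesson_length=100)
```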

In order to do that in ML-Agents, we need to specify our curriculum: a YAML config file that specifies when to change an environment parameter (in our case, increase the level) based on some metric (the average cumulative reward). Think of this curriculum as a syllabus that our agent needs to follow.

The curriculum goes like this:

  • In the first level, the agent needs to learn to get the golden statue and avoid falling off the platform.
  • Then, in the second level, the agent needs to interact with the physics buttons and turn itself into wood to cross the big wooden bridge.
  • In the third level, the agent needs to transform itself into rock in order to cross the fire.
  • Finally, in the last level, the agent needs to learn to transform itself into wood and cross a slimmer bridge without falling off it.

Let’s get this golden statue!

You need to download the project from the GitHub repository and follow the installation process described in the docs.

If you don't want to train the agent yourself, the trained models are in the Unity project, in the folder "The Mayan Adventure Saved Models".

Now that we understand how the Mayan Adventure environment and curriculum learning work, let's train our agent to beat the game.

The code is divided into 2 main scripts:

  • MayanAdventureArea.cs: controls level generation and the spawning of the agent and the goal.
  • MayanAdventureAgent.cs: controls the agent's movement, handles events (e.g., what happens when you're rock and on the wooden bridge), and implements the reward system.

First, we need to define our curriculum.

To do that, go to the config/curricula folder inside your ML-Agents folder, create a new folder called mayanAdventure, and inside it create your MayanAdventureLearning.yaml file.

[Figure: the MayanAdventureLearning.yaml file]

  • The key metric we use to measure the progress of our agent is the reward.
  • Then, in the thresholds section, we define the average cumulative reward required to move to the next level. For instance, to go to level 1, our agent needs an average cumulative reward of 0.1.
  • min_lesson_length specifies the minimum number of episodes an agent must complete before the level can change. It avoids the risk that one lucky episode changes the level too fast.
  • Finally, we define the parameter, which here is the level number.
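Putting the bullets above together, the curricula file looks roughly like this. Only the reward measure and the 0.1 threshold come from this article; the remaining thresholds, the episode count, and the level list are illustrative assumptions, following the ML-Agents 0.x curricula format:

```yaml
# MayanAdventureLearning.yaml (sketch: only the 0.1 threshold is from the
# article; the other values are illustrative assumptions)
MayanAdventureLearning:
  measure: reward               # metric used to judge progress
  thresholds: [0.1, 0.3, 0.5]   # average reward needed to leave each lesson
  min_lesson_length: 100        # minimum episodes before a level change
  signal_smoothing: true
  parameters:
    level: [0, 1, 2, 3]         # the environment parameter we increase
```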

Now that we've defined our curriculum, we can configure our agent. Our agent is an instance of a prefab, so to modify all instances at once, we'll modify the prefab directly.

In the MayanAdventureArea prefab, you need to check that Training is true. We created this bool to differentiate the training phase from the testing phase, where we added events such as activating the winning panel, playing the wooden bridge destruction animation, and displaying the fireworks when you win.

Then, in the prefab, go to the Agent. First, in Behavior Parameters, remove the model if there is one.

After that, you need to define the number of stacked observation vectors. This depends on whether you use a recurrent memory (defined in the trainer config): if not, you should stack 4 to 6 frames; if yes, only 3.

It’s important that both Vector Observations and Ray Perception observations have the same stack number.
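To see what stacking does, here is a small Python sketch (ML-Agents does this internally; the sizes here are illustrative, not the project's real ones): the policy receives the last N observation vectors concatenated, which gives it a short history without a recurrent memory.

```python
from collections import deque

# Sketch of observation stacking: keep the last N observation vectors and
# feed them to the policy as one flat vector. Sizes are illustrative.
class ObservationStacker:
    def __init__(self, obs_size, num_stacked):
        # Start with zero-filled frames so the output size is constant.
        self.frames = deque(
            [[0.0] * obs_size for _ in range(num_stacked)],
            maxlen=num_stacked,
        )

    def add(self, obs):
        self.frames.append(list(obs))
        # Oldest frame first, newest last, flattened into one vector.
        return [x for frame in self.frames for x in frame]

stacker = ObservationStacker(obs_size=3, num_stacked=4)
stacked = stacker.add([0.0, 1.0, 0.5])  # a 4 x 3 = 12-value vector
```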

Now we can define the hyperparameters. This is the configuration I wrote that gave me the best results.

[Figure: the trainer hyperparameters]

Finally, don't forget to deactivate the Main Camera, which is there for replay purposes only.

We’re now ready to train. You need to open your terminal, go where ml-agents-master is and type this:

mlagents-learn ./config/trainer_config.yaml --curriculum=config/curricula/mayanAdventure/MayanAdventureLearning.yaml --run-id TheMayanAdventure_beta_train --train

Here, we defined:

  • Where the trainer_config is: ./config/trainer_config.yaml
  • Our curricula: --curriculum=config/curricula/mayanAdventure/MayanAdventureLearning.yaml
  • The id of this training: --run-id TheMayanAdventure_beta_train
  • And don't forget the --train flag.

It will then ask you to run the Unity scene: press the Play button at the top of the Editor.

You can monitor your training by launching Tensorboard using this command:

tensorboard --logdir=summaries

My results

Before obtaining good results, I ran about 20 training sessions to find a good set of hyperparameters.

I provide the two best saved models: the first with a recurrent memory, the other without. Training took about 2h10 with 30 parallel environments, on CPU only, on a MacBook Pro with an Intel Core i5.

[Figure: cumulative reward curves of the two trainings]

We see that the two agents reach quite similar results, with the run without memory performing slightly better. This is the one I used for the video recording.

Replay

Now that you've trained the agent, you need to move the saved model files from ml-agents-master/models to the "The Mayan Adventure Saved Models" folder of the Unity project.

Then, you need to deactivate all the instances of the MayanAdventureArea except MayanAdventureArea (1).

In fact, as in classical deep reinforcement learning, where we launch multiple instances of a game (for instance, 128 parallel environments), we did the same here by copying and pasting the area in order to gather more varied states. But we only need one instance for the replay.

And don’t forget to activate the Main Camera.

Now, you need to go back to the MayanAdventureArea prefab and deselect Training.

Finally, in the Agent's Behavior Parameters, drag the model file onto the Model placeholder.

Then, press the Play button at the top of the Editor, and voila!

If you want to record your results, go to Window > General > Recorder > Recorder Window and click Start Recording with these parameters:

[Figure: recorder settings]

The Next Steps

The Mayan Adventure is a work-in-progress project: corrections and improvements will be made, and new levels will be added. Here are some of the next steps we're going to take.

Ray casts may not be sufficient: we need to give the agent vision

What we discovered during training is that our agent is good, but some challenges will definitely require vision.

The "problem" with vision is that it increases the state size dramatically. It means that the next version will only be trained on GPU instead of CPU.

The idea could be to have an orthographic upper view of the environment as input like in Unity’s Gridworld example.

[Figure: Unity's GridWorld example with visual observations]

Source: ML-Agents Documentation

New levels and timed events on the way

Because the Mayan Environment is an RL research environment, we want to add more complex levels to train our agents to learn long-term strategies.

Consequently, in addition to working on the vision version, we're currently working on adding more complex levels, such as a rolling-ball trap.

But also on some timed events, such as turning the fire level on and off every 3 seconds.

Adding randomness in the generation of the level

Currently, our generator always outputs the same order for the levels. We want to improve that by adding some randomness in the level generation process.

That's all for today!

You've just trained Indie to beat all the traps and reach the golden statue. And you've also learned about curriculum learning. That's awesome!

Now that we have good results, we can try some experiments. Remember that the best way to learn is to be active by experimenting, so try to make some hypotheses and verify them.

See you next time!

If you have any thoughts, comments, questions, feel free to comment below or send me an email: hello@simoninithomas.com, or tweet me @ThomasSimonini .

Keep learning, stay awesome!

