Fast implementation of DeepMind's AlphaZero algorithm in Julia

栏目: IT技术 · 发布时间: 5年前

内容简介:This package provides aBeyond its much publicized success in attaining superhuman level at games such as Chess and Go, DeepMind's AlphaZero algorithm illustrates a more general methodology of combining learning and search to explore large combinatorial spa

AlphaZero.jl

This package provides a generic , simple and fast implementation of Deepmind's AlphaZero algorithm:

  • The core algorithm is only 2,000 lines of pure, hackable Julia code.
  • Generic interfaces make it easy to add support for new games or new learning frameworks.
  • Being between one and two orders of magnitude faster than competing alternatives written in Python, this implementation enables to solve nontrivial games on a standard desktop computer with a GPU.

Why should I care about AlphaZero?

Beyond its much publicized success in attaining superhuman level at games such as Chess and Go, DeepMind's AlphaZero algorithm illustrates a more general methodology of combining learning and search to explore large combinatorial spaces effectively. We believe that this methodology can have exciting applications in many different research areas.

Why should I care about this implementation?

Because AlphaZero is resource-hungry, successful open-source implementations (such as Leela Zero ) are written in low-level languages (such as C++) and optimized for highly distributed computing environments. This makes them hardly accessible for students, researchers and hackers.

The motivation for this project is to provide an implementation of AlphaZero that is simple enough to be widely accessible, while also being sufficiently powerful and fast to enable meaningful experiments on limited computing resources. We found the Julia language to be instrumental in achieving this goal.

Training a Connect Four Agent

To download AlphaZero.jl and start training a Connect Four agent, just run:

git clone https://github.com/jonathan-laurent/AlphaZero.jl.git
cd AlphaZero.jl
julia --project -e "import Pkg; Pkg.instantiate()"
julia --project --color=yes scripts/alphazero.jl --game connect-four train

Fast implementation of DeepMind's AlphaZero algorithm in Julia Fast implementation of DeepMind's AlphaZero algorithm in Julia

Each training iteration takes between one and two hours on a desktop computer with an Intel Core i5 9600K processor and an 8GB Nvidia RTX 2070 GPU. We plot below the evolution of the win rate of our AlphaZero agent against two baselines (a vanilla MCTS baseline and a minmax agent that plans at depth 5 using a handcrafted heuristic):

Fast implementation of DeepMind's AlphaZero algorithm in Julia

Note that the AlphaZero agent is not exposed to the baselines during training and learns purely from self-play, without any form of supervision or prior knowledge.

We also evaluate the performances of the neural network alone against the same baselines. Instead of plugging it into MCTS, we play the action that is assigned the highest prior probability at each state:

Fast implementation of DeepMind's AlphaZero algorithm in Julia

Unsurprisingly, the network alone is initially unable to win a single game. However, it ends up significantly stronger than the minmax baseline despite not being able to perform any search.

For more information on training a Connect Four agent using AlphaZero.jl, see our full tutorial .

Resources

Contributing

Contributions to AlphaZero.jl are most welcome. Many contribution ideas are available in our contribution guide . Please do not hesitate to open a Github issue to share any idea, feedback or suggestion.

Acknowledgements

This material is based upon work supported by the United States Air Force and DARPA under Contract No. FA8750-18-C-0092. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Air Force and DARPA.


以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

从问题到程序-用Python学编程和计算

从问题到程序-用Python学编程和计算

裘宗燕 / 机械工业出版社 / 2017-6-1

本书是以Python为编程语言、面向计算机科学教育中的程序设计基础课程与编程初学者的入门教材和自学读物。本书以Python为工具,详细讨论了与编程有关的各方面问题,介绍了从初级到高级的许多重要编程技术。本书特别强调编程中的分析和思考、问题的严格化和逐步分解、语言结构的正确选择、程序结构的良好组织,以及程序的正确和安全。书中通过大量实例及其开发过程,展示了好程序的特征和正确的编程工作方法。此外,书中......一起来看看 《从问题到程序-用Python学编程和计算》 这本书的介绍吧!

JSON 在线解析
JSON 在线解析

在线 JSON 格式化工具

URL 编码/解码
URL 编码/解码

URL 编码/解码

SHA 加密
SHA 加密

SHA 加密工具