NeurIPS 2018 Reinforcement Learning Papers Worth Reading

Category: Databases · Published: 7 years ago

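The papers in this list are mainly about deep reinforcement learning and RL/AI; I hope you find it helpful. The NeurIPS 2018 reinforcement learning papers are listed below, sorted alphabetically by the first author's last name.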

  1. Brandon Amos, Ivan Jimenez, Jacob Sacks, Byron Boots, and J. Zico Kolter.

    Differentiable MPC for end-to-end planning and control.

  2. Yusuf Aytar, Tobias Pfaff, David Budden, Thomas Paine, Ziyu Wang, and Nando de Freitas.

    Playing hard exploration games by watching YouTube.

  3. Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, and Honglak Lee.

    Sample-efficient reinforcement learning with stochastic ensemble value expansion.

  4. Kurtland Chua, Roberto Calandra, Rowan McAllister, and Sergey Levine.

    Data-efficient model-based reinforcement learning with deep probabilistic dynamics models.

  5. Filipe de Avila Belbute-Peres, Kevin Smith, Kelsey Allen, Josh Tenenbaum, and J. Zico Kolter.

    End-to-end differentiable physics for learning and control.

  6. Amir massoud Farahmand.

    Iterative value-aware model learning.

  7. Justin Fu, Sergey Levine, Dibya Ghosh, Larry Yang, and Avi Singh.

    An event-based framework for task specification and control.

  8. Vikash Goel, Jameson Weng, and Pascal Poupart.

    Unsupervised video object segmentation for deep reinforcement learning.

  9. Abhishek Gupta, Russell Mendonca, YuXuan Liu, Pieter Abbeel, and Sergey Levine.

    Meta-reinforcement learning of structured exploration strategies.

  10. David Ha and Jürgen Schmidhuber.

    Recurrent world models facilitate policy evolution.

  11. Nick Haber, Damian Mrowca, Stephanie Wang, Li Fei-Fei, and Daniel Yamins.

    Learning to play with intrinsically-motivated, self-aware agents.

  12. Rein Houthooft, Yuhua Chen, Phillip Isola, Bradly Stadie, Filip Wolski, Jonathan Ho, and Pieter Abbeel.

    Evolved policy gradients.

  13. Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Lianhui Qin, Xiaodan Liang, Haoye Dong, and Eric Xing.

    Deep generative models with learnable knowledge constraints.

  14. Jiexi Huang, Fa Wu, Doina Precup, and Yang Cai.

    Learning safe policies with expert guidance.

  15. Kwang-Sung Jun, Lihong Li, Yuzhe Ma, and Xiaojin Zhu.

    Adversarial attacks on stochastic bandits.

  16. Raksha Kumaraswamy, Matthew Schlegel, Adam White, and Martha White.

    Context-dependent upper-confidence bounds for directed exploration.

  17. Isaac Lage, Andrew Ross, Samuel J Gershman, Been Kim, and Finale Doshi-Velez.

    Human-in-the-loop interpretability prior.

  18. Marc Lanctot, Sriram Srinivasan, Vinicius Zambaldi, Julien Perolat, Karl Tuyls, Remi Munos, and Michael Bowling.

    Actor-critic policy optimization in partially observable multiagent environments.

  19. Nevena Lazic, Craig Boutilier, Tyler Lu, Eehern Wong, Binz Roy, MK Ryu, and Greg Imwalle.

    Data center cooling using model-predictive control.

  20. Jan Leike, Borja Ibarz, Dario Amodei, Geoffrey Irving, and Shane Legg.

    Reward learning from human preferences and demonstrations in Atari.

  21. Shuang Li, Shuai Xiao, Shixiang Zhu, Nan Du, Yao Xie, and Le Song.

    Learning temporal point processes via reinforcement learning.

  22. Yuan Li, Xiaodan Liang, Zhiting Hu, and Eric Xing.

    Hybrid retrieval-generation reinforced agent for medical image report generation.

  23. Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc V Le, and Ni Lao.

    Memory augmented policy optimization for program synthesis with generalization.

  24. Qiang Liu, Lihong Li, Ziyang Tang, and Denny Zhou.

    Breaking the curse of horizon: Infinite-horizon off-policy estimation.

  25. Yao Liu, Omer Gottesman, Aniruddh Raghu, Matthieu Komorowski, Aldo A Faisal, Finale Doshi-Velez, and Emma Brunskill.

    Representation balancing MDPs for off-policy policy evaluation.

  26. Tyler Lu, Craig Boutilier, and Dale Schuurmans.

    Non-delusional Q-learning and value-iteration.

  27. Mario Lucic, Karol Kurach, Marcin Michalski, Sylvain Gelly, and Olivier Bousquet.

    Are GANs created equal? A large-scale study.

  28. David Alvarez Melis and Tommi Jaakkola.

    Towards robust interpretability with self-explaining neural networks.

  29. Robin Manhaeve, Sebastijan Dumancic, Angelika Kimmig, Thomas Demeester, and Luc De Raedt.

    DeepProbLog: Neural probabilistic logic programming.

  30. Horia Mania, Aurelia Guy, and Benjamin Recht.

    Simple random search of static linear policies is competitive for reinforcement learning.

  31. Damian Mrowca, Chengxu Zhuang, Elias Wang, Nick Haber, Li Fei-Fei, Josh Tenenbaum, and Daniel Yamins.

    A flexible neural representation for physics prediction.

  32. Ofir Nachum, Shixiang Gu, Honglak Lee, and Sergey Levine.

    Data-efficient hierarchical reinforcement learning.

  33. Ashvin Nair, Vitchyr Pong, Shikhar Bahl, Sergey Levine, Steven Lin, and Murtaza Dalal.

    Visual goal-conditioned reinforcement learning by representation learning.

  34. Matthew O’Kelly, Aman Sinha, Hongseok Namkoong, Russ Tedrake, and John C Duchi.

    Scalable end-to-end autonomous vehicle testing via rare-event simulation.

  35. Ian Osband, John S Aslanides, and Albin Cassirer.

    Randomized prior functions for deep reinforcement learning.

  36. Matthew Riemer, Miao Liu, and Gerald Tesauro.

    Learning abstract options.

  37. Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, and Tim Lillicrap.

    Relational recurrent neural networks.

  38. Shibani Santurkar, Dimitris Tsipras, Andrew Ilyas, and Aleksander Madry.

    How does batch normalization help optimization? (no, it is not about internal covariate shift).

  39. Ozan Sener and Vladlen Koltun.

    Multi-task learning as multi-objective optimization.

  40. Jiaming Song, Hongyu Ren, Dorsa Sadigh, and Stefano Ermon.

    Multi-agent generative adversarial imitation learning.

  41. Wen Sun, Geoffrey Gordon, Byron Boots, and J. Bagnell.

    Dual policy iteration.

  42. Aviv Tamar, Pieter Abbeel, Ge Yang, Thanard Kurutach, and Stuart Russell.

    Learning plannable representations with causal InfoGAN.

  43. Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, and Phil Blunsom.

    Neural arithmetic logic units.

  44. Tongzhou Wang, Yi Wu, David Moore, and Stuart Russell.

    Meta-learning MCMC proposals.

  45. Catherine Wong, Neil Houlsby, Yifeng Lu, and Andrea Gesmundo.

    Transfer learning with neural AutoML.

  46. Kelvin Xu, Chelsea Finn, and Sergey Levine.

    Uncertainty-aware few-shot learning with probabilistic model-agnostic meta-learning.

  47. Zhongwen Xu, Hado van Hasselt, and David Silver.

    Meta-gradient reinforcement learning.

  48. Kexin Yi, Jiajun Wu, Chuang Gan, Antonio Torralba, Pushmeet Kohli, and Josh Tenenbaum.

    Neural-Symbolic VQA: Disentangling reasoning from vision and language understanding.

  49. Lisa Zhang, Gregory Rosenblatt, Ethan Fetaya, Renjie Liao, William Byrd, Matthew Might, Raquel Urtasun, and Richard Zemel.

    Neural guided constraint logic programming for program synthesis.

  50. Yu Zhang, Ying Wei, and Qiang Yang.

    Learning to multitask.

  51. Zeyu Zheng, Junhyuk Oh, and Satinder Singh.

    On learning intrinsic rewards for policy gradient methods.

Source: https://medium.com/@yuxili/nips-2018-rl-papers-to-read-5bc1edb85a28

