DeepMind found an AI learning technique also works in human brains

Category: IT Technology · Published: 4 years ago

Developments in artificial intelligence often draw inspiration from how humans think, but now AI has turned the tables to teach us about how brains learn.

Will Dabney at tech firm DeepMind in London and his colleagues have found that a recent development in machine learning called distributional reinforcement learning also provides a new explanation for how the reward pathways in the brain work. These pathways govern our response to pleasurable events and are mediated by neurons that release the brain chemical dopamine.

“Dopamine in the brain is a type of surprise signal,” says Dabney. “When things turn out better than expected, more dopamine gets released.”

It was previously thought that these dopamine neurons all responded identically. “Kind of like a choir but where everyone’s singing the exact same note,” says Dabney.

But the team found that individual dopamine neurons actually seem to vary – each is tuned to a different level of optimism or pessimism.

“They all end up signalling at different levels of surprise,” says Dabney. “More like a choir all singing different notes, harmonising together.”

The finding drew inspiration from a process known as distributional reinforcement learning, which is one of the techniques AI has used to master games such as Go and StarCraft II.

At its simplest, reinforcement learning is the idea that a reward reinforces the behaviour that led to its acquisition. It requires an understanding about how a current action leads to a future reward. For example, a dog may learn the command “sit” because it is rewarded with a treat when it does so.
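The dog example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the action names, the reward of 1 for sitting, the learning rate and the exploration rate are all hypothetical choices, not details from the study.

```python
import random

def train_dog(trials=500, learning_rate=0.1, explore=0.1, seed=0):
    """Toy reinforcement learner: the dog tracks how rewarding each action has been."""
    rng = random.Random(seed)
    actions = ["wander", "sit"]          # hypothetical action set
    value = {a: 0.0 for a in actions}    # learned estimate of reward per action
    for _ in range(trials):
        # Occasionally try a random action; otherwise pick the best-looking one.
        if rng.random() < explore:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: value[a])
        # Assumed reward scheme: a treat (1.0) only when the dog sits.
        reward = 1.0 if action == "sit" else 0.0
        # The reward nudges the estimate for the action that produced it.
        value[action] += learning_rate * (reward - value[action])
    return value

values = train_dog()
print(values)
```

After training, the estimate for "sit" should be well above the estimate for "wander", so the dog sits whenever it is not exploring.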

Previously, models of reinforcement learning in both AI and neuroscience focused on learning to predict an “average” future reward. “But this doesn’t reflect reality as we experience it,” says Dabney.

“When someone plays the lottery, for example, they expect to win or they expect to lose, but they don’t expect this halfway average outcome that doesn’t necessarily really occur,” he says.
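Dabney's lottery point can be made concrete with a small sketch. The prize of 100, the 10 per cent chance of winning and the learning rate are hypothetical numbers chosen for illustration. The learner below tracks a single running average, updated by the "surprise" (reward minus expectation) on each draw.

```python
import random

def learn_average(outcomes, probabilities, trials=20000, learning_rate=0.01, seed=0):
    """Track one number: the running average of the rewards seen so far."""
    rng = random.Random(seed)
    expected = 0.0
    for _ in range(trials):
        reward = rng.choices(outcomes, weights=probabilities)[0]
        surprise = reward - expected          # better or worse than expected?
        expected += learning_rate * surprise  # move the estimate towards the reward
    return expected

outcomes = [0.0, 100.0]        # lose, or win a hypothetical prize of 100
probabilities = [0.9, 0.1]     # assumed odds
estimate = learn_average(outcomes, probabilities)
print(estimate, estimate in outcomes)
```

The estimate settles somewhere near 10, the long-run average, even though no single ticket ever pays out that amount.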

When the future is uncertain, the possible outcomes can instead be represented as a probability distribution: some are positive, others negative. AIs that use distributional reinforcement learning algorithms are able to predict the full spectrum of possible rewards.
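One way to sketch such a learner is to keep several predictions at once, each updated with a different balance between good and bad surprises. The update rule, the reward values and the `optimism_levels` below are assumptions made for illustration; the article does not spell out the algorithm the team used.

```python
import random

def learn_distribution(outcomes, probabilities, optimism_levels,
                       trials=20000, learning_rate=0.02, seed=0):
    """Learn one prediction per unit; each unit weighs good and bad news differently."""
    rng = random.Random(seed)
    predictions = [0.0 for _ in optimism_levels]
    for _ in range(trials):
        reward = rng.choices(outcomes, weights=probabilities)[0]
        for i, optimism in enumerate(optimism_levels):
            surprise = reward - predictions[i]
            # Assumed rule: optimistic units learn more from positive surprises,
            # pessimistic units learn more from negative ones.
            scale = optimism if surprise > 0 else 1.0 - optimism
            predictions[i] += learning_rate * scale * surprise
    return predictions

optimism_levels = [0.1, 0.3, 0.5, 0.7, 0.9]   # hypothetical spread of units
predictions = learn_distribution([1.0, 5.0, 10.0], [0.3, 0.4, 0.3], optimism_levels)
for optimism, prediction in zip(optimism_levels, predictions):
    print(f"optimism {optimism:.1f} -> predicted reward {prediction:.2f}")
```

The pessimistic units settle on low predictions and the optimistic ones on high predictions, so together they trace out the spread of possible rewards. The unit with optimism 0.5 behaves like an ordinary average learner.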

To test whether the brain’s dopamine reward pathways also work via a distribution, the team recorded responses from individual dopamine neurons in mice. The mice were trained to perform a task and were given rewards of varying and unpredictable sizes.

The researchers found that different dopamine cells showed reliably different levels of surprise.

“Associating rewards to certain stimuli or actions is of critical importance for survival,” says Raul Vicente at the University of Tartu, Estonia. “The brain cannot afford to throw away any valuable information about rewards.”

“At large scale, the study is in line with the current view that to operate efficiently the brain has to represent not only the average value of a variable but how often a variable takes different values,” says Vicente. “It is a nice example of how computational algorithms can guide us in what to look for in neural responses.”

However, adds Vicente, more research is needed to demonstrate whether the results apply to other species or regions of the brain.

Journal reference: Nature, DOI: 10.1038/s41586-019-1924-6

