BLiTZ — A Bayesian Neural Network library for PyTorch

BLiTZ (Bayesian Layers in Torch Zoo) is a simple and extensible library to create Bayesian Neural Network layers on top of PyTorch.

Illustration of Bayesian regression. Source: https://ericmjl.github.io/bayesian-deep-learning-demystified/images/linreg-bayesian.png (accessed 2020-03-30)

This is a post on the usage of a library for Deep Bayesian Learning. If you are new to the theme, you may want to read one of the many posts on Medium about it, or just the Bayesian Deep Learning section of the documentation in our library repo.

As the need to quantify uncertainty over neural network predictions grows, using Bayesian Neural Network layers has become one of the most intuitive approaches, as confirmed by the rising interest in Bayesian Networks as a research field within Deep Learning.

Yet, despite PyTorch's position as a leading Deep Learning framework (for research, at least), no library lets the user introduce Bayesian Neural Network layers into their models as easily as they can use nn.Linear and nn.Conv2d, for example.

Naturally, that creates a bottleneck for anyone who wants to iterate flexibly with Bayesian approaches to their data modeling, as the user has to implement the Bayesian layers themselves rather than focus on the architecture of their model.

BLiTZ was created to solve this bottleneck. Fully integrated with PyTorch (including nn.Sequential modules) and easy to extend as a Bayesian Deep Learning library, BLiTZ lets the user introduce uncertainty into their neural networks with no more effort than tuning hyper-parameters.

In this post, we discuss how to create, train, and run inference with uncertainty-aware Neural Networks using BLiTZ layers and sampling utilities.

Bayesian Deep Learning layers

The main idea in Bayesian Deep Learning is that, rather than having deterministic weights, each Bayesian layer samples its weights from a normal distribution at every feed-forward operation.

Consequently, the trainable parameters of the layer are the ones that determine the mean and variance of this distribution.

Mathematically, the operations go from the deterministic "vanilla" neural network feed-forward operation

y = W x + b

to the Bayesian feed-forward operation, where a fresh weight sample is drawn on every pass:

W = μ + log(1 + e^ρ) ⊙ ε, with ε ~ N(0, I)
y = W x + b

so that μ and ρ parameterize the mean and (through a softplus) the standard deviation of the weight distribution.

Implementing layers where ρ and μ are the trainable parameters can be hard in Torch, and beyond that, making those layers hyper-parameter tunable is even harder to craft. BLiTZ has a built-in BayesianLinear layer which can be introduced into a model this easily:
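A minimal sketch of that idea (the layer sizes and the nn.Sequential usage here are illustrative, not taken from the original post):

```python
import torch.nn as nn
from blitz.modules import BayesianLinear

# A Bayesian layer can be dropped in wherever an nn.Linear would go
model = nn.Sequential(
    BayesianLinear(10, 512),
    nn.ReLU(),
    BayesianLinear(512, 1),
)
```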

It works as a normal Torch nn.Module network, but its BayesianLinear modules perform training and inference with the weight uncertainty explained above.

Loss calculation

As proposed in its original paper (Blundell et al., 2015, "Weight Uncertainty in Neural Networks"), the Bayesian Neural Network cost function is a combination of a "complexity cost" and a "fitting-to-data cost". After all the algebra wrangling, for each feed-forward operation we have:

loss = KL[ q(w | θ) ‖ P(w) ] − E_{q(w | θ)}[ log P(D | w) ]

(the complexity cost plus the fitting-to-data cost)

The complexity cost is estimated from the sampled weights of each Bayesian layer in the network: it sums the log-density of those samples under the layer's weight distribution relative to a much simpler, predefined prior P(w). By penalizing this term, we ensure that, while optimizing, the variance of our model over its predictions diminishes.

To do that, BLiTZ brings us the variational_estimator decorator, which introduces methods such as nn_kl_divergence into our nn.Module. Given datapoints, their labels, and a criterion, we can get the loss over a prediction by doing:
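A sketch of that calculation, assuming a model decorated with variational_estimator (the MSE criterion is just an example):

```python
import torch

criterion = torch.nn.MSELoss()

preds = model(datapoints)
fit_loss = criterion(preds, labels)         # "fitting-to-data" cost
complexity_loss = model.nn_kl_divergence()  # "complexity" cost over all Bayesian layers
loss = fit_loss + complexity_loss
```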

Easy model optimizing

Bayesian Neural Networks are often optimized by sampling the loss many times on the same batch before stepping the optimizer, which compensates for the randomness over the weights and avoids optimizing them over a loss influenced by outliers.

BLiTZ’s variational_estimator decorator also equips the neural network with the sample_elbo method. Given the inputs, labels, criterion, and sample_nbr, it performs the iterative process of calculating the loss over the batch sample_nbr times, takes the mean, and returns the sum of the complexity loss and the fitting loss.

It is very easy to optimize a Bayesian Neural Network model:
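A sketch of a single optimization step with sample_elbo (the optimizer, learning rate, and sample_nbr are illustrative):

```python
import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=0.01)

optimizer.zero_grad()
loss = model.sample_elbo(inputs=datapoints,
                         labels=labels,
                         criterion=criterion,
                         sample_nbr=3)  # average the loss over 3 weight samples
loss.backward()
optimizer.step()
```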

Going through one example:

We are now going through this example, in which we use BLiTZ to create a Bayesian Neural Network that estimates confidence intervals for house prices on the Boston housing dataset built into sklearn. If you want to see other examples, there are more in the repository.

Necessary imports

Besides the usual modules, we bring from BLiTZ the variational_estimator decorator, which helps us handle the Bayesian layers of the module while keeping it fully integrated with the rest of Torch, and, of course, BayesianLinear, the layer that features weight uncertainty.
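A sketch of the imports this example needs (note that sklearn's load_boston was still available when the post was written):

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

from blitz.modules import BayesianLinear
from blitz.utils import variational_estimator
```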

Loading and scaling data

Nothing new under the sun here: we import and standard-scale the data to help with training.
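A sketch of that step (here only the features are standard-scaled; the split ratio is illustrative):

```python
X, y = load_boston(return_X_y=True)
X = StandardScaler().fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Torch tensors; labels get a trailing dimension to match the regressor's output shape
X_train = torch.tensor(X_train, dtype=torch.float)
y_train = torch.tensor(y_train, dtype=torch.float).unsqueeze(1)
X_test = torch.tensor(X_test, dtype=torch.float)
y_test = torch.tensor(y_test, dtype=torch.float).unsqueeze(1)
```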

Creating our regressor class

We create our class by inheriting from nn.Module, as we would with any Torch network. The decorator introduces the methods that handle the Bayesian features: calculating the complexity cost of the Bayesian layers and doing many feed-forward passes (sampling different weights on each one) to sample our loss.
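A sketch of such a class (the hidden size is chosen arbitrarily):

```python
@variational_estimator
class BayesianRegressor(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        # BayesianLinear replaces nn.Linear; weights are sampled on every forward pass
        self.blinear1 = BayesianLinear(input_dim, 512)
        self.blinear2 = BayesianLinear(512, output_dim)

    def forward(self, x):
        x = torch.relu(self.blinear1(x))
        return self.blinear2(x)
```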

Defining a confidence interval evaluating function

This function creates a confidence interval for each prediction in the batch whose label values we are trying to estimate. We can then measure the accuracy of our predictions by checking how many of the prediction distributions actually include the correct label for the datapoint.
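A sketch of such a function: it samples the network several times and checks how often the true label falls inside mean ± std_multiplier standard deviations (the name, arguments, and defaults are illustrative):

```python
def evaluate_regression(regressor, X, y, samples=100, std_multiplier=2):
    # Run several stochastic forward passes to approximate the predictive distribution
    preds = torch.stack([regressor(X) for _ in range(samples)])
    means = preds.mean(dim=0)
    stds = preds.std(dim=0)

    ci_upper = means + (std_multiplier * stds)
    ci_lower = means - (std_multiplier * stds)

    # Fraction of datapoints whose true label lies inside the confidence interval
    ic_acc = ((y >= ci_lower) & (y <= ci_upper)).float().mean()
    return ic_acc, ci_upper, ci_lower
```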

Creating our regressor and loading data

Notice here that we create our BayesianRegressor just as we would any other neural network.
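A sketch of that step (the Boston dataset has 13 features; batch size and learning rate are illustrative):

```python
regressor = BayesianRegressor(input_dim=13, output_dim=1)
optimizer = optim.Adam(regressor.parameters(), lr=0.01)
criterion = torch.nn.MSELoss()

ds_train = TensorDataset(X_train, y_train)
dataloader_train = DataLoader(ds_train, batch_size=16, shuffle=True)

ds_test = TensorDataset(X_test, y_test)
dataloader_test = DataLoader(ds_test, batch_size=16, shuffle=True)
```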

Our main training and evaluating loop

Our training loop differs from a common Torch loop only in that its loss is sampled via the sample_elbo method. Everything else can be done as usual, as our purpose with BLiTZ is to make it easy to iterate on your data with different Bayesian Neural Networks without trouble.

Here is our very simple training loop:
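A sketch of it, evaluating the confidence intervals on the test set every so often (the epoch count, logging interval, and sample counts are illustrative):

```python
iteration = 0
for epoch in range(100):
    for datapoints, labels in dataloader_train:
        optimizer.zero_grad()

        # Sample the ELBO loss several times over the batch and average it
        loss = regressor.sample_elbo(inputs=datapoints,
                                     labels=labels,
                                     criterion=criterion,
                                     sample_nbr=3)
        loss.backward()
        optimizer.step()

        iteration += 1
        if iteration % 100 == 0:
            ic_acc, ci_upper, ci_lower = evaluate_regression(regressor,
                                                             X_test,
                                                             y_test,
                                                             samples=25,
                                                             std_multiplier=3)
            print("CI acc: {:.2f} Loss: {:.4f}".format(ic_acc.item(), loss.item()))
```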

