Code
The first step is to import the tools and libraries that will be utilized to either implement or support the implementation of the neural network. The tools utilized are as follows:
- TensorFlow: An open-source platform for the implementation, training, and deployment of machine learning models.
- Keras: An open-source library used for the implementation of neural network architectures that run on both CPUs and GPUs.
import tensorflow as tf
from tensorflow import keras
The dataset we’ll be utilizing is the simple Fashion-MNIST dataset.
The Fashion-MNIST dataset contains 70,000 images of clothing. More specifically, it includes 60,000 training examples and 10,000 testing examples, all of which are 28 x 28 grayscale images categorized into ten classes.
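For reference, the ten integer labels map to the following garment names (this list is not part of the original snippets and is included purely for orientation):

# Fashion-MNIST class names, indexed by the integer labels 0-9
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]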
Preparation of the dataset includes normalizing the training and test images by dividing each pixel value by 255.0. This places each pixel value within the range 0 to 1.
A validation portion of the dataset is also created at this stage. This subset of the data is utilized during training to assess the performance of the network at various iterations.
(train_images, train_labels), (test_images, test_labels) = keras.datasets.fashion_mnist.load_data()
train_images = train_images / 255.0
test_images = test_images / 255.0
# Hold out the first 5,000 training examples for validation, and train on
# the remainder so the validation images stay unseen during training
validation_images, validation_labels = train_images[:5000], train_labels[:5000]
train_images, train_labels = train_images[5000:], train_labels[5000:]
Keras provides tools required to implement the classification model. Keras presents a Sequential API for stacking layers of the neural network in a consecutive manner.
Below is some information on the layers that will be implemented to make up our neural network.
- Flatten: Takes an input shape and flattens the input image data into a one-dimensional array.
- Dense: A dense layer contains a specified number of units/neurons. Each neuron is a perceptron.
- A perceptron is a fundamental component of an artificial neural network, invented by Frank Rosenblatt in 1958. A perceptron performs operations based on the threshold logic unit.
- Batch Normalization: A Batch Normalization (BN) layer performs a series of operations on the incoming input data: it standardizes the inputs to zero mean and unit variance, then rescales and shifts them using learned parameters.
- Activation Layer: Performs a specified operation on the inputs within the neural network and introduces non-linearity into the network. The model implemented in this article utilizes two activation functions: the Rectified Linear Unit (ReLU) and softmax.
- The transformation imposed by ReLU on values from a neuron is represented by the formula y=max(0,x). The ReLU activation function clamps down any negative values from the neuron to 0, and positive values remain unchanged. The result of this mathematical transformation is utilized as the activation of the current layer, and as input to the next.
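As a quick sketch of this clamping behaviour (an added illustration, not part of the original walkthrough), applying ReLU to a handful of values shows negatives mapped to 0 and positives passed through unchanged:

# Negative inputs are clamped to 0; positive inputs pass through
x = tf.constant([-2.0, -0.5, 0.0, 1.5, 3.0])
print(keras.activations.relu(x).numpy())  # [0.  0.  0.  1.5 3. ]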
# Placing the batch normalization layers before the activation layers
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, use_bias=False),
    keras.layers.BatchNormalization(),
    keras.layers.Activation(keras.activations.relu),
    keras.layers.Dense(200, use_bias=False),
    keras.layers.BatchNormalization(),
    keras.layers.Activation(keras.activations.relu),
    keras.layers.Dense(100, use_bias=False),
    keras.layers.BatchNormalization(),
    keras.layers.Activation(keras.activations.relu),
    keras.layers.Dense(10, activation=keras.activations.softmax)
])
Let’s take a look at the internal components of a BN layer. Merely accessing the layer at index two reveals the variables within the first BN layer and their contents:
model.layers[2].variables
I won’t go into too many details here, but take note of the variable names ‘gamma’ and ‘beta’; the values held within these variables are responsible for the rescaling and offsetting of activations within the layer.
for variable in model.layers[2].variables:
    print(variable.name)

>> batch_normalization/gamma:0
>> batch_normalization/beta:0
>> batch_normalization/moving_mean:0
>> batch_normalization/moving_variance:0
This article goes into more detail regarding the operations within BN layers.
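To make the roles of these variables concrete, here is a minimal sketch (an added illustration, not the author’s code) that reproduces the BN layer’s inference-time computation by hand, using the layer’s own epsilon:

import numpy as np

bn = model.layers[2]
# Variables in the order printed above: gamma, beta, moving_mean, moving_variance
gamma, beta, moving_mean, moving_var = [v.numpy() for v in bn.variables]

# Fake activations shaped like the output of the Dense(300) layer
x = np.random.randn(1, 300).astype("float32")

# Standardize with the moving statistics, then rescale by gamma and shift by beta
manual = gamma * (x - moving_mean) / np.sqrt(moving_var + bn.epsilon) + beta
print(np.allclose(manual, bn(x, training=False).numpy(), atol=1e-5))  # True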
Within the dense layers, the bias component is set to false. The omission of the bias is a result of the cancellation of constant values that occurs when the mean is subtracted during the normalization of activations.
Andrej Karpathy, current Director of AI at Tesla, once tweeted a list of neural network mistakes that are often made; not setting bias to false when using BN was on the list.
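A tiny numerical sketch (again an added illustration rather than the author’s code) shows why the bias is redundant: any constant added before normalization is wiped out the moment the mean is subtracted:

import numpy as np

x = np.random.randn(8, 4).astype("float32")

def standardize(a):
    # Per-feature standardization over the batch, as BN does during training
    return (a - a.mean(axis=0)) / a.std(axis=0)

# Adding a constant 'bias' of 5.0 leaves the standardized output unchanged
print(np.allclose(standardize(x), standardize(x + 5.0), atol=1e-5))  # True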
In the next snippet of code, we specify the optimization algorithm used to train the implemented neural network, along with the loss function and hyperparameters such as the learning rate; the number of epochs is specified later, in the call to the ‘fit’ method.
sgd = keras.optimizers.SGD(learning_rate=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss="sparse_categorical_crossentropy", optimizer=sgd, metrics=["accuracy"])
Now we train the network. The Sequential model’s ‘fit’ method provides the tools to train the implemented network; a detailed explanation of the training and implementation of neural networks is beyond the scope of this article.
model.fit(train_images, train_labels, epochs=60, validation_data=(validation_images, validation_labels))
The evaluation of the model performance is conducted using the test data set aside earlier.
With the evaluation results, you can decide whether to fine-tune the network hyperparameters or move forward to production, based on the accuracy observed over the test dataset.
model.evaluate(test_images, test_labels)
During the training phase, you might notice that each epoch takes longer to train compared to a network without batch normalization layers. This is because batch normalization adds a layer of complexity to the neural network, along with extra parameters that the model must learn during training.
The increase in time per epoch is balanced, though, by the fact that Batch Normalization reduces the time taken for the model to converge to an optimal solution.
The model implemented in this article is too shallow for us to notice the full benefits of utilizing batch normalization within a neural network architecture. Typically, batch normalization is found in deeper convolutional neural networks such as Xception, ResNet50 and Inception V3.
Extra
- The neural network implemented above has the Batch Normalization layers placed just before the activation layers, but it is entirely possible to place the BN layers after the activation layers instead:
# Placing the batch normalization layers after the activation layers
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, use_bias=False),
    keras.layers.Activation(keras.activations.relu),
    keras.layers.BatchNormalization(),
    keras.layers.Dense(200, use_bias=False),
    keras.layers.Activation(keras.activations.relu),
    keras.layers.BatchNormalization(),
    keras.layers.Dense(100, use_bias=False),
    keras.layers.Activation(keras.activations.relu),
    keras.layers.BatchNormalization(),
    keras.layers.Dense(10, activation=keras.activations.softmax)
])
- There has been some extensive work done by researchers on the Batch Normalization technique, for example Batch Renormalization and Self-Normalizing Neural Networks; a brief sketch of the latter follows.
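As a minimal illustrative sketch (not code from the original article), a self-normalizing version of the same classifier would drop the BN layers entirely and use SELU activations with LeCun normal initialization; the name ‘model_selu’ is chosen here purely for illustration:

# Self-normalizing variant: SELU activations keep activations roughly
# standardized without explicit BN layers
model_selu = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="selu", kernel_initializer="lecun_normal"),
    keras.layers.Dense(200, activation="selu", kernel_initializer="lecun_normal"),
    keras.layers.Dense(100, activation="selu", kernel_initializer="lecun_normal"),
    keras.layers.Dense(10, activation="softmax")
])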