Implementing an Autoencoder in PyTorch

Category: IT Technology · Published: 4 years ago

Building an autoencoder model for reconstruction

Logo retrieved from Wikimedia Commons.

This is the PyTorch equivalent of my previous article on implementing an autoencoder in TensorFlow 2.0 (reference 1 in the list at the end).

First, to install PyTorch, you may use the following pip command,

pip install torch torchvision

The torchvision package contains the image data sets that are ready for use in PyTorch.

More details on installation are available in the guide on pytorch.org.

Autoencoder

Since that earlier article already explains what an autoencoder is, we will only briefly discuss it here.

An autoencoder is a type of neural network that finds the function mapping the features x to itself. This objective is known as reconstruction, and an autoencoder accomplishes it through the following process: (1) an encoder learns the data representation in a lower-dimensional space, i.e. it extracts the most salient features of the data, and (2) a decoder learns to reconstruct the original data from the representation learned by the encoder.

Mathematically, process (1) learns the data representation z from the input features x, which then serves as an input to the decoder.

$$z = f(h(x))$$
z is the data representation learned by the encoder f(h(x)) from the input data x.

Then, process (2) tries to reconstruct the data based on the learned data representation z.

$$\hat{x} = f(h(z))$$
$\hat{x}$ is the data reconstructed by the decoder f(h(z)) from the learned representation z.

The encoder and the decoder are neural networks that build the autoencoder model, as depicted in the following figure,

Figure: An autoencoder is an artificial neural network that aims to learn how to reconstruct its input data. Illustrated using NN-SVG.

To simplify the implementation, we write the encoder and decoder layers in one class as follows,
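The embedded listing is not reproduced in this copy, so what follows is a minimal sketch built around the names used in the explanation further down (encoder_hidden_layer, decoder_output_layer, kwargs["input_shape"]). The class name AE, the 128-unit layer widths, the two middle layers, and the ReLU activations are assumptions for illustration.

```python
import torch
from torch import nn


class AE(nn.Module):
    """Fully connected autoencoder; the layer widths of 128 are an assumption."""

    def __init__(self, **kwargs):
        super().__init__()
        # Encoder: input features -> hidden -> code
        self.encoder_hidden_layer = nn.Linear(
            in_features=kwargs["input_shape"], out_features=128
        )
        self.encoder_output_layer = nn.Linear(in_features=128, out_features=128)
        # Decoder: code -> hidden -> reconstruction with the input's feature size
        self.decoder_hidden_layer = nn.Linear(in_features=128, out_features=128)
        self.decoder_output_layer = nn.Linear(
            in_features=128, out_features=kwargs["input_shape"]
        )

    def forward(self, features):
        # nn.Linear applies no activation, so ReLU is called explicitly
        activation = torch.relu(self.encoder_hidden_layer(features))
        code = torch.relu(self.encoder_output_layer(activation))
        activation = torch.relu(self.decoder_hidden_layer(code))
        reconstructed = torch.relu(self.decoder_output_layer(activation))
        return reconstructed
```

With this sketch, AE(input_shape=784) builds a model that maps a [N, 784] tensor to a reconstruction of the same size.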

The autoencoder model written as a custom torch.nn.Module.

Explaining some of the components in the code snippet above,

  • The torch.nn.Linear layer creates a linear function (θx + b), with its parameters initialized by default with He/Kaiming uniform initialization, as can be confirmed in the PyTorch source code. The layer applies no activation of its own, so we call an activation/non-linearity on its output ourselves.
  • The in_features parameter dictates the feature size of the input tensor to a particular layer, e.g. self.encoder_hidden_layer accepts an input tensor of size [N, input_shape], where N is the number of examples and input_shape is the number of features in one example.
  • The out_features parameter dictates the feature size of the output tensor of a particular layer. Hence, in self.decoder_output_layer, the output feature size is kwargs["input_shape"], denoting that it reconstructs the original data input.
  • The forward() function defines the forward pass of a model, similar to call in tf.keras.Model. This is the function invoked when we pass input tensors to an instantiated object of a torch.nn.Module class.
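To make the roles of in_features and out_features concrete, here is a small sketch with a single layer. The sizes 784 and 128 and the batch size of 32 are illustrative assumptions. Calling layer(batch) is what invokes the layer's forward() function.

```python
import torch
from torch import nn

# A single linear layer mapping 784 input features to 128 output features
layer = nn.Linear(in_features=784, out_features=128)

batch = torch.rand(32, 784)        # [N, input_shape] with N = 32 examples
hidden = torch.relu(layer(batch))  # the activation is applied explicitly

print(hidden.shape)        # torch.Size([32, 128])
print(layer.weight.shape)  # torch.Size([128, 784])
```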

To optimize our autoencoder to reconstruct data, we minimize the following reconstruction loss,

$$L(x, \hat{x}) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{x}_i)^2$$
The reconstruction error in this case is the mean-squared error function that you’re likely to be familiar with, averaged over the N values being compared.
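As a quick check of what the mean-squared error computes, the sketch below compares torch.nn.MSELoss with the same quantity written out by hand. The tensors are made-up values, not MNIST data, and x_hat is a hypothetical reconstruction. Note that nn.MSELoss with its default reduction averages over every element in the batch.

```python
import torch
from torch import nn

x = torch.tensor([[0.0, 1.0], [1.0, 0.0]])      # original features
x_hat = torch.tensor([[0.0, 0.5], [1.0, 1.0]])  # hypothetical reconstruction

criterion = nn.MSELoss()  # default reduction="mean" averages over all elements
loss = criterion(x_hat, x)

# Same value computed by hand: mean of the squared differences
manual = ((x - x_hat) ** 2).mean()

print(loss.item(), manual.item())  # 0.3125 0.3125
```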

We instantiate the autoencoder class and move its parameters (using the to() function) to a torch.device, which may be a GPU (a cuda device, if one exists in your system) or a CPU.

Then, we create an optimizer object that will be used to minimize our reconstruction loss, along with the loss function itself.
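A minimal sketch of these steps could look like the following. To keep the block self-contained, a small nn.Sequential model stands in for the autoencoder class. The Adam optimizer, the learning rate of 1e-3, and the layer sizes are assumptions rather than values stated in the text.

```python
import torch
from torch import nn, optim

# Use a GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the autoencoder class (assumption: 784 inputs, 128 hidden units)
model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 784), nn.ReLU(),
).to(device)

# Optimizer that will minimize the reconstruction loss
# (Adam with lr=1e-3 is an assumed choice)
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# Mean-squared error as the reconstruction loss
criterion = nn.MSELoss()
```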

Instantiating an autoencoder model, an optimizer, and a loss function for training.

For this article, let’s use our favorite dataset, MNIST. In the following code snippet, we load the MNIST dataset as tensors using the torchvision.transforms.ToTensor() class. The dataset is downloaded (download=True) to the specified directory (root=<directory>) when it is not yet present in our system.

Loading the MNIST dataset, and creating a data loader object for it.

After loading the dataset, we create a torch.utils.data.DataLoader object for it, which will be used in model computations.

Finally, we can train our model for a specified number of epochs as follows,
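The sketch below follows the steps described in the rest of this section. To stay self-contained and avoid a download, it uses random MNIST-shaped tensors and a small nn.Sequential model as stand-ins for the real data loader and autoencoder. Those stand-ins, the two epochs, and the optimizer settings are assumptions.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-ins (assumptions): random MNIST-shaped images and a small autoencoder
images = torch.rand(256, 1, 28, 28)
labels = torch.zeros(256, dtype=torch.long)
train_loader = DataLoader(TensorDataset(images, labels), batch_size=64, shuffle=True)
model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 784), nn.ReLU(),
).to(device)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

epochs = 2
epoch_losses = []
for epoch in range(epochs):
    loss = 0.0
    for batch_features, _ in train_loader:
        # Flatten [N, 1, 28, 28] images to [N, 784] and move them to the device
        batch_features = batch_features.view(-1, 784).to(device)

        optimizer.zero_grad()            # clear gradients from the previous step
        outputs = model(batch_features)  # compute reconstructions
        train_loss = criterion(outputs, batch_features)
        train_loss.backward()            # backpropagate the reconstruction error
        optimizer.step()                 # update the parameters

        loss += train_loss.item()        # accumulate the mini-batch loss

    loss = loss / len(train_loader)      # average loss over the epoch
    epoch_losses.append(loss)
    print("epoch : {}/{}, loss = {:.6f}".format(epoch + 1, epochs, loss))
```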

Training loop for the autoencoder model.

In our data loader, we only need the features, since our goal is reconstruction using an autoencoder (i.e. an unsupervised learning goal). The raw training images form a 3D tensor of size [60000, 28, 28], and with ToTensor() each batch from the loader has size [N, 1, 28, 28]. Since we defined in_features for the encoder layer above as the number of features, we pass 2D tensors to the model by reshaping batch_features with the .view(-1, 784) function (think of this as np.reshape() in NumPy), where 784 is the size of a flattened 28 by 28 pixel image such as those in MNIST.
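The reshaping step on its own looks like this. The random tensor is a stand-in for one batch from the data loader, and the batch size of 32 is an arbitrary assumption.

```python
import torch

batch_features = torch.rand(32, 1, 28, 28)  # stand-in for one batch from the loader
flattened = batch_features.view(-1, 784)    # -1 lets PyTorch infer the batch size

print(flattened.shape)  # torch.Size([32, 784])
```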

For each batch, we reset the gradients back to zero using optimizer.zero_grad(), since PyTorch accumulates gradients on subsequent backward passes. We then compute a reconstruction of the training examples by calling our model on them, i.e. outputs = model(batch_features). Subsequently, we compute the reconstruction loss on the training examples, perform backpropagation of errors with train_loss.backward(), and update our model with optimizer.step() based on the gradients computed by the .backward() call.
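The gradient accumulation that makes the reset necessary can be seen in a toy example with a single parameter w (a hypothetical stand-in for model parameters). Here the gradient is cleared by hand with w.grad.zero_(), which plays the role that optimizer.zero_grad() plays in the training loop.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

(2 * w).sum().backward()
first = w.grad.clone()        # tensor([2.])

(2 * w).sum().backward()      # no reset: the new gradient is added to the old one
accumulated = w.grad.clone()  # tensor([4.])

w.grad.zero_()                # reset the stored gradient before the next pass
(2 * w).sum().backward()
after_reset = w.grad.clone()  # tensor([2.])

print(first, accumulated, after_reset)
```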

To see how our training is going, we accumulate the training loss for each epoch (loss += train_loss.item()), and compute the average training loss across an epoch (loss = loss / len(train_loader)).

Results

For this article, the autoencoder model was trained for 20 epochs, and the following figure plots the original (top) and reconstructed (bottom) MNIST images.

Figure: Results on the MNIST handwritten digit dataset, plotted using matplotlib. Images in the top row are the originals, while images in the bottom row are the reconstructions.

In case you want to try this autoencoder on other datasets, you can take a look at the available image datasets from torchvision.

Closing Remarks

I hope this has been a clear tutorial on implementing an autoencoder in PyTorch. To further improve the reconstruction capability of our implemented autoencoder, you may try to use convolutional layers (torch.nn.Conv2d) to build a convolutional neural network-based autoencoder.
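As one possible starting point for that suggestion, here is a rough sketch of a convolutional autoencoder for 1x28x28 images. The class name ConvAE, the channel counts, the use of torch.nn.ConvTranspose2d in the decoder, and the final sigmoid are all assumptions, not part of the model trained in this article.

```python
import torch
from torch import nn


class ConvAE(nn.Module):
    """Hypothetical convolutional autoencoder for 1x28x28 images."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),   # 28x28 -> 14x14
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # 14x14 -> 7x7
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            # 7x7 -> 14x14
            nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            # 14x14 -> 28x28
            nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),  # keep outputs in [0, 1], like ToTensor() pixel values
        )

    def forward(self, features):
        return self.decoder(self.encoder(features))
```

Unlike the fully connected model, this one takes the [N, 1, 28, 28] batches directly, so the .view(-1, 784) flattening step would not be needed.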

The corresponding notebook to this article is available here. In case you have any feedback, you may reach me through Twitter.

References

  1. A.F. Agarap, Implementing an Autoencoder in TensorFlow 2.0 (2019). Towards Data Science.
  2. I. Goodfellow, Y. Bengio, & A. Courville, Deep learning (2016). MIT press.
  3. A. Paszke, et al. PyTorch: An imperative style, high-performance deep learning library (2019). Advances in Neural Information Processing Systems.
  4. PyTorch Documentation. https://pytorch.org/docs/stable/nn.html
