DenseNet paper link: https://arxiv.org/pdf/1608.06993.pdf
DenseNet (Dense Convolutional Network) is an architecture that focuses on making deep learning networks go even deeper, while at the same time making them more efficient to train, by using shorter connections between the layers. DenseNet is a convolutional neural network in which each layer is connected to all the layers that are deeper in the network: the first layer is connected to the 2nd, 3rd, 4th and so on, the second layer is connected to the 3rd, 4th, 5th and so on. This is done to enable maximum information flow between the layers of the network. To preserve the feed-forward nature, each layer obtains inputs from all the previous layers and passes on its own feature maps to all the layers that come after it. Unlike ResNets, it does not combine features through summation but by concatenating them. So the i-th layer has i inputs, consisting of the feature maps of all its preceding convolutional blocks, and its own feature maps are passed on to all L-i subsequent layers. This introduces L(L+1)/2 connections in an L-layer network, rather than just L connections as in traditional deep learning architectures. DenseNet therefore requires fewer parameters than a traditional convolutional neural network, as there is no need to relearn redundant feature maps.
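To make the connectivity pattern concrete, here is a minimal sketch (an illustration added for clarity, with hypothetical layer sizes, not code from the original article): each new layer in a dense block receives the channel-wise concatenation of every earlier feature map, whereas a ResNet would add them element-wise.
from tensorflow.keras.layers import Input, Conv2D, concatenate
# Toy dense connectivity: the channel axis grows as feature maps accumulate.
inp = Input((32, 32, 16))                                       # hypothetical 16-channel input
f1 = Conv2D(12, 3, padding='same')(inp)                         # layer 1 adds 12 feature maps
f2 = Conv2D(12, 3, padding='same')(concatenate([inp, f1]))      # layer 2 sees 16 + 12 channels
f3 = Conv2D(12, 3, padding='same')(concatenate([inp, f1, f2]))  # layer 3 sees 16 + 12 + 12 channels
# In a ResNet the inputs would instead be combined with an element-wise sum.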
DenseNet consists of two important building blocks in addition to the basic convolutional and pooling layers: the Dense Blocks and the Transition Layers.
Next, we look at what these blocks and layers look like and how to implement them in Python.
DenseNet starts with a basic convolution and pooling layer. This is followed by three pairs of a dense block and a transition layer, and finally a fourth dense block followed by a classification layer.
The first convolution block has 64 filters of size 7×7 and a stride of 2. It is followed by a MaxPooling layer with a 3×3 window and a stride of 2. These two layers can be written in Python as follows.
input = Input(input_shape)
x = Conv2D(64, 7, strides=2, padding='same')(input)
x = MaxPool2D(3, strides=2, padding='same')(x)
Defining the convolutional block: each convolutional block after the input follows the sequence BatchNormalization, then ReLU activation, and then the actual Conv2D layer. To implement this, we can write the following function.
# batch norm + relu + conv
def bn_rl_conv(x, filters, kernel=1, strides=1):
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = Conv2D(filters, kernel, strides=strides, padding='same')(x)
    return x
Defining the Dense block: as seen in figure 3, every dense block has two convolutions, with 1×1 and 3×3 kernels. In dense block 1 this pair is repeated 6 times, in dense block 2 it is repeated 12 times, in dense block 3, 24 times, and finally in dense block 4, 16 times.
Within a dense block, each 1×1 convolution (the bottleneck layer) produces 4 times the base number of filters, so we pass 4*filters to it, while the 3×3 convolution produces the base number of filters. We also have to concatenate the input tensor with the output tensor.
Each block is run for 6, 12, 24 and 16 repetitions respectively, using a for loop.
def dense_block(x, repetition):
    for _ in range(repetition):
        y = bn_rl_conv(x, 4*filters)
        y = bn_rl_conv(y, filters, 3)
        x = concatenate([y, x])
    return x
Defining the transition layer: in the transition layer, we reduce the number of channels to half of the existing channels. It consists of a 1×1 convolutional layer and a 2×2 average pooling layer with a stride of 2. A kernel size of 1×1 is already the default in the bn_rl_conv function, so we do not need to specify it explicitly again.
To halve the number of channels, we take the input tensor x and find out how many channels it has. We can use the Keras backend (K): K.int_shape(x) returns a tuple with the dimensions of x, and we only need the last entry of that shape, which is the number of filters, so we index it with [-1]. Finally, we divide this number of filters by 2 to get the desired value.
def transition_layer(x):
    x = bn_rl_conv(x, K.int_shape(x)[-1] // 2)
    x = AvgPool2D(2, strides=2, padding='same')(x)
    return x
We are now done defining the dense blocks and the transition layers, so next we stack them together. We write a for loop that runs through the repetition counts 6, 12, 24 and 16; the loop runs 4 times, each time using one of these values. This completes the 4 dense blocks and their transition layers.
for repetition in [6, 12, 24, 16]:
    d = dense_block(x, repetition)
    x = transition_layer(d)
At the end, there is GlobalAveragePooling followed by the final output layer. As we see in the code block above, the output of each dense block is stored in 'd'. After dense block 4 there is no transition layer 4; the network goes directly into the classification layer. So GlobalAveragePooling is applied to 'd', and not to 'x'. An alternative is to remove the for loop from the code above and stack the layers one after the other, omitting the final transition layer.
x = GlobalAveragePooling2D()(d)
output = Dense(n_classes, activation='softmax')(x)
Now that we have all the blocks together, let’s merge them to see the entire DenseNet architecture.
Complete DenseNet 121 architecture:
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, Dense
from tensorflow.keras.layers import AvgPool2D, GlobalAveragePooling2D, MaxPool2D
from tensorflow.keras.models import Model
from tensorflow.keras.layers import ReLU, concatenate
import tensorflow.keras.backend as K

# Creating Densenet121
def densenet(input_shape, n_classes, filters=32):

    # batch norm + relu + conv
    def bn_rl_conv(x, filters, kernel=1, strides=1):
        x = BatchNormalization()(x)
        x = ReLU()(x)
        x = Conv2D(filters, kernel, strides=strides, padding='same')(x)
        return x

    def dense_block(x, repetition):
        for _ in range(repetition):
            y = bn_rl_conv(x, 4*filters)
            y = bn_rl_conv(y, filters, 3)
            x = concatenate([y, x])
        return x

    def transition_layer(x):
        x = bn_rl_conv(x, K.int_shape(x)[-1] // 2)
        x = AvgPool2D(2, strides=2, padding='same')(x)
        return x

    input = Input(input_shape)
    x = Conv2D(64, 7, strides=2, padding='same')(input)
    x = MaxPool2D(3, strides=2, padding='same')(x)

    for repetition in [6, 12, 24, 16]:
        d = dense_block(x, repetition)
        x = transition_layer(d)

    x = GlobalAveragePooling2D()(d)
    output = Dense(n_classes, activation='softmax')(x)

    model = Model(input, output)
    return model

input_shape = 224, 224, 3
n_classes = 3
model = densenet(input_shape, n_classes)
model.summary()
Output: (Assuming 3 final classes — last few lines of the model summary)
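As a quick sanity check, the following minimal usage sketch (not part of the original walkthrough; the optimizer, loss and random batch are assumptions) compiles the model and runs one forward pass to confirm the output shape.
import numpy as np
# Compile and run a single forward pass on a random batch.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
dummy_images = np.random.rand(2, 224, 224, 3).astype('float32')
preds = model.predict(dummy_images)
print(preds.shape)   # expected: (2, 3) for n_classes = 3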
To view the architecture diagram, the following code can be used.
from tensorflow.keras.utils import model_to_dot
from IPython.display import SVG
import pydot
import graphviz

SVG(model_to_dot(
    model, show_shapes=True, show_layer_names=True, rankdir='TB',
    expand_nested=False, dpi=60, subgraph=False
).create(prog='dot', format='svg'))
Output — first few blocks of the diagram
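As an alternative not shown in the original article, tf.keras.utils.plot_model can write the same diagram directly to an image file (the file name here is an illustrative choice).
from tensorflow.keras.utils import plot_model
# Save the architecture diagram as a PNG file on disk.
plot_model(model, to_file='densenet121.png', show_shapes=True, show_layer_names=True)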
And that’s how we can implement the DenseNet 121 architecture.
References:
1. Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. Densely Connected Convolutional Networks. arXiv:1608.06993 (2016).