DenseNet paper link: https://arxiv.org/pdf/1608.06993.pdf
DenseNet (Dense Convolutional Network) is an architecture that focuses on making deep learning networks go even deeper, while at the same time making them more efficient to train, by using shorter connections between the layers. DenseNet is a convolutional neural network in which each layer is connected to every layer that comes after it: the first layer is connected to the 2nd, 3rd, 4th and so on, the second layer is connected to the 3rd, 4th, 5th and so on. This is done to enable maximum information flow between the layers of the network. To preserve the feed-forward nature, each layer obtains inputs from all the previous layers and passes its own feature maps on to all the layers that come after it. Unlike ResNets, it does not combine features through summation; it combines them by concatenation. So the l-th layer has l inputs, consisting of the feature maps of all its preceding convolutional blocks, and its own feature maps are passed on to all the remaining L - l layers (where L is the total number of layers). This introduces L(L+1)/2 connections in the network, rather than just L connections as in traditional deep learning architectures. DenseNet therefore requires fewer parameters than a traditional convolutional neural network of comparable depth, as there is no need to re-learn redundant feature maps.
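To make the counting concrete, here is a minimal sketch (purely illustrative; the layer count, input channels, and growth rate below are assumed values chosen to match DenseNet-121's first block, not something prescribed by the article) of how the number of connections and the per-layer input width grow inside a densely connected block:

# Illustrative only: L is the number of layers in a block, k0 the number of
# input channels, k the growth rate (feature maps added by each layer).
L = 6            # assumed number of layers in the block
k0, k = 64, 32   # assumed input channels and growth rate

print(L * (L + 1) // 2)         # 21 direct connections instead of just L = 6

for l in range(1, L + 1):
    # layer l receives the concatenation of the block input and all previous outputs
    print(l, k0 + k * (l - 1))  # channels seen at the input of layer l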
DenseNet consists of two important blocks besides the basic convolutional and pooling layers: the Dense Blocks and the Transition Layers.
Next, we look at what all these blocks and layers look like, and how to implement them in Python.
DenseNet starts with a basic convolution and pooling layer. Then there is a dense block followed by a transition layer, another dense block followed by a transition layer, another dense block followed by a transition layer, and finally a dense block followed by a classification layer.
The first convolution block has 64 filters of size 7×7 and a stride of 2. It is followed by a MaxPooling layer with a 3×3 pool size and a stride of 2. These two lines can be represented with the following code in Python.
input = Input(input_shape)
x = Conv2D(64, 7, strides=2, padding='same')(input)
x = MaxPool2D(3, strides=2, padding='same')(x)
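As a quick sanity check (a small illustrative sketch, not part of the original article), we can wrap just these two layers in a temporary model and confirm that a 224×224×3 input comes out at 56×56 with 64 channels:

from tensorflow.keras.layers import Input, Conv2D, MaxPool2D
from tensorflow.keras.models import Model

# stem only: 7x7 convolution with stride 2, then 3x3 max pooling with stride 2
inp = Input((224, 224, 3))
s = Conv2D(64, 7, strides=2, padding='same')(inp)
s = MaxPool2D(3, strides=2, padding='same')(s)
print(Model(inp, s).output_shape)   # (None, 56, 56, 64)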
Defining the convolutional block— Each convolutional block after the input has the following sequence: BatchNormalization, followed by ReLU activation and then the actual Conv2D layer. To implement that, we can write the following function.
# batch norm + relu + conv
def bn_rl_conv(x, filters, kernel=1, strides=1):
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = Conv2D(filters, kernel, strides=strides, padding='same')(x)
    return x
Defining the Dense block— As seen in figure 3, every dense block has two convolutions, with 1×1 and 3×3 kernel sizes. In dense block 1 this pair is repeated 6 times, in dense block 2 it is repeated 12 times, in dense block 3, 24 times, and finally in dense block 4, 16 times.
In a dense block, each 1×1 convolution has 4 times as many filters as the growth rate, so we use 4*filters, while each 3×3 convolution uses the growth rate (filters) directly. Also, we have to concatenate the input with the output tensor.
Each block runs this pair of convolutions for 6, 12, 24, or 16 repetitions respectively, using a for loop.
def dense_block(x, repetition):
    for _ in range(repetition):
        y = bn_rl_conv(x, 4*filters)
        y = bn_rl_conv(y, filters, 3)
        x = concatenate([y, x])
    return x
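To see the concatenation at work, here is a small check (illustrative only; it assumes the bn_rl_conv and dense_block functions above are already defined, the Keras layers they use are imported, and filters is set to 32, the value used in the full model later): starting from 64 channels, 6 repetitions add 6 × 32 channels.

from tensorflow.keras.layers import Input
import tensorflow.keras.backend as K

filters = 32                # assumed growth rate, as in the full model below
inp = Input((56, 56, 64))
out = dense_block(inp, 6)
print(K.int_shape(out))     # (None, 56, 56, 256), since 64 + 6*32 = 256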
Defining the transition layer— In the transition layer, we reduce the number of channels to half of the existing channels. It consists of a 1×1 convolutional layer and a 2×2 average pooling layer with a stride of 2. A kernel size of 1×1 is already the default in the bn_rl_conv function, so we do not need to set it explicitly.
To halve the channels, we take the input tensor x and find out how many channels it has. We can use the Keras backend (K) to get the shape of x as a tuple, and we only need the last entry of that shape, the number of filters, so we index with [-1]. Finally, we divide this number of filters by 2 to get the desired number of output channels.
def transition_layer(x):
    x = bn_rl_conv(x, K.int_shape(x)[-1] // 2)
    x = AvgPool2D(2, strides=2, padding='same')(x)
    return x
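A similar check (again illustrative, assuming the functions above are defined and the required layers imported) shows the transition layer halving both the channels and the spatial resolution:

from tensorflow.keras.layers import Input
import tensorflow.keras.backend as K

inp = Input((56, 56, 256))
out = transition_layer(inp)
print(K.int_shape(out))     # (None, 28, 28, 128): channels and spatial size both halved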
So we are done with defining the dense blocks and transition layers. Now we need to stack them together: we write a for loop over the repetition counts 6, 12, 24, and 16. The loop runs 4 times, each time building one dense block followed by one transition layer.
for repetition in [6, 12, 24, 16]:
    d = dense_block(x, repetition)
    x = transition_layer(d)
At the end, there is GlobalAveragePooling, followed by the final output layer. As we see in the above code block, the output of each dense block is stored in d. After dense block 4 there is no transition layer; the network goes directly into the classification layer. So GlobalAveragePooling is applied on d, not on x. An alternative is to remove the for loop from the code above and stack the blocks one after the other, simply leaving out the final transition layer.
x = GlobalAveragePooling2D()(d)
output = Dense(n_classes, activation='softmax')(x)
Now that we have all the blocks together, let’s merge them to see the entire DenseNet architecture.
Complete DenseNet 121 architecture:
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, Dense
from tensorflow.keras.layers import AvgPool2D, GlobalAveragePooling2D, MaxPool2D
from tensorflow.keras.models import Model
from tensorflow.keras.layers import ReLU, concatenate
import tensorflow.keras.backend as K

# Creating DenseNet-121
def densenet(input_shape, n_classes, filters=32):

    # batch norm + relu + conv
    def bn_rl_conv(x, filters, kernel=1, strides=1):
        x = BatchNormalization()(x)
        x = ReLU()(x)
        x = Conv2D(filters, kernel, strides=strides, padding='same')(x)
        return x

    # dense block: repeat (1x1 bottleneck conv, 3x3 conv) and concatenate
    def dense_block(x, repetition):
        for _ in range(repetition):
            y = bn_rl_conv(x, 4*filters)
            y = bn_rl_conv(y, filters, 3)
            x = concatenate([y, x])
        return x

    # transition layer: halve the channels with a 1x1 conv, then halve the spatial size
    def transition_layer(x):
        x = bn_rl_conv(x, K.int_shape(x)[-1] // 2)
        x = AvgPool2D(2, strides=2, padding='same')(x)
        return x

    # stem: 7x7 conv with stride 2, then 3x3 max pooling with stride 2
    input = Input(input_shape)
    x = Conv2D(64, 7, strides=2, padding='same')(input)
    x = MaxPool2D(3, strides=2, padding='same')(x)

    # four dense blocks, each followed by a transition layer
    for repetition in [6, 12, 24, 16]:
        d = dense_block(x, repetition)
        x = transition_layer(d)

    # classification head on the output of the last dense block
    x = GlobalAveragePooling2D()(d)
    output = Dense(n_classes, activation='softmax')(x)

    model = Model(input, output)
    return model

input_shape = 224, 224, 3
n_classes = 3
model = densenet(input_shape, n_classes)
model.summary()
Output: (Assuming 3 final classes — last few lines of the model summary)
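As a usage sketch (not part of the original article; the optimizer, loss, and random data below are arbitrary placeholders), the returned model can be compiled and trained like any other Keras model:

import numpy as np

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# one tiny training step on random data, just to confirm the model runs end to end
x_dummy = np.random.rand(8, 224, 224, 3).astype('float32')
y_dummy = np.random.randint(0, n_classes, size=(8,))
model.fit(x_dummy, y_dummy, epochs=1, batch_size=4)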
To view the architecture diagram, the following code can be used.
from tensorflow.keras.utils import model_to_dot
from IPython.display import SVG
import pydot
import graphviz

SVG(model_to_dot(
    model, show_shapes=True, show_layer_names=True, rankdir='TB',
    expand_nested=False, dpi=60, subgraph=False
).create(prog='dot', format='svg'))
Output — first few blocks of the diagram
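If you prefer saving the diagram to a file instead of rendering it inline, tf.keras.utils.plot_model offers an equivalent route (a small sketch; the output file name is chosen arbitrarily):

from tensorflow.keras.utils import plot_model

# writes the same architecture diagram to a PNG file
plot_model(model, to_file='densenet121.png', show_shapes=True, rankdir='TB')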
And that’s how we can implement the DenseNet 121 architecture.
References:
1. Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger, "Densely Connected Convolutional Networks", arXiv:1608.06993 (2016).