CIFAR-10 in Practice: Building a ResNet18 Neural Network

Category: IT Technology · Published: 4 years ago

If you are not familiar with ResNet, you may want to read my earlier post "ResNet论文阅读" (notes on the ResNet paper) first.

First, implement a Residual Block:

import torch
from torch import nn
from torch.nn import functional as F

class ResBlk(nn.Module):
    def __init__(self, ch_in, ch_out, stride=1):
        super(ResBlk, self).__init__()
        self.conv1 = nn.Conv2d(ch_in, ch_out, kernel_size=3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(ch_out)
        
        self.conv2 = nn.Conv2d(ch_out, ch_out, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(ch_out)
        
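        if ch_out == ch_in:
            # Identity shortcut. Note: if stride > 1 here, the main branch is
            # downsampled but x is not, so the addition in forward() either
            # fails or silently broadcasts.
            self.extra = nn.Sequential()
        else:
            self.extra = nn.Sequential(
                # The 1×1 convolution changes the channel count of the input x
                # (and downsamples it when stride > 1):
                # [b, ch_in, h, w] => [b, ch_out, ~h/stride, ~w/stride]
                nn.Conv2d(ch_in, ch_out, kernel_size=1, stride=stride),
                nn.BatchNorm2d(ch_out),
            )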
        
    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))

        # short cut
        out = self.extra(x) + out
        out = F.relu(out)
        
        return out

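The block uses batch normalization (BatchNorm2d) to make training faster and more stable. We also have to consider that if ch_in and ch_out differ, the shortcut x and the main-branch output cannot be added and an error is raised. So we check: if they are not equal, a 1×1 convolution adjusts the channel count of the shortcut.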
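One caveat with this check: it only compares channels. When ch_in == ch_out but stride > 1, the identity shortcut keeps the original height and width while the main branch is downsampled. Below is a minimal sketch of a stricter variant, assuming we also want to project the shortcut whenever stride != 1. The name `ResBlkStrict` is hypothetical and not part of the original code.

```python
import torch
from torch import nn
from torch.nn import functional as F


class ResBlkStrict(nn.Module):
    """Hypothetical variant of ResBlk: the shortcut is projected whenever
    the channel count OR the spatial size changes."""

    def __init__(self, ch_in, ch_out, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(ch_in, ch_out, kernel_size=3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(ch_out)
        self.conv2 = nn.Conv2d(ch_out, ch_out, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(ch_out)

        if ch_in == ch_out and stride == 1:
            # Shapes already match: plain identity shortcut
            self.extra = nn.Sequential()
        else:
            # 1x1 projection fixes both channels and spatial size
            self.extra = nn.Sequential(
                nn.Conv2d(ch_in, ch_out, kernel_size=1, stride=stride),
                nn.BatchNorm2d(ch_out),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(self.extra(x) + out)


blk_strict = ResBlkStrict(512, 512, stride=2)
y = blk_strict(torch.randn(2, 512, 8, 8))
print(y.shape)  # torch.Size([2, 512, 4, 4])
```

Swapping this variant into the ResNet18 defined later would change the intermediate shapes quoted in this article, so the original ResBlk is kept in what follows.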

Test it:

blk = ResBlk(64, 128, stride=2)
tmp = torch.randn(2, 64, 32, 32)
out = blk(tmp)
print(out.shape)

The output shape is torch.Size([2, 128, 16, 16]).

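Here is why some layers set stride explicitly. Ignore the other layers and look at a single Residual Block in which the channel count grows from 64 to 128. If every stride were 1 with padding 1, the width and height of the feature map would stay the same while the channels double, so the activation volume, and with it the computation and memory cost, doubles at this one block. (The parameter count of the convolution itself does not depend on stride, but a later FC layer that flattens the feature map would need proportionally more parameters.) And this is only one block, before counting the FC layer and the remaining blocks. So the stride cannot be 1 everywhere; otherwise the cost of the network keeps growing.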
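The effect of stride can be checked with a few lines of arithmetic. The sketch below is not from the original post; the helper names `conv_out` and `conv_params` are made up for illustration. It applies the standard convolution output-size formula to the first 3×3 convolution of a 64 => 128 block on a 32×32 input.

```python
def conv_out(size, kernel, stride, padding):
    """Output height/width of a convolution (floor division, as in nn.Conv2d)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(ch_in, ch_out, kernel):
    """Weights plus biases of a Conv2d layer; independent of stride."""
    return ch_in * ch_out * kernel * kernel + ch_out


ch_in, ch_out, size = 64, 128, 32
for stride in (1, 2):
    out_size = conv_out(size, kernel=3, stride=stride, padding=1)
    activations = ch_out * out_size * out_size  # values per image
    print(f"stride={stride}: output {ch_out}x{out_size}x{out_size}, "
          f"{activations} activations, "
          f"{conv_params(ch_in, ch_out, 3)} conv parameters")
```

With stride 2 the number of activations per image drops by a factor of four, while the convolution's parameter count stays the same.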

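Next, we build the complete ResNet-18. (Strictly speaking, this is a simplified version with four residual blocks; the standard ResNet-18 uses eight.)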

class ResNet18(nn.Module):
    def __init__(self):
        super(ResNet18, self).__init__()
        
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=3, padding=0),
            nn.BatchNorm2d(64),
        )
        # followed by 4 blocks (stride=2 roughly halves h and w)
        
        # [b, 64, h, w] => [b, 128, ~h/2, ~w/2]
        self.blk1 = ResBlk(64, 128, stride=2)
        # [b, 128, h, w] => [b, 256, ~h/2, ~w/2]
        self.blk2 = ResBlk(128, 256, stride=2)
        # [b, 256, h, w] => [b, 512, ~h/2, ~w/2]
        self.blk3 = ResBlk(256, 512, stride=2)
        # [b, 512, h, w] => [b, 512, h', w']
        self.blk4 = ResBlk(512, 512, stride=2)
        
        self.outlayer = nn.Linear(512*1*1, 10)
    
    def forward(self, x):
        x = F.relu(self.conv1(x))
        
        # after the four blocks: [b, 64, h, w] => [b, 512, h', w']
        x = self.blk1(x)
        x = self.blk2(x)
        x = self.blk3(x)
        x = self.blk4(x)
        
        x = self.outlayer(x)
        
        return x

Test it:

x = torch.randn(2, 3, 32, 32)
model = ResNet18()
out = model(x)
print("ResNet:", out.shape)

This raises an error. The message is:

size mismatch, m1: [2048 x 2], m2: [512 x 10] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:961

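The problem is that the input dimension of the final linear layer does not match the output of the last block. Printing the shape of x right after the last block of ResNet18 gives torch.Size([2, 512, 2, 2]), whereas the linear layer expects a last dimension of 512.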
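That shape can be reproduced by hand with the standard convolution output-size formula. The sketch below is an illustration rather than part of the original post: `conv_out` is a made-up helper, and the handling of blk4 assumes PyTorch's usual broadcasting rule, because its identity shortcut (ch_in == ch_out) is not downsampled while its main branch is.

```python
def conv_out(size, kernel, stride, padding):
    """Output height/width of a convolution (floor division, as in nn.Conv2d)."""
    return (size + 2 * padding - kernel) // stride + 1


size = conv_out(32, kernel=3, stride=3, padding=0)  # conv1 on a 32x32 image
trace = [("conv1", 64, size)]

blocks = [("blk1", 64, 128), ("blk2", 128, 256), ("blk3", 256, 512), ("blk4", 512, 512)]
for name, ch_in, ch_out in blocks:
    main = conv_out(size, kernel=3, stride=2, padding=1)  # 3x3 conv, stride 2
    if ch_in == ch_out:
        shortcut = size  # identity shortcut: spatial size unchanged
    else:
        shortcut = conv_out(size, kernel=1, stride=2, padding=0)  # 1x1 conv, stride 2
    if main == shortcut:
        size = main
    elif 1 in (main, shortcut):
        size = max(main, shortcut)  # a size-1 dimension broadcasts
    else:
        raise ValueError(f"{name}: cannot add sizes {main} and {shortcut}")
    trace.append((name, ch_out, size))

for name, channels, hw in trace:
    print(f"{name}: [b, {channels}, {hw}, {hw}]")
```

The last entry is [b, 512, 2, 2]. With b = 2, the linear layer treats that tensor as a matrix with 2 * 512 * 2 = 2048 rows and 2 columns, which is the m1: [2048 x 2] in the error message.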

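There are several ways to fix this. We can change the input size of the linear layer to match, or we can add some operation after the last block so that its output matches 512.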

Here is the revised code first; the explanation follows.

class ResNet18(nn.Module):
    def __init__(self):
        super(ResNet18, self).__init__()
        
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=3, padding=0),
            nn.BatchNorm2d(64),
        )
        # followed by 4 blocks (stride=2 roughly halves h and w)
        
        # [b, 64, h, w] => [b, 128, ~h/2, ~w/2]
        self.blk1 = ResBlk(64, 128, stride=2)
        # [b, 128, h, w] => [b, 256, ~h/2, ~w/2]
        self.blk2 = ResBlk(128, 256, stride=2)
        # [b, 256, h, w] => [b, 512, ~h/2, ~w/2]
        self.blk3 = ResBlk(256, 512, stride=2)
        # [b, 512, h, w] => [b, 512, h', w']
        self.blk4 = ResBlk(512, 512, stride=2)
        
        self.outlayer = nn.Linear(512*1*1, 10)
    
    def forward(self, x):
        x = F.relu(self.conv1(x))
        
        # after the four blocks: [b, 64, h, w] => [b, 512, h', w']
        x = self.blk1(x)
        x = self.blk2(x)
        x = self.blk3(x)
        x = self.blk4(x)
        
        # print("after conv:", x.shape) # [b, 512, 2, 2]
        
        # [b, 512, h, w] => [b, 512, 1, 1]
        x = F.adaptive_avg_pool2d(x, [1, 1])
        
        x = x.view(x.size(0), -1) # [b, 512, 1, 1] => [b, 512*1*1]
        x = self.outlayer(x)
        
        return x

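I use the second approach here. After the last block I add an adaptive pooling layer, which turns an input of any width and height into a tensor whose width and height are both 1, leaving the other dimensions unchanged. A reshape then converts [batchsize, 512, 1, 1] into a tensor of size [batchsize, 512*1*1], which matches the following linear layer: its input size is 512 and its output size is 10. The final output shape of the whole network is therefore [batchsize, 10].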
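A quick way to convince yourself of this behavior is to feed feature maps of different sizes through the same two calls. This is a small sketch with random placeholder tensors; the 7×5 size is arbitrary and chosen only to show that the input size does not matter.

```python
import torch
from torch.nn import functional as F

for h, w in [(2, 2), (7, 5)]:
    x = torch.randn(4, 512, h, w)
    pooled = F.adaptive_avg_pool2d(x, [1, 1])  # [4, 512, h, w] => [4, 512, 1, 1]
    flat = pooled.view(pooled.size(0), -1)     # [4, 512, 1, 1] => [4, 512]
    # With a 1x1 target, adaptive average pooling is the mean over h and w
    assert torch.allclose(flat, x.mean(dim=(2, 3)), atol=1e-5)
    print((h, w), tuple(pooled.shape), tuple(flat.shape))
```

Because the pooled size is fixed at 1×1, the linear layer's input size of 512 no longer depends on the resolution of the input image.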

Finally, we copy over the code used earlier to train LeNet5 and change model=LeNet5() to model=ResNet18(). The complete code is as follows:

import torch
from torch import nn, optim
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


batch_size=32
cifar_train = datasets.CIFAR10(root='cifar', train=True, transform=transforms.Compose([
    transforms.Resize([32, 32]),
    transforms.ToTensor(),
]), download=True)

cifar_train = DataLoader(cifar_train, batch_size=batch_size, shuffle=True)

cifar_test = datasets.CIFAR10(root='cifar', train=False, transform=transforms.Compose([
    transforms.Resize([32, 32]),
    transforms.ToTensor(),
]), download=True)
    
cifar_test = DataLoader(cifar_test, batch_size=batch_size, shuffle=True)      

class ResBlk(nn.Module):
    def __init__(self, ch_in, ch_out, stride=1):
        super(ResBlk, self).__init__()
        self.conv1 = nn.Conv2d(ch_in, ch_out, kernel_size=3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(ch_out)
        
        self.conv2 = nn.Conv2d(ch_out, ch_out, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(ch_out)
        
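        if ch_out == ch_in:
            # Identity shortcut. Note: if stride > 1 here, the main branch is
            # downsampled but x is not, so the addition in forward() either
            # fails or silently broadcasts.
            self.extra = nn.Sequential()
        else:
            self.extra = nn.Sequential(
                # The 1×1 convolution changes the channel count of the input x
                # (and downsamples it when stride > 1):
                # [b, ch_in, h, w] => [b, ch_out, ~h/stride, ~w/stride]
                nn.Conv2d(ch_in, ch_out, kernel_size=1, stride=stride),
                nn.BatchNorm2d(ch_out),
            )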
        
    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))

        # short cut
        out = self.extra(x) + out
        out = F.relu(out)
        
        return out
        
class ResNet18(nn.Module):
    def __init__(self):
        super(ResNet18, self).__init__()
        
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=3, padding=0),
            nn.BatchNorm2d(64),
        )
        # followed by 4 blocks (stride=2 roughly halves h and w)
        
        # [b, 64, h, w] => [b, 128, ~h/2, ~w/2]
        self.blk1 = ResBlk(64, 128, stride=2)
        # [b, 128, h, w] => [b, 256, ~h/2, ~w/2]
        self.blk2 = ResBlk(128, 256, stride=2)
        # [b, 256, h, w] => [b, 512, ~h/2, ~w/2]
        self.blk3 = ResBlk(256, 512, stride=2)
        # [b, 512, h, w] => [b, 512, h', w']
        self.blk4 = ResBlk(512, 512, stride=2)
        
        self.outlayer = nn.Linear(512*1*1, 10)
    
    def forward(self, x):
        x = F.relu(self.conv1(x))
        
        # after the four blocks: [b, 64, h, w] => [b, 512, h', w']
        x = self.blk1(x)
        x = self.blk2(x)
        x = self.blk3(x)
        x = self.blk4(x)
        
        # print("after conv:", x.shape) # [b, 512, 2, 2]
        
        # [b, 512, h, w] => [b, 512, 1, 1]
        x = F.adaptive_avg_pool2d(x, [1, 1])
        
        x = x.view(x.size(0), -1) # [b, 512, 1, 1] => [b, 512*1*1]
        x = self.outlayer(x)
        
        return x

def main():

    ##########  train  ##########
    #device = torch.device('cuda')
    #model = ResNet18().to(device)
    criteon = nn.CrossEntropyLoss()
    model = ResNet18()
    optimizer = optim.Adam(model.parameters(), 1e-3)
    for epoch in range(1000):
        model.train()
        for batchidx, (x, label) in enumerate(cifar_train):
            #x, label = x.to(device), label.to(device)
            logits = model(x)
            # logits: [b, 10]
            # label:  [b]
            loss = criteon(logits, label)
            
            # backward
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        
        print('train:', epoch, loss.item())
        
        ########## test  ##########
        model.eval()
        with torch.no_grad():
            total_correct = 0
            total_num = 0
            for x, label in cifar_test:
                # x, label = x.to(device), label.to(device)

                # logits: [b, 10]
                logits = model(x)
                # [b]
                pred = logits.argmax(dim=1)
                # [b] vs [b]
                total_correct += torch.eq(pred, label).float().sum().item()
                total_num += x.size(0)
            acc = total_correct / total_num
            print('test:', epoch, acc)

if __name__ == '__main__':
    main()

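Compared with LeNet, ResNet's accuracy improves much faster, but the additional layers inevitably increase the running time: without a GPU, one epoch takes about 15 minutes. Readers can modify the network structure on this basis and apply some tricks, for example normalizing the images at the very start with Normalize.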
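As a rough sketch of what that normalization does, the snippet below computes per-channel statistics on a batch and applies (x - mean) / std, which is the per-channel operation that transforms.Normalize(mean, std) performs. The helper names are hypothetical, and the random batch is only a placeholder for real CIFAR-10 images after ToTensor(); in practice the statistics would be computed once on the training set.

```python
import torch


def channel_stats(images):
    """Per-channel mean and std of a batch shaped [b, 3, h, w]."""
    mean = images.mean(dim=(0, 2, 3))
    std = images.std(dim=(0, 2, 3))
    return mean, std


def normalize(images, mean, std):
    """Apply (x - mean) / std per channel."""
    return (images - mean[None, :, None, None]) / std[None, :, None, None]


torch.manual_seed(0)
batch = torch.rand(64, 3, 32, 32)  # placeholder: values in [0, 1] like ToTensor() output
mean, std = channel_stats(batch)
normed = normalize(batch, mean, std)
new_mean, new_std = channel_stats(normed)
print(new_mean, new_std)  # close to 0 and 1 for every channel
```

In the training script this would correspond to appending transforms.Normalize(mean, std) after transforms.ToTensor() in both the train and test pipelines, using the same statistics for both.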

