PyTorch Deep Learning Practice, Part 10: Convolutional Neural Networks (Advanced)

Inception

Searching for good hyperparameters (such as the kernel size) is difficult, so GoogLeNet builds the different candidate configurations into a single Inception block; during training, the weights of the branches whose hyperparameters work well naturally grow larger.

Concatenate joins the tensors computed by the four branches.

The branches may have different numbers of channels, but they must produce the same width and height.

The pooling branch can also keep the output size unchanged by setting padding=1 and stride=1 (with a 3×3 pooling kernel).

For a 1×1 convolution, each kernel has one weight per input channel, so its size is determined by the number of channels of the input tensor.

The information fusion performed by a 1×1 convolution happens across channels: at every pixel location, the values of all input channels are combined.

1×1 convolutions mainly address the problem of excessive computation: they reduce the number of channels fed into the next layer.
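
As a rough illustration of why this matters, the snippet below compares multiplication counts for a direct 5×5 convolution against a 1×1 bottleneck followed by the same 5×5 convolution. The channel counts (192, 16, 32) and the 28×28 feature map are the illustrative numbers commonly used when explaining GoogLeNet, not values taken from this post's code:

```python
# Rough multiplication counts on a 28x28 feature map (illustrative, assumed numbers):
# (a) direct 5x5 convolution, 192 -> 32 channels
direct = 5 * 5 * 28 * 28 * 192 * 32                                  # about 120 million
# (b) 1x1 convolution 192 -> 16 first, then 5x5 convolution 16 -> 32
bottleneck = 1 * 1 * 28 * 28 * 192 * 16 + 5 * 5 * 28 * 28 * 16 * 32  # about 12 million
print(direct, bottleneck)   # 120422400 12443648
```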

GoogLeNet

(Figure: the GoogLeNet architecture)

To determine the size of the final output (and thus the input size of the linear layer), a common trick is to leave out the linear layer at first, instantiate the network, run one forward pass, and print the output size.

When writing the network, also add a checkpointing step: every time the accuracy reaches a new high, save a backup of the model's weights, to guard against crashes or other accidents.
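
A minimal sketch of that idea (it reuses the model, train() and test() names from the script below, assumes test() is changed to return the accuracy instead of only printing it, and the file name best_model.pth is arbitrary):

```python
import torch

best_acc = 0.0
for epoch in range(10):
    train(epoch)
    acc = test()                     # assumes test() returns the accuracy
    if acc > best_acc:
        best_acc = acc
        torch.save(model.state_dict(), 'best_model.pth')   # back up the best weights so far
```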

import torch
import torch.nn as nn
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.nn.functional as F
import torch.optim as optim

# prepare dataset

batch_size = 64
# Compose argument list: convert to tensor; normalize with the given mean and std
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])

train_dataset = datasets.MNIST(root='../dataset/mnist/', train=True, download=False, transform=transform)
train_loader = DataLoader(train_dataset, shuffle=True, batch_size=batch_size)
test_dataset = datasets.MNIST(root='../dataset/mnist/', train=False, download=False, transform=transform)
test_loader = DataLoader(test_dataset, shuffle=False, batch_size=batch_size)


# design model using class
# network in network
class InceptionA(nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()
        # four branches
        # 1. 1x1 convolution
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)

        # 2. 1x1 convolution + 5x5 convolution
        self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)  # keep the spatial size: kernel=5 -> padding=2

        # 3. 1x1 convolution + 3x3 convolution + 3x3 convolution
        self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)  # keep the spatial size: kernel=3 -> padding=1
        self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)

        # 4. pooling (a function, called in forward) + 1x1 convolution
        self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)
        branch3x3 = self.branch3x3_3(branch3x3)

        # pooling has no trainable parameters, so it is only called as a function in forward
        # pooling can also keep the spatial size unchanged: kernel_size=3 with padding=1, and stride=1
        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        # Concatenate
        outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
        return torch.cat(outputs, dim=1)  # (b, c, w, h): c is dim=1, so concatenate along the channel dimension


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(88, 20, kernel_size=5)  # 88 = 24x3 + 16

        self.incep1 = InceptionA(in_channels=10)  # matches the 10 output channels of conv1
        self.incep2 = InceptionA(in_channels=20)  # matches the 20 output channels of conv2

        self.mp = nn.MaxPool2d(2)      # max pooling
        self.fc = nn.Linear(1408, 10)  # fully connected layer

    def forward(self, x):
        in_size = x.size(0)
        x = F.relu(self.mp(self.conv1(x)))
        x = self.incep1(x)
        x = F.relu(self.mp(self.conv2(x)))
        x = self.incep2(x)
        x = x.view(in_size, -1)  # -1 lets view infer the number of columns from the batch size and the tensor's data
        x = self.fc(x)

        return x


model = Net()

# construct loss and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)


# training cycle
def train(epoch):
    running_loss = 0.0
    # enumerate() turns an iterable into an indexed sequence, yielding both the index and the data
    for batch_idx, data in enumerate(train_loader, 0):
        # data contains the input images (tensor) and the labels (tensor)
        inputs, target = data
        optimizer.zero_grad()
        # forward
        outputs = model(inputs)
        loss = criterion(outputs, target)
        # backward
        loss.backward()
        # update
        optimizer.step()

        running_loss += loss.item()
        if batch_idx % 300 == 299:
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0


def test():
    correct = 0
    total = 0
    with torch.no_grad():  # no gradient computation is needed during evaluation
        for data in test_loader:
            images, labels = data
            outputs = model(images)
            _, predicted = torch.max(outputs.data, dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print('accuracy on test set: %d %% ' % (100 * correct / total))


if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        test()
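
Tying back to the earlier advice (instantiate the network and run one forward pass to see the output size), a quick sanity check that reuses the Net class defined above shows where the 1408 in self.fc comes from:

```python
net = Net()
dummy = torch.zeros(1, 1, 28, 28)     # one fake MNIST-sized image
x = F.relu(net.mp(net.conv1(dummy)))  # -> (1, 10, 12, 12)
x = net.incep1(x)                     # -> (1, 88, 12, 12)
x = F.relu(net.mp(net.conv2(x)))      # -> (1, 20, 4, 4)
x = net.incep2(x)                     # -> (1, 88, 4, 4)
print(x.view(1, -1).shape)            # torch.Size([1, 1408]) = 88 * 4 * 4
```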

`torch.max()[0]` returns only the maximum values.

`torch.max()[1]` returns only the indices of the maximum values.

`torch.max()[1].data` returns only the data part of the variable (dropping the "Variable containing:" wrapper).

`torch.max()[1].data.numpy()` converts the data to a NumPy ndarray.

`torch.max()[1].data.numpy().squeeze()` removes the dimensions of size 1.
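
A small example of what these calls return, using an arbitrary 2×3 tensor of scores:

```python
import torch

logits = torch.tensor([[0.1, 2.0, 0.3],
                       [1.5, 0.2, 0.4]])
values, indices = torch.max(logits, dim=1)
print(values)    # tensor([2.0000, 1.5000])  -> the maximum value in each row
print(indices)   # tensor([1, 0])            -> the index of each maximum (the predicted class)
```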

Residual Network (ResNet)

As the number of layers increases, the gradient updates for the modules closest to the input become smaller and smaller: the gradient is a product of many per-layer factors, and when those factors are below 1 the product shrinks toward zero, which can cause vanishing gradients.

To counter vanishing gradients, a skip connection is added before the activation: the block's input is added to the output of the weight layers, and only then is the activation applied.

When using a Residual Block, the input and output must have the same number of channels, so that they can be added element-wise.

A Residual Block essentially wraps a group of weight layers, as in the sketch below.
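
A minimal sketch of such a block, consistent with the description above: two 3×3 convolutions that keep the channel count and spatial size unchanged, with the skip connection added before the final ReLU.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        # both convolutions keep channel count and spatial size unchanged,
        # so the input x can be added to the output directly
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        return F.relu(x + y)   # add the skip connection before the activation
```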

When writing a neural network, also write test routines that check whether the outputs match what you expect, and grow the network step by step (incremental development).


  1. Revisit deep learning theory from both a mathematical and an engineering perspective; read the "Deep Learning" book (the "flower book").
  2. Read through the PyTorch documentation.
  3. Reproduce classic work and papers: first read the code, then write the code.
  4. Broaden your horizons.

Two papers:

  1. He K., Zhang X., Ren S., et al. Identity Mappings in Deep Residual Networks[C].
  2. Huang G., Liu Z., van der Maaten L., et al. Densely Connected Convolutional Networks[J]. 2016: 2261-2269.