PyTorch Deep Learning Practice, Part 9: Multiclass Classification

Binary vs. Multiclass Classification

  1. In a multiclass problem the outputs must suppress one another (the class probabilities compete), so you cannot simply run a binary classifier n times, once per target class.

  2. For binary 0/1 classification it is enough to estimate the probability of a single class (the other is its complement); for multiclass classification we need to compare whole distributions over the classes.

  3. Hidden layers use sigmoid-style activations, while the final output layer uses Softmax to produce a distribution: every final output z is mapped to a value greater than 0, and the mapped values sum to 1 (first make them positive, then normalize); see the first sketch after this list.


  4. When using the cross-entropy loss, do not apply an activation to the last linear layer's output: CrossEntropyLoss works on raw logits and applies LogSoftmax internally.


  5. Understand the relationship between CrossEntropyLoss and LogSoftmax + NLLLoss: CrossEntropyLoss is equivalent to LogSoftmax followed by NLLLoss; see the second sketch after this list.

    https://pytorch.org/docs/stable/nn.html#crossentropyloss

    https://pytorch.org/docs/stable/nn.html#nllloss
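
A minimal sketch of the "exponentiate, then normalize" step from item 3, assuming a PyTorch environment; softmax_by_hand is just an illustrative name:

import torch

def softmax_by_hand(z):
    exp_z = torch.exp(z)        # exponentiate: every entry becomes positive
    return exp_z / exp_z.sum()  # normalize: divide by the sum so entries add up to 1

z = torch.tensor([0.2, 0.1, -0.1])
p = softmax_by_hand(z)
print(p)                         # tensor([0.3780, 0.3420, 0.2800]) approximately
print(p.sum())                   # tensor(1.)
print(torch.softmax(z, dim=0))   # matches the manual version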
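
And a small sketch for item 5 (again assuming a PyTorch environment) showing that CrossEntropyLoss on raw logits equals NLLLoss applied to LogSoftmax outputs:

import torch

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 1.5, 0.3]])  # raw scores: 2 samples, 3 classes
target = torch.tensor([0, 1])             # class indices

# CrossEntropyLoss takes raw logits and applies LogSoftmax internally
ce = torch.nn.CrossEntropyLoss()(logits, target)

# Equivalent: LogSoftmax first, then the negative log-likelihood loss
log_probs = torch.nn.LogSoftmax(dim=1)(logits)
nll = torch.nn.NLLLoss()(log_probs, target)

print(ce.item(), nll.item())  # the two values match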

Code Implementation

import torch
from torchvision import transforms # utilities for preprocessing images
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.nn.functional as F
import torch.optim as optim

# prepare dataset

batch_size = 64
# Neural networks train best on standardized inputs (roughly normally distributed around 0)
# Compose builds a pipeline object; its argument is a list [] of transforms applied in order
# transforms.ToTensor(): PIL Image => PyTorch Tensor; reorders HxWxC to CxHxW and scales pixel values to [0, 1]
# transforms.Normalize((mean,), (std,)): standardize with MNIST's precomputed mean and std, x = (x - mean) / std, which stabilizes training
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])

train_dataset = datasets.MNIST(root='../dataset/mnist/', train=True, download=True, transform=transform)
train_loader = DataLoader(train_dataset, shuffle=True, batch_size=batch_size)
test_dataset = datasets.MNIST(root='../dataset/mnist/', train=False, download=True, transform=transform)
test_loader = DataLoader(test_dataset, shuffle=False, batch_size=batch_size)
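
# Shape sanity check (a sketch, assuming the loaders above): each batch from
# train_loader is a pair (inputs, targets) with inputs of shape [64, 1, 28, 28]
# and targets of shape [64]; the last batch may be smaller.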


# design model using class

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.linear1 = torch.nn.Linear(784, 512)
        self.linear2 = torch.nn.Linear(512, 256)
        self.linear3 = torch.nn.Linear(256, 128)
        self.linear4 = torch.nn.Linear(128, 64)
        self.linear5 = torch.nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(-1, 784)  # -1 infers the mini-batch size N automatically: reshapes [N, 1, 28, 28] into [N, 784]
        x = F.relu(self.linear1(x))
        x = F.relu(self.linear2(x))
        x = F.relu(self.linear3(x))
        x = F.relu(self.linear4(x))
        return self.linear5(x)  # no activation on the last layer: CrossEntropyLoss expects raw logits


model = Net()

# construct loss and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)  # momentum term accelerates SGD


# training cycle

def train(epoch):
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data
        optimizer.zero_grad()

        # forward
        outputs = model(inputs)
        loss = criterion(outputs, target)
        # backward
        loss.backward()
        # update
        optimizer.step()

        running_loss += loss.item()
        if batch_idx % 300 == 299:
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0


def test():
    correct = 0
    total = 0
    with torch.no_grad():  # code wrapped in this block builds no computational graph
        for data in test_loader:
            images, labels = data
            outputs = model(images)
            _, predicted = torch.max(outputs.data, dim=1)  # dim=0 runs down columns, dim=1 along rows; returns (max values, their indices)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()  # sum over the batch: number of correct predictions
    print('accuracy on test set: %d %% ' % (100 * correct / total))


if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        test()

# [1, 300] loss: 2.244
# [1, 600] loss: 1.005
# [1, 900] loss: 0.437
# ..........................
# [8, 300] loss: 0.045
# [8, 600] loss: 0.051
# [8, 900] loss: 0.048
# accuracy on test set: 97 %
# [9, 300] loss: 0.034
# [9, 600] loss: 0.039
# [9, 900] loss: 0.044
# accuracy on test set: 97 %
# [10, 300] loss: 0.030
# [10, 600] loss: 0.027
# [10, 900] loss: 0.036
# accuracy on test set: 96 %
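
After training, the model can classify individual images. A minimal inference sketch, assuming the trained model and the test_dataset defined above:

image, label = test_dataset[0]          # image: [1, 28, 28], label: int
with torch.no_grad():
    logits = model(image.view(1, 784))  # add a batch dimension: [1, 784]
    predicted = logits.argmax(dim=1).item()
print('predicted:', predicted, 'actual:', label)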

  1. torch.no_grad() and the usage of Python's with statement (context managers)
  2. The different uses of underscores in Python (here, _ discards the unneeded max values)
  3. The usage of torch.max(); see the demo after this list
  4. Training on images with a fully connected network wastes the chance to exploit local structure: every pixel is connected to every other pixel, and links between far-apart pixels are mostly unnecessary.
  5. Image feature extraction: the Fourier transform (drawback: its basis functions are all sinusoids) and wavelets. These are hand-crafted features; CNNs extract features automatically.
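
A short demo of torch.max with dim=1 (a sketch, assuming a PyTorch environment), as referenced in item 3 above:

import torch

outputs = torch.tensor([[0.1, 2.0, 0.3],
                        [1.5, 0.2, 0.9]])
values, indices = torch.max(outputs, dim=1)  # max along each row
print(values)   # tensor([2.0000, 1.5000])
print(indices)  # tensor([1, 0]) -- the predicted class indices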