下面的例子是简单的定义,forward
写得有点好看。
import torch.nn as nn
import torch.nn.functional as F
class Model(nn.Module):
def __init__(self):
super(Model, self).__init__()
self.conv1 = nn.Conv2d(1, 20, 5)
self.conv2 = nn.Conv2d(20, 20, 5)
def forward(self, x):
x = F.relu(conv1(x))
return F.relu(conv2(x))
A sequential container.
Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.
例子里面给了用OrderedDict
传入modules的方法,好处是可以起名。
# Example of using Sequential
model = nn.Sequential(
nn.Conv2d(1, 20, 5),
nn.ReLU(),
nn.Conv2d(20, 64, 5),
nn.ReLU()
)
# Example of using Sequential with OrderedDict
model = nn.Sequential(OrderedDict([
('conv1', nn.Conv2d(1, 20, 5)),
('ReLU1', nn.ReLU()),
('conv2', nn.Conv2d(20, 64, 5)),
('ReLU2', nn.ReLU())
]))
Holds submodules in a list.
ModuleList can be indexed like a regular Python list, but modules it contains are properly registered, and will be visible by all Module methods.
ModuleList
里面可以用循环定义,forward
时也可以用enumerate
套娃。
class MyModel(nn.Module):
def __init__(self):
super(MyModel, self).__init__()
self.linears = nn.ModuleList([nn.linear for i in range(10)])
# ModuleList can act as an iterable, or be indexed using ints
def forward(self, x):
for i, l in enumerate(self.linears):
x = self.linears[i // 2](x) + l(x)
return x
nn.Sequential
里面的模块按照顺序进行排列的,所以必须确保前一个模块的输出大小和下一个模块的输入大小是一致的。下面的例子:
import torch.nn as nn
class net_seq(nn.Module):
def __init__(self):
super(net_seq, self).__init__()
self.seq = nn.Sequential(
nn.Conv2d(1,20,5),
nn.ReLU(),
nn.Conv2d(20,64,5),
nn.ReLU()
)
def forward(self, x):
return self.seq(x)
net_seq = net_seq()
print(net_seq)
输出的结果是:
net_seq(
(seq): Sequential(
(0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
(1): ReLU()
(2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
(3): ReLU()
)
)
这里还是序号。
nn.Sequential
中可以使用OrderedDict
来指定每个module的名字,而不是采用默认的命名方式(按序号 0,1,2,3…),例子:
from collections import OrderedDict
class net_seq(nn.Module):
def __init__(self):
super(net_seq, self).__init__()
self.seq = nn.Sequential(OrderedDict([
('conv1', nn.Conv2d(1,20,5)),
('relu1', nn.ReLU()),
('conv2', nn.Conv2d(20,64,5)),
('relu2', nn.ReLU())
]))
def forward(self, x):
return self.seq(x)
net_seq = net_seq()
print(net_seq)
结果为:
net_seq(
(seq): Sequential(
(conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
(relu1): ReLU()
(conv2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
(relu2): ReLU()
)
)
这里已经显示名字而不是序号了。
nn.ModuleList
,它是一个储存不同 module,并自动将每个 module 的 parameters 添加到网络之中的容器。你可以把任意 nn.Module
的子类 (比如 nn.Conv2d
, nn.Linear
之类的) 加到这个 list 里面,方法和 Python 自带的 list 一样,无非是 extend
,append
等操作。但不同于一般的 list,加入到 nn.ModuleList
里面的 module 是会自动注册到整个网络上的,同时 module 的 parameters 也会自动添加到整个网络中。若使用python的list,则会出问题。下面看一个例子:
class net_modlist(nn.Module):
def __init__(self):
super(net_modlist, self).__init__()
self.modlist = nn.ModuleList([
nn.Conv2d(1, 20, 5),
nn.ReLU(),
nn.Conv2d(20, 64, 5),
nn.ReLU()
])
def forward(self, x):
for m in self.modlist:
x = m(x)
return x
net_modlist = net_modlist()
print(net_modlist)
运行结果是:
net_modlist(
(modlist): ModuleList(
(0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
(1): ReLU()
(2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
(3): ReLU()
)
)
再来看看这个网络的权重 (weithgs) 和偏置 (bias):
for param in net_modlist.parameters():
print(type(param.data), param.size())
这些para都被注册了:
<class 'torch.Tensor'> torch.Size([20, 1, 5, 5])
<class 'torch.Tensor'> torch.Size([20])
<class 'torch.Tensor'> torch.Size([64, 20, 5, 5])
<class 'torch.Tensor'> torch.Size([64])
但是用Python 自带的 list就会有问题:
class net_modlist(nn.Module):
def __init__(self):
super(net_modlist, self).__init__()
self.modlist = [
nn.Conv2d(1, 20, 5),
nn.ReLU(),
nn.Conv2d(20, 64, 5),
nn.ReLU()
]
def forward(self, x):
for m in self.modlist:
x = m(x)
return x
net_modlist = net_modlist()
print(net_modlist)
for param in net_modlist.parameters():
print(type(param.data), param.size())
输出的结果是:
net_modlist()
显然这里面的参数都没有被注册进网络。当然,我们还是可以使用 forward 来计算输出结果。但是如果用其实例化的网络进行训练的时候,因为这些层的parameters不在整个网络之中,所以其网络参数也不会被更新,也就是无法训练。
继续剽窃例子:
seq = nn.Sequential(
nn.Conv2d(1,20,5),
nn.ReLU(),
nn.Conv2d(20,64,5),
nn.ReLU()
)
input = torch.randn(16, 1, 20, 20)
print(seq(input).size())
输出为:
torch.Size([16, 64, 12, 12])
可行。
继承nn.Module
类的话,就要写出forward
函数:
class net1(nn.Module):
def __init__(self):
super(net1, self).__init__()
self.seq = nn.Sequential(
nn.Conv2d(1,20,5),
nn.ReLU(),
nn.Conv2d(20,64,5),
nn.ReLU()
)
def forward(self, x):
return self.seq(x)
input = torch.randn(16, 1, 20, 20)
net1 = net1()
print(net1(input).shape)
输出结果为:
torch.Size([16, 64, 12, 12])
否则则会报出NotImplementedError
错误。
当然对于贯序模型,也可用循环的方式定义forward
函数:
def forward(self, x):
for s in self.seq:
x = s(x)
return x
而对于nn.ModuleList
,不写forward
方法就会出现NotImplementedError
错误:
modlist = nn.ModuleList([
nn.Conv2d(1, 20, 5),
nn.ReLU(),
nn.Conv2d(20, 64, 5),
nn.ReLU()
])
input = torch.randn(16, 1, 20, 20)
print(modlist(input))
所以要定义forward
函数:
class net2(nn.Module):
def __init__(self):
super(net2, self).__init__()
self.modlist = nn.ModuleList([
nn.Conv2d(1, 20, 5),
nn.ReLU(),
nn.Conv2d(20, 64, 5),
nn.ReLU()
])
def forward(self, x):
for m in self.modlist:
x = m(x)
return x
input = torch.randn(16, 1, 20, 20)
net2 = net2()
print(net2(input).shape)
注意只能按照下面利用for循环的方式,不能用下面这种方式,否则还是会出NotImplementedError
错误:
def forward(self, x):
return self.modlist(x)
如果完全直接用nn.Sequential
,确实是可以的,但这么做的代价就是失去了部分灵活性,不能自己去定制 forward
函数里面的内容了。
一般情况下 nn.Sequential
的用法是来组成卷积块 (block),然后像拼积木一样把不同的 block 拼成整个网络,让代码更简洁,更加结构化。
nn.Sequential
里面的模块按照顺序进行排列的,所以必须确保前一个模块的输出大小和下一个模块的输入大小是一致的。而nn.ModuleList
并没有定义一个网络,它只是将不同的模块储存在一起,这些模块之间并没有什么先后顺序可言。例子如下:
class net3(nn.Module):
def __init__(self):
super(net3, self).__init__()
self.linears = nn.ModuleList([nn.Linear(10,20), nn.Linear(20,30), nn.Linear(5,10)])
def forward(self, x):
x = self.linears[2](x)
x = self.linears[0](x)
x = self.linears[1](x)
return x
net3 = net3()
print(net3)
input = torch.randn(32, 5)
print(net3(input).shape)
输出的结果为:
net3(
(linears): ModuleList(
(0): Linear(in_features=10, out_features=20, bias=True)
(1): Linear(in_features=20, out_features=30, bias=True)
(2): Linear(in_features=5, out_features=10, bias=True)
)
)
torch.Size([32, 30])
可以看到,print(net3)
中的结果是按照self.linears
中的module存放顺序来的,实际上和网络的执行顺序(拓扑)不一样,否则就会报输入输出不匹配的错误。
用 for 循环来创建网络中相似或者重复的层,需要用到ModuleList
:
class net4(nn.Module):
def __init__(self):
super(net4, self).__init__()
layers = [nn.Linear(10, 10) for i in range(5)]
self.linears = nn.ModuleList(layers)
def forward(self, x):
for layer in self.linears:
x = layer(x)
return x
net = net4()
print(net)
输出的结果为:
net4(
(linears): ModuleList(
(0): Linear(in_features=10, out_features=10, bias=True)
(1): Linear(in_features=10, out_features=10, bias=True)
(2): Linear(in_features=10, out_features=10, bias=True)
(3): Linear(in_features=10, out_features=10, bias=True)
(4): Linear(in_features=10, out_features=10, bias=True)
)
)