How do I initialize weights in PyTorch?

How do I initialize the weights and biases of a network (for example, with He or Xavier initialization)?

Current answer

Since I don't have enough reputation yet, I can't add a comment under the answer posted by prosti on Jun 26 '19 at 13:16, which quotes this code:

    def reset_parameters(self):
        init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in)
            init.uniform_(self.bias, -bound, bound)

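However, I want to point out that some of the assumptions in Kaiming He's paper, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification", do not actually hold, even though the deliberately designed initialization method works well in practice.

For example, in the Backward Propagation Case subsection, the authors assume that $w_l$ and $\delta y_l$ are independent of each other. But take the score map $\delta y^L_i$ as an example: with a typical cross-entropy loss objective, it is usually $y_i - \mathrm{softmax}(y^L_i) = y_i - \mathrm{softmax}(w^L_i x^L_i)$, which clearly depends on the weights.

So I think the true underlying reason why He initialization works so well remains to be uncovered, because everyone has witnessed its power in boosting deep learning training.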
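The dependence on the weights can be checked numerically. The following is only a minimal sketch with made-up shapes and labels (`x`, `W` and `target` are hypothetical, not taken from the paper). It uses autograd to confirm that the gradient of a summed cross-entropy loss with respect to the last-layer scores is `softmax(scores) - one_hot(target)`, which is the expression above up to sign, and is therefore a function of `W`:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical data: 4 samples, 5 features, 3 classes.
x = torch.randn(4, 5)
W = torch.randn(3, 5, requires_grad=True)  # stand-in for the last-layer weights
target = torch.tensor([0, 2, 1, 1])

logits = x @ W.T          # last-layer scores
logits.retain_grad()      # keep the gradient of this non-leaf tensor
loss = F.cross_entropy(logits, target, reduction="sum")
loss.backward()

# Analytic gradient of the loss with respect to the scores.
one_hot = F.one_hot(target, num_classes=3).float()
expected = F.softmax(logits.detach(), dim=1) - one_hot

grad_matches = torch.allclose(logits.grad, expected, atol=1e-6)
print(grad_matches)
```

Because `expected` is computed from `x @ W.T`, changing `W` changes the back-propagated signal, which is the point being made about the independence assumption.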

2020-03-09 02:00:56

Other answers

Single layer

To initialize the weights of a single layer, use a function from torch.nn.init. For instance:

conv1 = torch.nn.Conv2d(...)
torch.nn.init.xavier_uniform_(conv1.weight)

Alternatively, you can modify the parameters by writing to conv1.weight.data (which is a torch.Tensor). Example:

conv1.weight.data.fill_(0.01)

The same applies to biases:

conv1.bias.data.fill_(0.01)
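Writing through `.data` still works, but a commonly used alternative is to do the in-place update inside `torch.no_grad()`, or to call the matching `torch.nn.init` function. A minimal sketch, where the layer sizes are arbitrary values chosen for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical layer sizes, chosen only for illustration.
conv1 = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)

# In-place update without recording the operation in autograd.
with torch.no_grad():
    conv1.weight.fill_(0.01)
    conv1.bias.zero_()

# The same result using the helpers from torch.nn.init.
nn.init.constant_(conv1.weight, 0.01)
nn.init.zeros_(conv1.bias)
```

Both variants leave `conv1.weight` as a trainable parameter; only its values change.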

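nn.Sequential or custom nn.Module

Pass an initialization function to torch.nn.Module.apply. It will initialize the weights in the entire nn.Module recursively.

apply(fn): Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also torch-nn-init).

Example: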

def init_weights(m):
    if isinstance(m, nn.Linear):
        torch.nn.init.xavier_uniform_(m.weight)
        m.bias.data.fill_(0.01)

net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
net.apply(init_weights)
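Since the question also asks about He initialization, the same `apply` pattern can be sketched with `kaiming_normal_`. The function name `init_he` and the network below are hypothetical, and zeroing the biases is one common choice, not the only one:

```python
import torch
import torch.nn as nn

def init_he(m):
    # He (Kaiming) initialization for layers that feed into a ReLU.
    if isinstance(m, (nn.Linear, nn.Conv2d)):
        nn.init.kaiming_normal_(m.weight, mode="fan_in", nonlinearity="relu")
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# Hypothetical small network used only to demonstrate the call.
net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
net.apply(init_he)
```

The `nn.ReLU` module is visited by `apply` as well, but the `isinstance` check skips it because it has no weights.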

2018-03-22 16:34:42

Here is a better way: just pass your whole model.

import torch.nn as nn


def initialize_weights(model):
    # Initializes weights according to the DCGAN paper
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d, nn.BatchNorm2d)):
            nn.init.normal_(m.weight.data, 0.0, 0.02)
        # If you also want to cover linear layers, add one more elif condition

2021-02-14 07:52:39

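To initialize layers, you typically don't need to do anything.

PyTorch will do it for you. If you think about it, this makes a lot of sense. Why should we initialize layers ourselves when PyTorch can do it following the latest trends?

For instance, the Linear layer's __init__ method calls reset_parameters, which performs Kaiming He initialization: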

init.kaiming_uniform_(self.weight, a=math.sqrt(5))
if self.bias is not None:
    fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
    bound = 1 / math.sqrt(fan_in) if fan_in > 0 else 0
    init.uniform_(self.bias, -bound, bound)

The same holds for other layer types, such as Conv2d.

NOTE: The benefit of proper initialization is faster training. If your problem calls for a special initialization, you can still apply it afterwards.
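To see what the default amounts to: with `a=math.sqrt(5)`, the Kaiming-uniform bound works out to `1 / sqrt(fan_in)`, the same bound used for the bias. The sketch below checks this on a freshly constructed layer. The sizes are arbitrary, and it assumes a PyTorch version whose `nn.Linear` uses the `reset_parameters` code quoted above:

```python
import math
import torch
import torch.nn as nn

# Hypothetical sizes; 64 inputs gives a bound of exactly 0.125.
layer = nn.Linear(in_features=64, out_features=32)

# gain = sqrt(2 / (1 + a^2)) = sqrt(1/3) for a = sqrt(5)
# bound = sqrt(3) * gain / sqrt(fan_in) = 1 / sqrt(fan_in)
bound = 1 / math.sqrt(layer.in_features)

# Small tolerance for floating-point rounding.
weight_within = bool(layer.weight.abs().max() <= bound + 1e-6)
bias_within = bool(layer.bias.abs().max() <= bound + 1e-6)
print(weight_within, bias_within)
```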

2019-06-26 13:16:33

If you want some extra flexibility, you can also set the weights manually.

Say you have an input of all ones:

import torch
import torch.nn as nn

input = torch.ones((8, 8))
print(input)

tensor([[1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.]])

And you want to make a dense layer with no bias (so the result is easy to inspect):

d = nn.Linear(8, 8, bias=False)

Set all the weights to 0.5 (or any other value):

d.weight.data = torch.full((8, 8), 0.5)
print(d.weight.data)

The weights:

Out[14]: 
tensor([[0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000],
        [0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000],
        [0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000],
        [0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000],
        [0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000],
        [0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000],
        [0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000],
        [0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000]])

Your weights are now all 0.5. Pass the data through:

d(input)

Out[13]: 
tensor([[4., 4., 4., 4., 4., 4., 4., 4.],
        [4., 4., 4., 4., 4., 4., 4., 4.],
        [4., 4., 4., 4., 4., 4., 4., 4.],
        [4., 4., 4., 4., 4., 4., 4., 4.],
        [4., 4., 4., 4., 4., 4., 4., 4.],
        [4., 4., 4., 4., 4., 4., 4., 4.],
        [4., 4., 4., 4., 4., 4., 4., 4.],
        [4., 4., 4., 4., 4., 4., 4., 4.]], grad_fn=<MmBackward>)

Remember that each neuron receives 8 inputs, each with weight 0.5 and value 1 (and there is no bias), so the sum is 8 × 0.5 × 1 = 4 for each neuron.
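The same walkthrough can be condensed into a self-contained check. This sketch uses `nn.init.constant_` in place of assigning to `.data`, and compares the layer output with the explicit matrix product that a bias-free `nn.Linear` computes:

```python
import torch
import torch.nn as nn

x = torch.ones(8, 8)
d = nn.Linear(8, 8, bias=False)
nn.init.constant_(d.weight, 0.5)   # every weight becomes 0.5

out = d(x)
manual = x @ d.weight.T            # what the layer computes without a bias

all_four = torch.allclose(out, torch.full((8, 8), 4.0))
same_as_manual = torch.allclose(out, manual)
print(all_four, same_as_manual)
```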

2019-12-22 03:43:07

Sorry for being so late; I hope my answer helps.

To initialize weights with a normal distribution, use:

torch.nn.init.normal_(tensor, mean=0, std=1)

Or, to fill the tensor with a constant value:

torch.nn.init.constant_(tensor, value)

Or, to use a uniform distribution:

torch.nn.init.uniform_(tensor, a=0, b=1) # a: lower_bound, b: upper_bound

torch.nn.init provides other methods for initializing tensors as well.
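As a rough illustration of how these functions are used, here they are applied to freshly allocated tensors. The shape and the numeric arguments are arbitrary, and `xavier_uniform_` is included as one example of the other available methods:

```python
import math
import torch
import torch.nn as nn

shape = (3, 5)  # hypothetical shape: fan_out = 3, fan_in = 5

w_normal = nn.init.normal_(torch.empty(shape), mean=0.0, std=1.0)
w_const = nn.init.constant_(torch.empty(shape), 0.3)
w_uniform = nn.init.uniform_(torch.empty(shape), a=-0.1, b=0.1)
w_xavier = nn.init.xavier_uniform_(torch.empty(shape))

# Xavier-uniform samples from [-bound, bound] with
# bound = sqrt(6 / (fan_in + fan_out)).
xavier_bound = math.sqrt(6.0 / (5 + 3))
print(w_uniform.abs().max() <= 0.1, w_xavier.abs().max() <= xavier_bound + 1e-6)
```

Each function modifies its argument in place (hence the trailing underscore) and also returns it, which is why the results can be assigned directly.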

2018-09-27 22:12:24
