pytorch nn.Linear gives `Only Tensors of floating point and complex dtype can require gradients`

i5desfxk · asked 2023-11-18

I am trying to understand how nn.Embedding works like nn.Linear when the input is a one-hot vector.
Consider the input [0,0,0,1,0,0], a one-hot vector corresponding to index 3. So I first created both forms of the input:

import torch
import torch.nn as nn

d_vocabSize, emb_size = 6, 4  # vocabulary size and embedding dimension used below

_in = torch.tensor([0, 0, 0, 1, 0, 0]).long()  # one-hot form, used later
_index = torch.LongTensor([3])                 # index form

Then I tried nn.Embedding:

customEmb = torch.tensor([[1,1,1,1],[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5],[6,6,6,6]]).float()
emb = nn.Embedding(d_vocabSize, emb_size) # 6x4
emb.weight = torch.nn.Parameter(customEmb)

print('----- hidden layer / embedding -----')
print(emb.weight)

print('\n----- embedding(input) ------')
hidden = emb(_index)
print(hidden)


This correctly outputs the row at index 3 of the embedding, [4., 4., 4., 4.]:

----- hidden layer / embedding -----
Parameter containing:
tensor([[1., 1., 1., 1.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.]], requires_grad=True)

----- embedding(input) ------
tensor([[4., 4., 4., 4.]], grad_fn=<EmbeddingBackward0>)
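As a side note, the lookup above is literally row indexing into the weight matrix; a minimal sketch, reusing emb and _index from above:

# nn.Embedding is a lookup table: the forward pass just selects rows of emb.weight
assert torch.equal(emb(_index), emb.weight[_index])
print(emb.weight[3])  # tensor([4., 4., 4., 4.], ...)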


I tried something similar with nn.Linear:

print('\n----- hidden layer / embedding -----')
customEmb = torch.tensor([[1,1,1,1],[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5],[6,6,6,6]]).long()
emb = nn.Linear(d_vocabSize, emb_size) # 6x4
emb.weight = torch.nn.Parameter(customEmb.T)
print(emb.weight)

print('\n----- embedding(input) ------')
hidden = emb(_in) 
print(hidden)


The code above gives the following error:

----- hidden layer / embedding -----
RuntimeError
# ...
---> 16 emb.weight = torch.nn.Parameter(customEmb.T)
# ... 
RuntimeError: Only Tensors of floating point and complex dtype can require gradients
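For context, the failure happens when the Parameter is created, not inside nn.Linear itself: nn.Parameter sets requires_grad=True by default, and autograd only supports floating point and complex dtypes. A minimal reproduction with no layer involved:

# same error without any nn.Linear: an integer tensor cannot require gradients
torch.nn.Parameter(torch.tensor([1, 2, 3]).long())
# RuntimeError: Only Tensors of floating point and complex dtype can require gradients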


So I tried calling .float() instead of .long() on customEmb, but got the following error:

----- hidden layer / embedding -----
Parameter containing:
tensor([[1., 2., 3., 4., 5., 6.],
        [1., 2., 3., 4., 5., 6.],
        [1., 2., 3., 4., 5., 6.],
        [1., 2., 3., 4., 5., 6.]], requires_grad=True)

----- embedding(input) ------
RuntimeError
# ...
---> 21 hidden = emb(_in) 
# ... 
RuntimeError: expected scalar type Long but found Float
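The mismatch here is between the now-float weight and the still-integer input; a quick check under the same setup:

# weight was converted with .float(), but _in is still int64 from .long();
# the matrix multiply needs matching dtypes, hence the error above
print(emb.weight.dtype, _in.dtype)  # torch.float32 torch.int64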

PS

I expected nn.Linear to return something like:

[[0.,0.,0.,0.],
 [0.,0.,0.,0.],
 [0.,0.,0.,0.],
 [4.,4.,4.,4.],     
 [0.,0.,0.,0.],
 [0.,0.,0.,0.],
]
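(A shape note: nn.Linear(6, 4) maps a 6-dim input to a 4-dim output, so a single one-hot input can only ever produce one row of four values, computed as W @ x + b = customEmb.T @ one_hot(3) + b = customEmb[3] + b, not a 6x4 matrix.)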

UPDATE

I tried converting both _in and customEmb to float, and the errors went away:

_in = torch.tensor([0,0,0,1,0,0]).float()
customEmb = torch.tensor([[1,1,1,1],[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5],[6,6,6,6]]).float()
emb = nn.Linear(d_vocabSize, emb_size) # 6x4
emb.weight = torch.nn.Parameter(customEmb.T)
hidden = emb(_in) 
print(hidden)


This prints:

tensor([3.6693, 4.3959, 3.9726, 4.3447], grad_fn=<AddBackward0>)

**Q2.** Now, why isn't it [4.,4.,4.,4.]? Did I completely mess up my assumptions/understanding?
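For reference, nn.Linear computes x @ W.T + b, and the bias b is randomly initialized by default, which is exactly the offset you see; a minimal sketch, assuming the same customEmb and float _in as above, showing that disabling the bias recovers the embedding row:

lin = nn.Linear(d_vocabSize, emb_size, bias=False)  # drop the random bias term
lin.weight = torch.nn.Parameter(customEmb.T)
print(lin(_in))  # tensor([4., 4., 4., 4.], grad_fn=...)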


eoxn13cs1#

This is an old post I came across recently.
The error Only Tensors of floating point and complex dtype can require gradients happens because parameters must be able to require gradients, and autograd only supports floating point and complex dtypes; the weights and biases of an nn.Linear layer are therefore floating point (or complex) tensors, and its input must match.
To fix it, make sure your tensor has a floating point or complex dtype before wrapping it in a Parameter or passing it to the nn.Linear layer. You can use the **Tensor.float()** or **Tensor.cfloat()** method to convert the tensor to the required dtype.
So, if your input tensor is not floating point, make sure to cast it in the forward pass:

x = layer(input.float())
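Applied to the question's setup, a minimal end-to-end sketch that keeps the weights float and casts the one-hot input at the call site:

customEmb = torch.tensor([[1,1,1,1],[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5],[6,6,6,6]]).float()
emb = nn.Linear(6, 4)
emb.weight = torch.nn.Parameter(customEmb.T)

_in = torch.tensor([0, 0, 0, 1, 0, 0])  # integer one-hot input
hidden = emb(_in.float())               # cast when calling the layer
print(hidden)                           # customEmb[3] plus the layer's random bias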

