I'm very familiar with how einsum works in NumPy. PyTorch offers a similar function, torch.einsum(). What are the similarities and differences, in terms of functionality or performance? The information in the PyTorch documentation is rather sparse and doesn't offer any insight on this.
uplii1fm1#
Since the torch documentation describes einsum only sparsely, I decided to write this post to document, compare, and contrast how torch.einsum() behaves compared with numpy.einsum().
Differences:
- NumPy allows both lowercase and uppercase letters [a-zA-Z] for the "subscript string", whereas PyTorch allows only the lowercase letters [a-z].
- PyTorch expects its operands to be tensors and raises a TypeError if you pass anything else, such as plain Python lists/tuples or NumPy arrays (a short sketch follows this list).
- In addition to nd-arrays, NumPy supports many keyword arguments (e.g. optimize), while PyTorch doesn't offer such flexibility yet.
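A minimal sketch of the operand difference above (NumPy accepts array_like inputs while torch.einsum wants tensors); the variable names here are made up for illustration:

import numpy as np
import torch

lst = [[1, 2], [3, 4]]          # a plain nested Python list

# NumPy is happy with any array_like operand, including nested lists
print(np.einsum('ij -> ji', lst))

# PyTorch expects torch tensors and raises a TypeError otherwise
try:
    torch.einsum('ij -> ji', lst)
except TypeError as err:
    print("torch.einsum rejected the list:", err)

# converting to a tensor first works as expected
print(torch.einsum('ij -> ji', torch.tensor(lst)))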
Below are implementations of some examples in both PyTorch and NumPy:
# input tensors to work with
In [16]: vec
Out[16]: tensor([0, 1, 2, 3])

In [17]: aten
Out[17]:
tensor([[11, 12, 13, 14],
        [21, 22, 23, 24],
        [31, 32, 33, 34],
        [41, 42, 43, 44]])

In [18]: bten
Out[18]:
tensor([[1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3],
        [4, 4, 4, 4]])
1) Matrix multiplication
PyTorch: torch.matmul(aten, bten); aten.mm(bten)
NumPy: np.einsum("ij, jk -> ik", arr1, arr2)
In [19]: torch.einsum('ij, jk -> ik', aten, bten)
Out[19]:
tensor([[130, 130, 130, 130],
        [230, 230, 230, 230],
        [330, 330, 330, 330],
        [430, 430, 430, 430]])
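As a quick check (not part of the original answer), the einsum spelling agrees with the native matrix-multiply calls for the aten and bten shown earlier:

import torch

# same values as the aten / bten tensors shown above
aten = torch.tensor([[11, 12, 13, 14],
                     [21, 22, 23, 24],
                     [31, 32, 33, 34],
                     [41, 42, 43, 44]])
bten = torch.tensor([[1, 1, 1, 1],
                     [2, 2, 2, 2],
                     [3, 3, 3, 3],
                     [4, 4, 4, 4]])

res = torch.einsum('ij, jk -> ik', aten, bten)
assert torch.equal(res, torch.matmul(aten, bten))
assert torch.equal(res, aten.mm(bten))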
2) Extracting elements along the main diagonal
PyTorch: torch.diag(aten)
NumPy: np.einsum("ii -> i", arr)
In [28]: torch.einsum('ii -> i', aten)
Out[28]: tensor([11, 22, 33, 44])
3) Hadamard product (i.e. element-wise product of two tensors)
PyTorch: aten * bten
NumPy: np.einsum("ij, ij -> ij", arr1, arr2)
In [34]: torch.einsum('ij, ij -> ij', aten, bten)
Out[34]:
tensor([[ 11,  12,  13,  14],
        [ 42,  44,  46,  48],
        [ 93,  96,  99, 102],
        [164, 168, 172, 176]])
4) Element-wise squaring
PyTorch: aten ** 2
NumPy: np.einsum("ij, ij -> ij", arr, arr)
In [37]: torch.einsum('ij, ij -> ij', aten, aten)
Out[37]:
tensor([[ 121,  144,  169,  196],
        [ 441,  484,  529,  576],
        [ 961, 1024, 1089, 1156],
        [1681, 1764, 1849, 1936]])
In general, the element-wise nth power can be computed by repeating the subscript string and the tensor n times. For example, the element-wise 4th power of a tensor can be computed with:
# NumPy: np.einsum('ij, ij, ij, ij -> ij', arr, arr, arr, arr)
In [38]: torch.einsum('ij, ij, ij, ij -> ij', aten, aten, aten, aten)
Out[38]:
tensor([[  14641,   20736,   28561,   38416],
        [ 194481,  234256,  279841,  331776],
        [ 923521, 1048576, 1185921, 1336336],
        [2825761, 3111696, 3418801, 3748096]])
5) Trace (i.e. sum of the main-diagonal elements)
PyTorch: torch.trace(aten)
NumPy einsum: np.einsum("ii -> ", arr)
In [44]: torch.einsum('ii -> ', aten)
Out[44]: tensor(110)
6) Matrix transpose
PyTorch: torch.transpose(aten, 1, 0)
NumPy einsum: np.einsum("ij -> ji", arr)
In [58]: torch.einsum('ij -> ji', aten)
Out[58]:
tensor([[11, 21, 31, 41],
        [12, 22, 32, 42],
        [13, 23, 33, 43],
        [14, 24, 34, 44]])
7) Outer product (of vectors)
PyTorch: torch.ger(vec, vec)
NumPy einsum: np.einsum("i, j -> ij", vec, vec)
In [73]: torch.einsum('i, j -> ij', vec, vec)
Out[73]:
tensor([[0, 0, 0, 0],
        [0, 1, 2, 3],
        [0, 2, 4, 6],
        [0, 3, 6, 9]])
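A small side note, not from the original answer: in newer PyTorch releases torch.ger is deprecated in favor of torch.outer, which computes the same outer product. A minimal check, assuming a reasonably recent PyTorch:

import torch

vec = torch.tensor([0, 1, 2, 3])

# torch.outer is the current name; torch.ger is its deprecated alias
assert torch.equal(torch.outer(vec, vec), torch.einsum('i, j -> ij', vec, vec))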
8) Inner product (of vectors)
PyTorch: torch.dot(vec1, vec2)
NumPy einsum: np.einsum("i, i -> ", vec1, vec2)
In [76]: torch.einsum('i, i -> ', vec, vec)
Out[76]: tensor(14)
9) Sum along axis 0
PyTorch: torch.sum(aten, 0)
NumPy einsum: np.einsum("ij -> j", arr)
In [85]: torch.einsum('ij -> j', aten)
Out[85]: tensor([104, 108, 112, 116])
10) Sum along axis 1
PyTorch: torch.sum(aten, 1)
NumPy einsum: np.einsum("ij -> i", arr)
In [86]: torch.einsum('ij -> i', aten)
Out[86]: tensor([ 50,  90, 130, 170])
11) Batch matrix multiplication
PyTorch: torch.bmm(batch_tensor_1, batch_tensor_2)
NumPy: np.einsum("bij, bjk -> bik", batch_tensor_1, batch_tensor_2)
# input batch tensors to work with
In [13]: batch_tensor_1 = torch.arange(2 * 4 * 3).reshape(2, 4, 3)
In [14]: batch_tensor_2 = torch.arange(2 * 3 * 4).reshape(2, 3, 4)

In [15]: torch.bmm(batch_tensor_1, batch_tensor_2)
Out[15]:
tensor([[[  20,   23,   26,   29],
         [  56,   68,   80,   92],
         [  92,  113,  134,  155],
         [ 128,  158,  188,  218]],

        [[ 632,  671,  710,  749],
         [ 776,  824,  872,  920],
         [ 920,  977, 1034, 1091],
         [1064, 1130, 1196, 1262]]])

# sanity check with the shapes
In [16]: torch.bmm(batch_tensor_1, batch_tensor_2).shape
Out[16]: torch.Size([2, 4, 4])

# batch matrix multiply using einsum
In [17]: torch.einsum("bij, bjk -> bik", batch_tensor_1, batch_tensor_2)
Out[17]:
tensor([[[  20,   23,   26,   29],
         [  56,   68,   80,   92],
         [  92,  113,  134,  155],
         [ 128,  158,  188,  218]],

        [[ 632,  671,  710,  749],
         [ 776,  824,  872,  920],
         [ 920,  977, 1034, 1091],
         [1064, 1130, 1196, 1262]]])

# sanity check with the shapes
In [18]: torch.einsum("bij, bjk -> bik", batch_tensor_1, batch_tensor_2).shape
Out[18]: torch.Size([2, 4, 4])
12) Sum along axis 2
PyTorch: torch.sum(batch_ten, 2)
NumPy einsum: np.einsum("ijk -> ij", arr3D)
In [99]: torch.einsum("ijk -> ij", batch_ten)
Out[99]:
tensor([[ 50,  90, 130, 170],
        [  4,   8,  12,  16]])
13) Sum all elements in an nD tensor
PyTorch: torch.sum(batch_ten)
NumPy einsum: np.einsum("ijk -> ", arr3D)
In [101]: torch.einsum("ijk -> ", batch_ten)
Out[101]: tensor(480)
14) Summing over multiple axes (i.e. marginalization)
PyTorch: torch.sum(arr, dim=(dim0, dim1, dim2, dim3, dim4, dim6, dim7))
NumPy: np.einsum("ijklmnop -> n", nDarr)
# 8D tensor
In [103]: nDten = torch.randn((3,5,4,6,8,2,7,9))
In [104]: nDten.shape
Out[104]: torch.Size([3, 5, 4, 6, 8, 2, 7, 9])

# marginalize out dimension 5 (i.e. "n" here)
In [111]: esum = torch.einsum("ijklmnop -> n", nDten)
In [112]: esum
Out[112]: tensor([  98.6921, -206.0575])

# marginalize out axis 5 (i.e. sum over the rest of the axes)
In [113]: tsum = torch.sum(nDten, dim=(0, 1, 2, 3, 4, 6, 7))

In [115]: torch.allclose(tsum, esum)
Out[115]: True
15) Double dot product / Frobenius inner product (same as torch.sum(Hadamard product), cf. 3)
PyTorch: torch.sum(aten * bten)
NumPy: np.einsum("ij, ij -> ", arr1, arr2)
In [120]: torch.einsum("ij, ij -> ", aten, bten)
Out[120]: tensor(1300)
sz81bmfz2#
The answer above now seems somewhat outdated, as a few points about torch.einsum are no longer accurate (the examples themselves still hold):
1. You can use both upper- and lowercase letters.
2. This point remains the same (by design).
3. There is no optimize keyword argument, because torch already uses an optimized contraction order for the tensor network. The remaining np.einsum keywords relate to the output [out, dtype, order, casting]. There is a note worth reading in the torch.einsum docs about optimal contraction order, which also shows how to disable the optimized ordering if you want to (a rough sketch follows below).
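As a rough sketch of that note (assuming a recent PyTorch where the torch.backends.opt_einsum module is present; check the docs for your version), the contraction-order optimization can be inspected and toggled like this:

import torch

# guard with hasattr so the sketch degrades gracefully on older builds
if hasattr(torch.backends, "opt_einsum"):
    # True if the opt_einsum package is installed and can be used for path optimization
    print(torch.backends.opt_einsum.is_available())
    # disable the optimized contraction order (falls back to left-to-right contraction)
    torch.backends.opt_einsum.enabled = False
    # re-enable it (the default); the docs also describe a "strategy" setting
    torch.backends.opt_einsum.enabled = True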
Bonus: sublist notation
Since this was written, both np and torch have added support for sublist notation (example from the PyTorch docs):
>>> # with sublist format and ellipsis
>>> torch.einsum(As, [..., 0, 1], Bs, [..., 1, 2], [..., 0, 2])
tensor([[[-1.0564, -1.5904,  3.2023,  3.1271],
         [-1.6706, -0.8097, -0.8025, -2.1183]],

        [[ 4.2239,  0.3107, -0.5756, -0.2354],
         [-1.4558, -0.3460,  1.5087, -0.8530]],

        [[ 2.8153,  1.8787, -4.3839, -1.2112],
         [ 0.3728, -2.1131,  0.0921,  0.8305]]])
This example also uses the ellipsis ..., which stands for all the dimensions that don't need an explicit subscript (this was mentioned in a comment on the answer above but never explained). Again, refer to the torch docs for an in-depth explanation of how this works.
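Since As and Bs are not defined in the snippet above, here is a small self-contained sketch of the same sublist notation (the tensors and names are made up for illustration):

import torch

a = torch.randn(2, 3)
b = torch.randn(3, 4)

# classic subscript-string form: matrix multiplication
out_str = torch.einsum('ij, jk -> ik', a, b)

# equivalent sublist form: each operand is followed by a list of integer axis labels,
# and a trailing list selects the output axes
out_sub = torch.einsum(a, [0, 1], b, [1, 2], [0, 2])

assert torch.allclose(out_str, out_sub)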