go crypto/cipher: ideas for optimizing the performance of the "crypto" library on ARM

7kqas0il  posted 6 months ago  in  Go

What version of Go are you using (go version)?

$ go version 1.14.4

While profiling AES-CBC on amd64 and arm64, I found that on arm64 the functions func xorBytes(dst, a, b []byte) int and func safeXORBytes(dst, a, b []byte, n int) (in crypto/cipher/xor_generic.go) always appear in the top 15 of the pprof listing. On amd64, by contrast, the same operation uses SSE2 SIMD instructions in func xorBytesSSE2(dst, a, b *byte, n int).

```bash
(pprof) top10
Showing nodes accounting for 700ms, 55.12% of 1270ms total
Showing top 10 nodes out of 113
      flat  flat%   sum%        cum   cum%
     170ms 13.39% 13.39%      530ms 41.73%  runtime.mallocgc
      90ms  7.09% 20.47%       90ms  7.09%  crypto/cipher.safeXORBytes
      90ms  7.09% 27.56%      130ms 10.24%  syscall.Syscall
      80ms  6.30% 33.86%       80ms  6.30%  runtime.nextFreeFast (inline)
      60ms  4.72% 38.58%       60ms  4.72%  runtime.publicationBarrier
      50ms  3.94% 42.52%       50ms  3.94%  crypto/aes.expandKeyAsm
      50ms  3.94% 46.46%      140ms 11.02%  crypto/cipher.xorBytes
      40ms  3.15% 49.61%       40ms  3.15%  runtime.acquirem (inline)
      40ms  3.15% 52.76%       40ms  3.15%  runtime.memclrNoHeapPointers
      30ms  2.36% 55.12%       30ms  2.36%  crypto/internal/subtle.InexactOverlap
```

I'm wondering whether arm64 SIMD instructions could be used to optimize this function?

w8f9ii69  #1

https://golang.org/cl/142537 addresses this issue: crypto/cipher: use Neon for xor on arm64

wsewodh2  #2

Please take a look at my PR #53154, which adds non-NEON and NEON implementations of xorBytes for ARM. This should close the gap with ARM64.
