crypto/rsa: Go 1.9 on linux/arm64 is 10x slower than OpenSSL

Posted by u1ehiz5o 4 months ago in Go

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.9.2 linux/arm64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

GOARCH="arm64"
GOBIN=""
GOEXE=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/usr/lib/go-1.6"
GOTOOLDIR="/usr/lib/go-1.6/pkg/tool/linux_arm64"
GO15VENDOREXPERIMENT="1"
CC="gcc"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"

What did you do?

go test crypto/rsa -bench .

What did you expect to see?

Performance comparable to OpenSSL ( https://blog.cloudflare.com/content/images/2017/11/pub_key_1_core-2.png )

What did you see instead?

About 10x slower than OpenSSL ( https://blog.cloudflare.com/content/images/2017/11/go_pub_key_1_core.png )
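
For anyone who wants to reproduce the numbers outside of go test, here is a minimal sketch of a standalone timing program. It is not the benchmark code from the crypto/rsa package; the helper timeOp and the iteration counts are made up for illustration, and wall-clock timing like this is noisier than the testing package's benchmark loop.

```go
package main

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"fmt"
	"time"
)

// timeOp is a hypothetical helper: it calls f n times and returns
// the mean wall-clock duration per call.
func timeOp(n int, f func() error) (time.Duration, error) {
	start := time.Now()
	for i := 0; i < n; i++ {
		if err := f(); err != nil {
			return 0, err
		}
	}
	return time.Since(start) / time.Duration(n), nil
}

func main() {
	// 2048-bit key, the size used in most of the crypto/rsa benchmarks.
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		panic(err)
	}
	digest := sha256.Sum256([]byte("benchmark message"))

	// Private-key operation: PKCS #1 v1.5 signature.
	var sig []byte
	signTime, err := timeOp(50, func() error {
		var signErr error
		sig, signErr = rsa.SignPKCS1v15(rand.Reader, key, crypto.SHA256, digest[:])
		return signErr
	})
	if err != nil {
		panic(err)
	}

	// Public-key operation: verify the signature produced above.
	verifyTime, err := timeOp(500, func() error {
		return rsa.VerifyPKCS1v15(&key.PublicKey, crypto.SHA256, digest[:], sig)
	})
	if err != nil {
		panic(err)
	}

	fmt.Printf("sign:   %v per op\n", signTime)
	fmt.Printf("verify: %v per op\n", verifyTime)
}
```

The sign figure corresponds to the private-key numbers and the verify figure to the public-key numbers in the linked charts.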

exdqitrt #1

Go 1.11beta1 is much faster than Go 1.10.2 on Cavium ThunderX / Packet c1.large.arm ("Type 2A").

i7uq4tfw #2

Some updated numbers from a third-generation AWS Graviton (c7g) host:

$ go version
go version devel go1.21-b94dc384ca Sat Mar 4 00:00:01 2023 +0000 linux/arm64
$ go test crypto/rsa -bench .
goos: linux
goarch: arm64
pkg: crypto/rsa
BenchmarkDecryptPKCS1v15/2048-32         	     597	   2000184 ns/op
BenchmarkDecryptPKCS1v15/3072-32         	     200	   5976582 ns/op
BenchmarkDecryptPKCS1v15/4096-32         	      88	  13397414 ns/op
BenchmarkEncryptPKCS1v15/2048-32         	    6457	    185655 ns/op
BenchmarkDecryptOAEP/2048-32             	     603	   1990501 ns/op
BenchmarkEncryptOAEP/2048-32             	    6457	    185121 ns/op
BenchmarkSignPKCS1v15/2048-32            	     583	   2048907 ns/op
BenchmarkVerifyPKCS1v15/2048-32          	    6528	    183649 ns/op
BenchmarkSignPSS/2048-32                 	     583	   2052886 ns/op
BenchmarkVerifyPSS/2048-32               	    6442	    185743 ns/op
PASS
ok  	crypto/rsa	14.990s
$ cat /proc/cpuinfo | head -n 9
processor	: 0
BogoMIPS	: 2100.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs paca pacg dcpodp svei8mm svebf16 i8mm bf16 dgh rng
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0xd40
CPU revision	: 1

And on an M1 Max:

$ go version
go version devel go1.21-b94dc384ca Sat Mar 4 00:00:01 2023 +0000 darwin/arm64
$ go test crypto/rsa -bench . -cpu 1
goos: darwin
goarch: arm64
pkg: crypto/rsa
BenchmarkDecryptPKCS1v15/2048         	    1040	   1217645 ns/op
BenchmarkDecryptPKCS1v15/3072         	     303	   3562839 ns/op
BenchmarkDecryptPKCS1v15/4096         	     148	   8073468 ns/op
BenchmarkEncryptPKCS1v15/2048         	    8928	    130840 ns/op
BenchmarkDecryptOAEP/2048             	    1023	   1146886 ns/op
BenchmarkEncryptOAEP/2048             	    8979	    131854 ns/op
BenchmarkSignPKCS1v15/2048            	     994	   1194395 ns/op
BenchmarkVerifyPKCS1v15/2048          	    9250	    131157 ns/op
BenchmarkSignPSS/2048                 	     997	   1199584 ns/op
BenchmarkVerifyPSS/2048               	    9013	    131653 ns/op
PASS
ok  	crypto/rsa	15.288s

AWS c6i.8xlarge (Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz) compared with c7g.8xlarge (old = c6i, new = c7g):

name                     old time/op  new time/op  delta
DecryptPKCS1v15/2048-32  1.52ms ± 0%  2.00ms ± 0%  +31.41%  (p=0.008 n=5+5)
DecryptPKCS1v15/3072-32  4.56ms ± 1%  5.98ms ± 0%  +31.06%  (p=0.008 n=5+5)
DecryptPKCS1v15/4096-32  10.2ms ± 0%  13.4ms ± 0%  +31.67%  (p=0.008 n=5+5)
EncryptPKCS1v15/2048-32   180µs ± 0%   185µs ± 0%   +3.09%  (p=0.008 n=5+5)
DecryptOAEP/2048-32      1.54ms ± 0%  1.99ms ± 0%  +28.88%  (p=0.008 n=5+5)
EncryptOAEP/2048-32       183µs ± 1%   185µs ± 0%   +1.29%  (p=0.008 n=5+5)
SignPKCS1v15/2048-32     1.58ms ± 0%  2.05ms ± 0%  +29.66%  (p=0.008 n=5+5)
VerifyPKCS1v15/2048-32    179µs ± 1%   184µs ± 0%   +2.56%  (p=0.008 n=5+5)
SignPSS/2048-32          1.59ms ± 1%  2.05ms ± 0%  +29.24%  (p=0.008 n=5+5)
VerifyPSS/2048-32         182µs ± 1%   186µs ± 0%   +2.06%  (p=0.008 n=5+5)

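This is actually slightly better than the performance gap with OpenSSL 1.1.1f on Ubuntu Focal (where Graviton is 37% slower than Intel). However, 2048-bit RSA in OpenSSL appears to be about twice as fast as in the Go benchmarks above, judging by what openssl speed rsa2048 reports on the c7g Graviton 3 host: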

$ openssl speed rsa2048
Doing 2048 bits private rsa's for 10s: 10322 2048 bits private RSA's in 10.00s
Doing 2048 bits public rsa's for 10s: 419431 2048 bits public RSA's in 9.98s
OpenSSL 1.1.1f  31 Mar 2020
built on: Mon Feb  6 17:57:17 2023 UTC
options:bn(64,64) rc4(char) des(int) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-0kQqA1/openssl-1.1.1f=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_TLS_SECURITY_LEVEL=2 -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DVPAES_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000969s 0.000024s   1032.2  42027.2

Finally, standard Go compared with GOEXPERIMENT=boringcrypto on AWS c7g / third-generation Graviton (old = standard Go, new = boringcrypto):

name                     old time/op  new time/op  delta
DecryptPKCS1v15/2048-32  2.00ms ± 0%  0.91ms ± 0%  -54.59%  (p=0.008 n=5+5)
DecryptPKCS1v15/3072-32  5.98ms ± 0%  2.71ms ± 0%  -54.62%  (p=0.008 n=5+5)
DecryptPKCS1v15/4096-32  13.4ms ± 0%   6.1ms ± 0%  -54.84%  (p=0.008 n=5+5)
EncryptPKCS1v15/2048-32   185µs ± 0%     8µs ± 0%  -95.80%  (p=0.008 n=5+5)
DecryptOAEP/2048-32      1.99ms ± 0%  0.91ms ± 0%  -54.16%  (p=0.008 n=5+5)
EncryptOAEP/2048-32       185µs ± 0%    12µs ± 0%  -93.47%  (p=0.008 n=5+5)
SignPKCS1v15/2048-32     2.05ms ± 0%  0.91ms ± 0%  -55.72%  (p=0.008 n=5+5)
VerifyPKCS1v15/2048-32    184µs ± 0%     7µs ± 0%  -96.45%  (p=0.008 n=5+5)
SignPSS/2048-32          2.05ms ± 0%  0.91ms ± 0%  -55.67%  (p=0.008 n=5+5)
VerifyPSS/2048-32         186µs ± 0%     7µs ± 0%  -96.16%  (p=0.008 n=5+5)

(Those boringcrypto signing numbers roughly match the rsa2048 performance reported by OpenSSL above.)
