On 11/23/2021 5:00 AM, Helge Oldach wrote:
Hi,
Allan Jude wrote on Mon, 22 Nov 2021 19:14:13 +0100 (CET):
The branch stable/13 has been updated by allanjude:
URL: https://cgit.FreeBSD.org/src/commit/?id=32a2fed6e71f896266d4c695754104d82a72c60d
commit 32a2fed6e71f896266d4c695754104d82a72c60d
Author: Allan Jude <allanj...@freebsd.org>
AuthorDate: 2021-11-19 15:14:30 +0000
Commit: Allan Jude <allanj...@freebsd.org>
CommitDate: 2021-11-22 18:12:20 +0000
openssl: Fix detection of ARMv7 and ARM64 CPU features
OpenSSL assumes that AT_HWCAP has the same value on FreeBSD as on Linux (16).
On FreeBSD that value is AT_CANARY, so it ends up calling elf_aux_info() with
AT_CANARY, which returns ENOENT, and all acceleration features are disabled.
With this, my ARM64 test machine runs the benchmark
`openssl speed -evp aes-256-gcm` nearly 20x faster,
going from 100 MB/sec to 2000 MB/sec.
It also improves sha256 from 300 MB/sec to 1800 MB/sec.
This fix has been accepted but not yet merged upstream:
https://github.com/openssl/openssl/pull/17082
PR: 259937
Reviewed by: manu, imp
MFC after: immediate
Relnotes: yes
Fixes: 88e852c0b5c872b1a ("OpenSSL: Merge OpenSSL 1.1.1j")
Sponsored by: Ampere Computing LLC
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D33060
(cherry picked from commit d9bb798725cfce9c72b80440659b48e8668eb10d)
Hmmm. On a RPi4/8G:
Before (FreeBSD 13.0-STABLE (GENERIC) #366 stable/13-n248173-d16fbc488e6):
| Doing aes-256-gcm for 3s on 16 size blocks: 6710997 aes-256-gcm's in 3.00s
| Doing aes-256-gcm for 3s on 64 size blocks: 1806261 aes-256-gcm's in 3.00s
| Doing aes-256-gcm for 3s on 256 size blocks: 468595 aes-256-gcm's in 3.00s
| Doing aes-256-gcm for 3s on 1024 size blocks: 121282 aes-256-gcm's in 3.00s
| Doing aes-256-gcm for 3s on 8192 size blocks: 14590 aes-256-gcm's in 3.00s
| Doing aes-256-gcm for 3s on 16384 size blocks: 7258 aes-256-gcm's in 3.00s
| OpenSSL 1.1.1l-freebsd 24 Aug 2021
| built on: reproducible build, date unspecified
| options:bn(64,64) rc4(int) des(int) aes(partial) idea(int) blowfish(ptr)
| compiler: clang
| The 'numbers' are in 1000s of bytes per second processed.
| type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
| aes-256-gcm      35791.98k    38533.57k    39986.77k    41397.59k    39840.43k    39638.36k
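For reference, each figure in the summary row follows from the matching "Doing ..." line: blocks processed times block size, divided by the elapsed seconds, in units of 1000 bytes per second. A minimal Python sketch, assuming the line format quoted above (the helper name `throughput_k` and the regular expression are illustrative and not part of OpenSSL):

```python
import re

# Matches lines such as:
#   Doing aes-256-gcm for 3s on 16 size blocks: 6710997 aes-256-gcm's in 3.00s
# The pattern is an assumption based on the output quoted in this thread.
LINE_RE = re.compile(r"on (\d+) size blocks: (\d+) \S+ in ([\d.]+)s")


def throughput_k(line):
    """Return throughput in 1000s of bytes per second for one 'Doing' line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    block_size = int(match.group(1))
    count = int(match.group(2))
    seconds = float(match.group(3))
    return block_size * count / seconds / 1000.0


# The "before" run, copied from the output above.
before_lines = [
    "Doing aes-256-gcm for 3s on 16 size blocks: 6710997 aes-256-gcm's in 3.00s",
    "Doing aes-256-gcm for 3s on 64 size blocks: 1806261 aes-256-gcm's in 3.00s",
    "Doing aes-256-gcm for 3s on 256 size blocks: 468595 aes-256-gcm's in 3.00s",
    "Doing aes-256-gcm for 3s on 1024 size blocks: 121282 aes-256-gcm's in 3.00s",
    "Doing aes-256-gcm for 3s on 8192 size blocks: 14590 aes-256-gcm's in 3.00s",
    "Doing aes-256-gcm for 3s on 16384 size blocks: 7258 aes-256-gcm's in 3.00s",
]

for line in before_lines:
    print(f"{throughput_k(line):.2f}k")
```

The elapsed times in the output are printed to two decimals, so for runs where the time is not exactly 3.00s the recomputed value may differ slightly from the summary row.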
After (FreeBSD 13.0-STABLE (GENERIC) #367 stable/13-n248176-f085bb0e621):
| Doing aes-256-gcm for 3s on 16 size blocks: 3999944 aes-256-gcm's in 3.01s
| Doing aes-256-gcm for 3s on 64 size blocks: 1102925 aes-256-gcm's in 3.04s
| Doing aes-256-gcm for 3s on 256 size blocks: 279608 aes-256-gcm's in 3.03s
| Doing aes-256-gcm for 3s on 1024 size blocks: 69397 aes-256-gcm's in 3.00s
| Doing aes-256-gcm for 3s on 8192 size blocks: 9160 aes-256-gcm's in 3.14s
| Doing aes-256-gcm for 3s on 16384 size blocks: 4385 aes-256-gcm's in 3.00s
| OpenSSL 1.1.1l-freebsd 24 Aug 2021
| built on: reproducible build, date unspecified
| options:bn(64,64) rc4(int) des(int) aes(partial) idea(int) blowfish(ptr)
| compiler: clang
| The 'numbers' are in 1000s of bytes per second processed.
| type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
| aes-256-gcm      21277.62k    23226.64k    23613.90k    23687.51k    23892.93k    23947.95k
It seems that AES throughput is actually cut by almost half?
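To put a number on the drop, the two summary rows can be compared per block size. A small Python sketch using only the figures quoted above (the helper name `relative_change` is illustrative); it works out to roughly 40% slower at every block size:

```python
# Summary-row throughput (in 1000s of bytes/s), copied from the two runs above.
block_sizes = [16, 64, 256, 1024, 8192, 16384]
before = [35791.98, 38533.57, 39986.77, 41397.59, 39840.43, 39638.36]
after = [21277.62, 23226.64, 23613.90, 23687.51, 23892.93, 23947.95]


def relative_change(old, new):
    """Fractional change from old to new; negative means slower."""
    return (new - old) / old


changes = [relative_change(b, a) for b, a in zip(before, after)]

for size, change in zip(block_sizes, changes):
    print(f"{size:>6} bytes: {change:+.1%}")
```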
Kind regards
Helge
Do you know which of the CPU optimizations your RPi4 supports?
You can set the environment variable OPENSSL_armcap to override
OpenSSL's detection.
Try: env OPENSSL_armcap=0 openssl speed -evp aes-256-gcm
And see if it gets a different result.
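If it helps to script the comparison, here is a rough Python sketch that runs the same benchmark with and without the override. The function names are made up for illustration; the only things taken from this thread are the `openssl speed -evp aes-256-gcm` command and the `OPENSSL_armcap` variable. It assumes `openssl` is on the PATH.

```python
import os
import subprocess


def speed_command(cipher="aes-256-gcm"):
    """Build the benchmark command quoted in this thread."""
    return ["openssl", "speed", "-evp", cipher]


def env_with_armcap(armcap, base_env=None):
    """Copy the environment and set OPENSSL_armcap.

    Passing None removes the variable so OpenSSL uses its own detection.
    """
    env = dict(os.environ if base_env is None else base_env)
    if armcap is None:
        env.pop("OPENSSL_armcap", None)
    else:
        env["OPENSSL_armcap"] = str(armcap)
    return env


def run_speed(armcap=None, cipher="aes-256-gcm"):
    """Run the benchmark; output is in the result's .stdout and .stderr."""
    return subprocess.run(
        speed_command(cipher),
        env=env_with_armcap(armcap),
        capture_output=True,
        text=True,
        check=True,
    )


# Example (each run takes a while, so nothing is executed here):
#   detected = run_speed()          # OpenSSL's own detection
#   disabled = run_speed(armcap=0)  # env OPENSSL_armcap=0 openssl speed ...
```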
--
Allan Jude