Jack O'Connor <oconnor...@gmail.com> added the comment:

> Hardware accelerated SHAs are likely faster than blake3 single core.

Surprisingly, they're not. Here's a quick measurement on my recent ThinkPad 
laptop (64 KiB of input, single-threaded, Turbo Boost left on), which supports 
both AVX-512 and the SHA extensions:

OpenSSL SHA-256: 1816 MB/s
OpenSSL SHA-1:   2103 MB/s
BLAKE3 SSE2:     2109 MB/s
BLAKE3 SSE4.1:   2474 MB/s
BLAKE3 AVX2:     4898 MB/s
BLAKE3 AVX-512:  8754 MB/s
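
For anyone who wants to reproduce the OpenSSL rows on their own machine, here is 
a minimal sketch using only Python's hashlib. It is not the harness that 
produced the numbers above. The function name, the 0.2 second timing window, and 
the choice of 1 MB = 10**6 bytes are assumptions made for illustration, and 
Python call overhead will pull the results down somewhat at this input size.

```python
import hashlib
import time


def throughput_mb_per_s(name, size=64 * 1024, min_seconds=0.2):
    """Hash a zero-filled buffer of `size` bytes repeatedly; return MB/s.

    Assumes 1 MB = 10**6 bytes. `name` is any algorithm hashlib.new() accepts.
    """
    data = bytes(size)
    hashed = 0
    elapsed = 0.0
    start = time.perf_counter()
    while elapsed < min_seconds:
        hashlib.new(name, data).digest()
        hashed += size
        elapsed = time.perf_counter() - start
    return hashed / elapsed / 1e6


if __name__ == "__main__":
    for name in ("sha256", "sha1"):
        print(f"{name}: {throughput_mb_per_s(name):.0f} MB/s")
```

Whether hashlib ends up using the SHA extensions depends on how the 
interpreter and its OpenSSL were built, so treat the output as a rough check 
rather than a like-for-like comparison.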

The main reason SHA-1 and SHA-256 don't do better is that they're fundamentally 
serial algorithms: each block's compression takes the previous block's output as 
input. Hardware acceleration can speed up a single instance of their compression 
function, but it cannot run more than one instance per message at a time. In 
contrast, AES-CTR can easily parallelize its blocks, and hardware-accelerated AES 
does beat BLAKE3.

> And certainly more efficient in terms of watt-secs/byte.

I don't have any experience measuring power myself, so take this with a grain 
of salt: I think the difference in throughput shown above is large enough that, 
even accounting for the famously high power draw of AVX-512, BLAKE3 comes out 
ahead in terms of energy/byte. Probably not on ARM though.
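
To make that argument concrete: energy per byte is power divided by throughput, 
so the faster hash wins on energy as long as its power draw is higher by less 
than the throughput ratio. The sketch below computes that break-even ratio from 
the figures above. No power was measured here, and the function name is made up 
for illustration.

```python
# Throughput figures copied from the measurements above, in MB/s.
THROUGHPUT_MB_S = {
    "OpenSSL SHA-256": 1816,
    "OpenSSL SHA-1": 2103,
    "BLAKE3 AVX-512": 8754,
}


def break_even_power_ratio(fast: str, slow: str) -> float:
    """Largest power ratio P_fast / P_slow at which `fast` still uses no more
    energy per byte than `slow`, given energy/byte = power / throughput."""
    return THROUGHPUT_MB_S[fast] / THROUGHPUT_MB_S[slow]


if __name__ == "__main__":
    for slow in ("OpenSSL SHA-256", "OpenSSL SHA-1"):
        ratio = break_even_power_ratio("BLAKE3 AVX-512", slow)
        print(f"BLAKE3 AVX-512 vs {slow}: break-even at {ratio:.2f}x power")
```

On these numbers, AVX-512 BLAKE3 would have to draw roughly 4.8 times the power 
of the SHA-256 path (about 4.2 times for SHA-1) before it lost on energy per 
byte.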

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39298>
_______________________________________