[issue39298] add BLAKE3 to hashlib

2022-03-24 Thread Jack O'Connor
Jack O'Connor added the comment: I did reply to that point above with some baseless speculation, but now I can back up my baseless speculation with unscientific data :) https://gist.github.com/oconnor663/aed7016c9dbe5507510fc50faceaaa07 According to whatever `powerstat -R` measures on my lap

[issue39298] add BLAKE3 to hashlib

2022-03-24 Thread Gregory P. Smith
Gregory P. Smith added the comment: You missed the key "And certainly more efficient in terms of watt-secs/byte" part.

[issue39298] add BLAKE3 to hashlib

2022-03-24 Thread Jack O'Connor
Jack O'Connor added the comment: > Truncated sha512 (sha512-256) typically performs 40% faster than sha256 on > X86_64. Without hardware acceleration, yes. But because SHA-NI includes only SHA-1 and SHA-256, and not SHA-512, it's no longer a level playing field. OpenSSL's SHA-512 and SHA-51

[issue39298] add BLAKE3 to hashlib

2022-03-24 Thread Christian Heimes
Christian Heimes added the comment: sha1 should be considered broken anyway and sha256 does not perform well on 64bit systems. Truncated sha512 (sha512-256) typically performs 40% faster than sha256 on X86_64. It should get you close to the performance of BLAKE3 SSE4.1 on your system. -

[issue39298] add BLAKE3 to hashlib

2022-03-24 Thread Jack O'Connor
Jack O'Connor added the comment: > Hardware accelerated SHAs are likely faster than blake3 single core. Surprisingly, they're not. Here's a quick measurement on my recent ThinkPad laptop (64 KiB of input, single-threaded, TurboBoost left on), which supports both AVX-512 and the SHA extension

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Larry Hastings
Larry Hastings added the comment: > Performance wise... The SHA series have hardware acceleration on > modern CPUs and SoCs. External libraries such as OpenSSL are in a > position to provide implementations that make use of that. Same with > the Linux Kernel CryptoAPI (https://bugs.python.org/

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Gregory P. Smith
Gregory P. Smith added the comment: Rust based anything comes with a baseline level of Rust code overhead. https://stackoverflow.com/questions/29008127/why-are-rust-executables-so-huge That seems expected.

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Here's a wheel which only includes the portable code (I disabled all the special cases as you suggested). Archive: dist/blake3_experimental_c-0.0.1-cp310-cp310-linux_x86_64.whl Length DateTimeName - -- - 29768

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Gregory P. Smith
Gregory P. Smith added the comment: Performance wise... The SHA series have hardware acceleration on modern CPUs and SoCs. External libraries such as OpenSSL are in a position to provide implementations that make use of that. Same with the Linux Kernel CryptoAPI (https://bugs.python.org/iss

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Gregory P. Smith
Gregory P. Smith added the comment: To anyone else who comes along with motivation: I'm fine with blake3 being in hashlib, but I don't want us to guarantee it by carrying the implementation of the algorithm in the CPython codebase itself unless it gains wide industry standard-like adoption s
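The difference between a hash being present in hashlib and being guaranteed corresponds to two attributes hashlib already exposes, `hashlib.algorithms_available` and `hashlib.algorithms_guaranteed`. A minimal sketch of how calling code could cope with an optional algorithm, assuming a hypothetical helper named `pick_hash` and assuming that an optional BLAKE3 would be exposed under the name "blake3" (that name is an assumption, not something this thread confirms):

```python
import hashlib


def pick_hash(preferred, fallback="sha256"):
    """Return the name of a hash this particular build can provide.

    `pick_hash` is a hypothetical helper for illustration only.
    `fallback` must be one of the names guaranteed on every build.
    """
    if fallback not in hashlib.algorithms_guaranteed:
        raise ValueError(f"{fallback!r} is not guaranteed on all builds")
    # algorithms_available also lists whatever the linked OpenSSL offers.
    if preferred in hashlib.algorithms_available:
        return preferred
    return fallback


# "blake3" is an assumed name; on builds without it this falls back to sha256.
name = pick_hash("blake3")
digest = hashlib.new(name, b"hello").hexdigest()
print(name, digest)
```

Code written this way keeps working whether or not a given build ships the optional algorithm, which is the trade-off an unguaranteed hash would impose on its users.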

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Larry Hastings
Larry Hastings added the comment: I can't answer why the Rust one is so much larger--that's a question for Jack. But the blake3-py you built might (should?) have support for SIMD extensions. See the setup.py for how that works; it appears to at least try to use the SIMD extensions on x86 P

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: With "lean" I meant: doesn't use much code and is easy to compile and install. I built a wheel from Jack's experimental package and it comes out to just under 100kB on Linux x64, compared to around the 1.1MB the Rust wheel needs: Archive: blake3_experime

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Larry Hastings
Larry Hastings added the comment: The Rust version is already quite "lean". And it can be much faster than the C version, because it supports internal multithreading. Even without multithreading I bet it's at least a hair faster. Also, Jack has independently written a Python package based

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 23.03.2022 17:53, Larry Hastings wrote: > > Ok, I give up. Sorry to spoil the fun, but there's no need to throw everything in the bin ;-) A lean and fast blake3 C package would still be a great thing to have on PyPI, e.g. provide support for platforms

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Larry Hastings
Larry Hastings added the comment: Ok, I give up. -- resolution: -> rejected stage: patch review -> resolved status: open -> closed

[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 23.03.2022 02:12, Gregory P. Smith wrote: > > I view the NIST standard hashes as important enough to attempt to guarantee > as present (all the SHAs and MD5) as built-in. Others should really > demonstrate practical application popularity to gain incl

[issue39298] add BLAKE3 to hashlib

2022-03-22 Thread Jack O'Connor
Jack O'Connor added the comment: > maintaining a complicated build process in-tree For what it's worth, if you have any sort of "in a perfect world" vision for what the upstream BLAKE3 project could do to make it trivially easy for you to integrate, I'd be very interested in getting that don

[issue39298] add BLAKE3 to hashlib

2022-03-22 Thread Gregory P. Smith
Gregory P. Smith added the comment: Because I don't think blake3 or blake2 _(though we've shipped it already so there's a challenge in making changes https://bugs.python.org/issue47095)_ are important enough to be _guaranteed_ present in all builds (our release binaries would include them).

[issue39298] add BLAKE3 to hashlib

2022-03-22 Thread Larry Hastings
Larry Hastings added the comment: > Given that I don't want to see us gain new vendored copies of > significant but non-critical third party hash code in our tree > (Modules/_blake3/impl/ in PR 31686) for anything but a known > fixed term need (ex: the sha2 libtomcrypt code is gone from > our t

[issue39298] add BLAKE3 to hashlib

2022-03-22 Thread Gregory P. Smith
Gregory P. Smith added the comment: correction: our md5/sha1/sha2/sha3 code is not gone yet, but they are simple C implementations used as a fallback when the provider of optimal versions is unavailable (OpenSSL for those). That keeps the copies of code in our tree simple and most people u

[issue39298] add BLAKE3 to hashlib

2022-03-22 Thread Gregory P. Smith
Gregory P. Smith added the comment: hashlib creator and other maintainer here: I do not think it was a good idea for us to add blake2 to hashlib the way we did. So blake3 should not be presumed as a given, at least not done in the same manner. Background: While OpenSSL gained _some_ blake2

[issue39298] add BLAKE3 to hashlib

2022-03-22 Thread Larry Hastings
Larry Hastings added the comment: Jack: I've updated the PR, improving compatibility with the "blake3" package on PyPI. I took your notes, and also looked at the C module you wrote. The resulting commit is here: https://github.com/python/cpython/pull/31686/commits/37ce72b0444ad63fd1989ad36b

[issue39298] add BLAKE3 to hashlib

2022-03-05 Thread Larry Hastings
Larry Hastings added the comment: Right, and I did say "(or BDFL)". Apparently you didn't bother to consult with the BDFL in advance, or at least not in the usual public venues--I haven't found a record of such a conversation on the bpo issue, nor in python-dev. BTW you simultaneously propo

[issue39298] add BLAKE3 to hashlib

2022-03-05 Thread Christian Heimes
Christian Heimes added the comment: I didn't consult the steering council in 2016, because I lost the keys to the time machine. The very first SC election was in 2019. :)

[issue39298] add BLAKE3 to hashlib

2022-03-04 Thread Larry Hastings
Larry Hastings added the comment: Jack O'Connor: > Was any of the experimental C extension code [...] useful to you? > I was wondering if it could be possible to copy blake3module.c from > there verbatim. I concede I didn't even look at it. The glue code to mate the library with the CPython

[issue39298] add BLAKE3 to hashlib

2022-03-04 Thread Christian Heimes
Christian Heimes added the comment: GH-31686 is a massive patch set. I'm feeling uncomfortable adding so much new code for a new hashing algorithm. Did you ask the Steering Council for approval? The platform detection and compiler flag logic must be added to configure.ac instead of setup.p

[issue39298] add BLAKE3 to hashlib

2022-03-04 Thread Jack O'Connor
Jack O'Connor added the comment: Thanks Larry! Was any of the experimental C extension code under https://github.com/oconnor663/blake3-py/tree/master/c_impl useful to you? I was wondering if it could be possible to copy blake3module.c from there verbatim. The setup.py build there also has wo

[issue39298] add BLAKE3 to hashlib

2022-03-04 Thread Larry Hastings
Larry Hastings added the comment: Also, for what it's worth: I just ran my checksum benchmarker using a freshly built python a la my PR. Here are my results when hashing 462183782 bytes (dicey-dungeons-linux64.zip): hash algorithm timebytes/sec si

[issue39298] add BLAKE3 to hashlib

2022-03-04 Thread Larry Hastings
Larry Hastings added the comment: Okay, so. Here's a PR that adds BLAKE3 support to hashlib. The code was straightforward; I just took the BLAKE2 module and modified it to only have one algorithm. I also copied over the whole C directory tree from BLAKE3, which is totally okay fine by thei

[issue39298] add BLAKE3 to hashlib

2022-03-04 Thread Larry Hastings
Change by Larry Hastings: -- pull_requests: +29805 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/31686

[issue39298] add BLAKE3 to hashlib

2022-02-19 Thread Jack O'Connor
Jack O'Connor added the comment: Yes, everything in https://github.com/BLAKE3-team/BLAKE3 and https://github.com/oconnor663/blake3-py is public domain via CC0, and dual licensed under Apache for good measure. Hopefully that makes it easy to use it anywhere.

[issue39298] add BLAKE3 to hashlib

2022-02-18 Thread Larry Hastings
Larry Hastings added the comment: Just checking--I can liberally pull code from https://github.com/BLAKE3-team/BLAKE3 yes?

[issue39298] add BLAKE3 to hashlib

2022-02-17 Thread Larry Hastings
Larry Hastings added the comment: I thought someone volunteered to do it--if that's not happening, I could take a look at it next week. Shouldn't be too hard... unless I have to touch autoconf, which I only barely understand.

[issue39298] add BLAKE3 to hashlib

2022-02-17 Thread Jack O'Connor
Jack O'Connor added the comment: What's the best way for me to help with the next steps of this?

[issue39298] add BLAKE3 to hashlib

2022-01-12 Thread Jack O'Connor
Jack O'Connor added the comment: Yeah by intrinsics I mean stuff like _mm256_add_epi32(). All of that stuff is in these vendored files: blake3_avx2.c blake3_avx512.c blake3_neon.c blake3_sse2.c blake3_sse41.c Also to Michał's question above, I'm not necessarily opposed to publishing somethin

[issue39298] add BLAKE3 to hashlib

2022-01-12 Thread Larry Hastings
Larry Hastings added the comment: I assume by "intrinsics" you mean using the GCC SIMD stuff, not like inlining memcpy() or something. My assumption is yes, that's fine, we appear to already be using them for BLAKE2.

[issue39298] add BLAKE3 to hashlib

2022-01-12 Thread Jack O'Connor
Jack O'Connor added the comment: > As a first pass I say we merge the reference C implementation. Do you mean portable-only C code, or portable + intrinsics? If the assembly files are out, I'd advocate for the latter. The intrinsics implementations are nearly as fast as the assembly code, an

[issue39298] add BLAKE3 to hashlib

2022-01-12 Thread Larry Hastings
Larry Hastings added the comment: > In setup.py I assume that the target platform of the build is the same as the > current interpreter's platform. If this is included in CPython, it won't be using setup.py, so this isn't a concern. I don't think there's a way to use setup.py to cross-compi

[issue39298] add BLAKE3 to hashlib

2022-01-12 Thread Jack O'Connor
Jack O'Connor added the comment: I was about to say the only missing feature was docstrings, but then I realized I hadn't included releasing the GIL. I've added that and pushed an update just now. Fingers crossed there's nothing else I've missed. I think it's in reasonably good shape, and I'
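Releasing the GIL matters because it lets several threads hash at the same time. The hashlib documentation states that the built-in hashes release the GIL for updates larger than 2047 bytes, so the effect can be sketched with a stdlib algorithm standing in for BLAKE3. The helper names `hash_one` and `hash_many` are hypothetical, and SHA-256 is used only because it is always present:

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_one(buf):
    # One large update(); stdlib hashes drop the GIL for inputs of this size,
    # so other threads can run their own update() concurrently.
    h = hashlib.sha256()
    h.update(buf)
    return h.hexdigest()


def hash_many(buffers, workers=4):
    # pool.map preserves input order, so results line up with `buffers`.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(hash_one, buffers))


# Eight distinct 1 MiB buffers.
buffers = [bytes([i]) * (1 << 20) for i in range(8)]
parallel = hash_many(buffers)
sequential = [hash_one(b) for b in buffers]
print(parallel == sequential)
```

An extension module that holds the GIL during hashing would produce the same digests here but would serialize the worker threads.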

[issue39298] add BLAKE3 to hashlib

2022-01-12 Thread Michał Górny
Michał Górny added the comment: I would still find it helpful to have a "proper" "blake3-c" package on normal pypi, for those of us who can't rely on Rust being present immediately.

[issue39298] add BLAKE3 to hashlib

2022-01-11 Thread Larry Hastings
Larry Hastings added the comment: So, can we shoot for adding this to 3.11? Jack, do you consider the code is in good shape? I'd be up for shepherding it along in the process. In particular, I can contribute the bindings so BLAKE3 is a first-class citizen of hashlib.

[issue39298] add BLAKE3 to hashlib

2022-01-11 Thread Jack O'Connor
Jack O'Connor added the comment: Ah, good idea. I've published the new C implementation as: https://test.pypi.org/project/blake3-experimental-c/ You can install it with: pip install -i https://test.pypi.org/simple/ blake3-experimental-c Despite the package name change, the extension module
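Once such a package is installed, a quick sanity check is that one-shot and streamed hashing agree. The sketch below assumes the extension is importable as `blake3` and exposes a `blake3()` constructor with `update()` and `hexdigest()`, as the Rust-based package on PyPI does; whether the experimental C package matches is an assumption here. If the import fails, BLAKE2b from hashlib is used as a stand-in so the sketch still runs. `hexdigest_of` is a hypothetical helper name:

```python
import hashlib

try:
    # Assumption: the installed extension is importable as `blake3`,
    # like the Rust-based package on PyPI.
    from blake3 import blake3 as new_hasher
    ALGORITHM = "blake3"
except ImportError:
    # Stand-in so the sketch runs without the package installed.
    new_hasher = hashlib.blake2b
    ALGORITHM = "blake2b"


def hexdigest_of(chunks):
    """Feed the chunks to a fresh hasher one by one and return the hex digest."""
    h = new_hasher()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


one_shot = hexdigest_of([b"hello world"])
streamed = hexdigest_of([b"hello ", b"world"])
print(ALGORITHM, one_shot == streamed)
```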

[issue39298] add BLAKE3 to hashlib

2022-01-11 Thread Christian Heimes
Christian Heimes added the comment: You could upload the code to https://test.pypi.org/

[issue39298] add BLAKE3 to hashlib

2022-01-10 Thread Jack O'Connor
Jack O'Connor added the comment: Update: There is now a C version of the `blake3` Python module available at https://github.com/oconnor663/blake3-py/tree/master/c_impl. It's completely API-compatible with the Rust version, and it passes the same test suite. Multithreading (which is implement

[issue39298] add BLAKE3 to hashlib

2021-09-05 Thread Jack O'Connor
Jack O'Connor added the comment: Hi Michał, no I haven't done any more work on this since my comments back in April. If you wanted to get started on a PyPI implementation, I think that would be fantastic. I'd be happy to collaborate over email: oconnor...@gmail.com. The branches I linked are

[issue39298] add BLAKE3 to hashlib

2021-09-04 Thread Michał Górny
Michał Górny added the comment: Jack, are you still working on this? I was considering allocating the time to write the bindings for the C library myself but I've stumbled upon this bug and I suppose there's no point in duplicating work. I'd love to see it on pypi, so we could play with it

[issue39298] add BLAKE3 to hashlib

2021-04-19 Thread Jack O'Connor
Jack O'Connor added the comment: Hey Christian, yes these are new bindings, and also incomplete. See comments in https://github.com/oconnor663/cpython/commit/dc6f6163ad9754c9ad53e9e3f3613ca3891a77ba, but in short only x86-64 Unix is in working order. If 3.10 doesn't seem realistic, I'm happy

[issue39298] add BLAKE3 to hashlib

2021-04-18 Thread Christian Heimes
Christian Heimes added the comment: 3.10 feature freeze is in two weeks (May 3). I don't feel comfortable adding so much new C code shortly before beta 1. If I understand correctly the code is new and hasn't been published on PyPI yet. I also don't have much time to properly review the cod

[issue39298] add BLAKE3 to hashlib

2021-04-18 Thread Larry Hastings
Larry Hastings added the comment: I note that Python already ships with some #ifdefs around SSE and the like. So, yes, we already do this sort of thing, although I think this usually uses compiler intrinsics rather than actual assembly. A quick grep shows zero .s files and only one .asm fi

[issue39298] add BLAKE3 to hashlib

2021-04-18 Thread Jack O'Connor
Jack O'Connor added the comment: An update a year later: I have a proof-of-concept branch that adds BLAKE3 support to hashlib: https://github.com/oconnor663/cpython/tree/blake3. That branch is API compatible with the current master branch of https://github.com/oconnor663/blake3-py. Both that

[issue39298] add BLAKE3 to hashlib

2020-03-04 Thread Jack O'Connor
Jack O'Connor added the comment: I've just published some Python bindings for the Rust implementation on PyPI: https://pypi.org/project/blake3 > I'm guessing Python is gonna hold off until BLAKE3 reaches 1.0. That's very fair. The spec and test vectors are set in stone at this point, but th

[issue39298] add BLAKE3 to hashlib

2020-02-19 Thread Maor Kleinberger
Change by Maor Kleinberger: -- nosy: +kmaork

[issue39298] add BLAKE3 to hashlib

2020-02-12 Thread Larry Hastings
Larry Hastings added the comment: Personally I'm enjoying these BLAKE3 status updates, and I wouldn't mind at all being kept up-to-date during BLAKE3's development via messages on this issue. But, given the tenor of the conversation so far, I'm guessing Python is gonna hold off until BLAKE3

[issue39298] add BLAKE3 to hashlib

2020-02-12 Thread Jack O'Connor
Jack O'Connor added the comment: Version 0.2.0 of the BLAKE3 repo includes optimized assembly implementations. These are behind the "c" Cargo feature for the `blake3` Rust crate, but included by default for the internal bindings crate. So the easiest way to rerun our favorite benchmark is:

[issue39298] add BLAKE3 to hashlib

2020-02-01 Thread Jakub Stasiak
Change by Jakub Stasiak: -- nosy: +jstasiak

[issue39298] add BLAKE3 to hashlib

2020-01-27 Thread Larry Hastings
Larry Hastings added the comment: I just tried it with clang, and uff-da! 2,737,446,868 bytes/sec! p.s. I compiled with -O3 for both gcc and clang

[issue39298] add BLAKE3 to hashlib

2020-01-27 Thread Larry Hastings
Larry Hastings added the comment: I gave it a go. And yup, I see a definite improvement: it jumped from 1,583,326,242 bytes/sec to 2,376,741,703 bytes/sec on my Intel laptop using AVX2. A 50% improvement! I also *think* I'm seeing a 10% improvement in ARM using NEON. On my DE10-Nano boar
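The "50% improvement" figure follows directly from the two throughput numbers quoted in this comment. A small sketch of the arithmetic, with `speedup_percent` as a hypothetical helper name:

```python
def speedup_percent(before, after):
    """Relative throughput gain, in percent, going from `before` to `after`."""
    return (after / before - 1.0) * 100.0


# Throughput figures (bytes/sec) quoted for the Intel laptop using AVX2.
avx2_gain = speedup_percent(1_583_326_242, 2_376_741_703)
print(f"{avx2_gain:.1f}% faster")
```

The result comes out just above 50 percent, consistent with the rounded figure in the comment.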

[issue39298] add BLAKE3 to hashlib

2020-01-22 Thread Jack O'Connor
Jack O'Connor added the comment: Version 0.1.3 of the official BLAKE3 repo includes some significant performance improvements: - The x86 implementations include explicit prefetch instructions, which helps with long inputs. (commit b8c33e1) - The C implementation now uses the same parallel pa

[issue39298] add BLAKE3 to hashlib

2020-01-17 Thread Jack O'Connor
Jack O'Connor added the comment: I plan to bring the C code up to speed with the Rust code this week. As part of that, I'll probably remove comments like the one above :) Otherwise, is there anything else we can do on our end to help with this?

[issue39298] add BLAKE3 to hashlib

2020-01-16 Thread Jack O'Connor
Jack O'Connor added the comment: Ok, I've added Rust bindings to the BLAKE3 C implementation, so that I can benchmark it in a vaguely consistent way. My laptop is an i5-8250U, which should be very similar to yours. (Both are "Kaby Lake Refresh".) My end results do look similar to yours with T

[issue39298] add BLAKE3 to hashlib

2020-01-13 Thread Larry Hastings
Larry Hastings added the comment: According to my order details it is an "8th Generation Intel Core i7-8650U".

[issue39298] add BLAKE3 to hashlib

2020-01-13 Thread Jack O'Connor
Jack O'Connor added the comment: I'm in the middle of adding some Rust bindings to the C implementation in github.com/BLAKE3-team/BLAKE3, so that `cargo test` and `cargo bench` can cover both. Once that's done, I'll follow up with benchmark numbers from my laptop (Kaby Lake i5-8250U, also AV

[issue39298] add BLAKE3 to hashlib

2020-01-11 Thread Larry Hastings
Larry Hastings added the comment: For what it's worth, I spent some time producing clean benchmarks. All these were run on the same laptop, and all pre-load the same file (406668786 bytes) and run one update() on the whole thing to minimize overhead. K12 and BLAKE3 are using a hand-written
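The methodology described here (pre-load the data, then time a single `update()` over all of it) can be sketched with the hashes hashlib already guarantees. This is not the benchmark program used in the thread: the function name `bytes_per_second`, the 4 MiB zero-filled buffer, and the choice of algorithms are all assumptions made for illustration, and BLAKE3 and K12 are left out because they are not in the standard library:

```python
import hashlib
import time


def bytes_per_second(name, data, repeats=3):
    """Best-of-`repeats` throughput for one update() over a pre-loaded buffer."""
    best = float("inf")
    for _ in range(repeats):
        h = hashlib.new(name)
        start = time.perf_counter()
        h.update(data)  # a single call, to minimize per-call overhead
        h.digest()
        best = min(best, time.perf_counter() - start)
    return len(data) / best


# Stand-in input: the thread used a real file of about 400 MB.
data = bytes(4 * 1024 * 1024)
results = {
    name: bytes_per_second(name, data)
    for name in ("sha256", "sha512", "sha3_256", "blake2b")
}
for name, rate in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {rate:15,.0f} bytes/sec")
```

Taking the best of several runs reduces noise from scheduling and frequency scaling, but absolute numbers still depend heavily on the CPU and on whether OpenSSL supplies an accelerated implementation.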

[issue39298] add BLAKE3 to hashlib

2020-01-11 Thread Christian Heimes
Christian Heimes added the comment: I've been playing with the new algorithm, too. Pretty impressive! Let's give the reference implementation a while to stabilize. The code has comments like: "This is only for benchmarking. The guy who wrote this file hasn't touched C since college. Please d

[issue39298] add BLAKE3 to hashlib

2020-01-10 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan: -- nosy: +xtreak

[issue39298] add BLAKE3 to hashlib

2020-01-10 Thread Dong-hee Na
Change by Dong-hee Na: -- nosy: +corona10

[issue39298] add BLAKE3 to hashlib

2020-01-10 Thread Larry Hastings
New submission from Larry Hastings: From 3/4 of the team that brought you BLAKE2, now comes... BLAKE3! https://github.com/BLAKE3-team/BLAKE3 BLAKE3 is a brand new hashing function. It's fast, it's parallelizable, and unlike BLAKE2 there's only one variant. I've experimented with it a lit