Jack O'Connor <oconnor...@gmail.com> added the comment:

I was about to say the only missing feature was docstrings, but then I realized 
I hadn't included releasing the GIL. I've added that and pushed an update just 
now. Fingers crossed there's nothing else I've missed. I think it's in 
reasonably good shape, and I'd like to propose it for inclusion in 3.11.

However, I'm not very experienced with setup.py or the Python C API, so I might 
not be the best judge of that. Here are some highlights for reviewers, pointing 
out where I think the implementation (mostly the build) could be shaky:

- Platform detection. In setup.py I assume that the target platform of the 
build is the same as the current interpreter's platform. So for example, if the 
current interpreter has sys.maxsize==2**31-1, I assume we're compiling for 
32-bit. This clearly doesn't allow for any sort of cross-compilation, and in 
general I need feedback about whether there's a more correct way to obtain the 
target platform.
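
As a rough sketch of what I mean, the detection boils down to checks like these 
(the variable names are just for illustration, not necessarily what setup.py 
actually uses):

    import platform
    import sys

    # Assume the build target matches the running interpreter.
    is_64bit = sys.maxsize > 2**31 - 1
    machine = platform.machine().lower()
    is_x86 = machine in ("x86_64", "amd64", "x86", "i386", "i686")
    is_windows = sys.platform == "win32"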

- Compiling assembly files. On Unix this is easy, because we can supply *.S 
files as `extra_objects` and GCC/Clang will do the right thing. But on Windows 
this isn't easy. Currently I shell out to vswhere.exe, ask it for the path to 
the latest version of the ml64.exe assembler, and then shell out to that to 
build .obj files. Then I pass those assembled .obj files as `extra_objects`. 
This feels awfully manual, and there's a good chance I've missed some 
better-supported way to do it. I assume we don't want to check in the .obj 
files?
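
Very roughly, the Windows path does something like this (placeholder filenames, 
and the vswhere/ml64 invocations are simplified; the real code has more error 
handling):

    import subprocess

    # Documented default install location of vswhere.exe.
    VSWHERE = r"C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe"

    # Ask vswhere for the newest installed ml64.exe.
    ml64 = subprocess.run(
        [VSWHERE, "-latest", "-find", r"VC\Tools\MSVC\**\Hostx64\x64\ml64.exe"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()[0]

    # Assemble to a .obj file, which later gets passed in `extra_objects`.
    subprocess.run([ml64, "/nologo", "/c", "/Fo", "example.obj", "example.asm"],
                   check=True)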

- Does Python support the GNU ABI on Windows? We have assembly files for this 
in vendor/, but I'm not currently building them.

- Compiling intrinsics files for 32-bit x86. In this case, I create a 
`ccompiler.new_compiler()` for each intrinsics file, so that I can set the 
appropriate flags for each. This is relatively clean, but it leads to things 
getting rebuilt every single time, rather than participating in `setup.py 
build` caching. Maybe nobody cares about this, but again it makes me think 
there might be a better-supported way to do it.
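
The gist of it is something like this (simplified, with GCC/Clang-style flags as 
the example):

    from distutils import ccompiler

    def compile_intrinsics(source_file, flags):
        # A fresh compiler per intrinsics file, so that each file can get its
        # own flags (e.g. ["-msse4.1"] or ["-mavx2"]). The downside is that
        # these objects get rebuilt on every invocation of setup.py.
        compiler = ccompiler.new_compiler()
        return compiler.compile([source_file], extra_preargs=flags)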

- blake3module.c contains an awful lot of gotos to handle allocation failure 
cases. Is this still considered a best practice? These are bug-prone, and I'm 
not sure how to test them.

- Existing hashlib implementations include an optimization where they avoid 
allocating an internal mutex until they see a long input and want to release 
the GIL. As a quirky side effect of this, they handle allocation failures for 
that mutex by falling back to the do-not-release-the-GIL codepath. That feels 
kind of complicated to me, so in my code I always allocate the mutex during 
initialization. This doesn't seem to make much of a performance 
difference when I measure it, but there might be use cases I haven't considered.

Here are some other API details that might be worth bikeshedding:

- The `max_threads` parameter is currently defined to take a special value, 
`blake3.AUTO`, to indicate that the implementation may use as many threads as 
it likes. (The C implementation doesn't include multithreading support, but 
it's API-compatible with the Rust implementation.) `blake3.AUTO` is currently a 
class attribute equal to -1. We may want to bikeshed this name or propose some 
other representation.
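
For concreteness, usage is roughly this (the C extension accepts the parameter 
for compatibility but won't actually spawn threads):

    from blake3 import blake3

    # Let the implementation use as many threads as it likes.
    hasher = blake3(max_threads=blake3.AUTO)
    hasher.update(b"lots of input " * 100_000)
    digest = hasher.hexdigest()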

- BLAKE3 has three modes: regular hashing, keyed hashing, and key derivation. 
The keyed hashing mode takes a 32-byte key, and the key derivation mode takes a 
context string. Calling the 32-byte key `key` seems good. Calling the context 
string `context` seems less good. Larry has pointed out before that lots of 
random things are called `context`, and readers might not understand what 
they're seeing. I currently call it `derive_key_context` instead, but we could 
bikeshed this.
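
To make the three modes concrete, usage currently looks roughly like this (these 
parameter names are exactly what's up for bikeshedding):

    from blake3 import blake3

    # Regular hashing.
    blake3(b"some input").hexdigest()

    # Keyed hashing: `key` must be exactly 32 bytes.
    blake3(b"some input", key=b"\x00" * 32).hexdigest()

    # Key derivation: the context string should be hardcoded and
    # application-specific.
    blake3(b"key material", derive_key_context="[app name] [purpose]").hexdigest()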

- I check `itemsize` on input Py_buffers and raise an exception if it's 
anything other than 1. My test suite exercises this; see `test_int_array_fails`. 
However, some (all?) standard hashes don't do this 
check. For example:

    >>> from hashlib import sha256
    >>> import array
    >>> a = array.array("i", [255])
    >>> sha256(a).hexdigest()
    '81ff65efc4487853bdb4625559e69ab44f19e0f5efbd6d5b2af5e3ab267c8e06'
    >>> sha256(bytes([0xff, 0, 0, 0])).hexdigest()
    '81ff65efc4487853bdb4625559e69ab44f19e0f5efbd6d5b2af5e3ab267c8e06'

Here we can see sha256() hashing an array of int. On my machine, an int is 4 
little-endian bytes, but of course this sort of thing isn't portable. The same 
array will result in a different SHA-256 output on a big-endian machine, or on 
a machine with ints of a different size. This seems undesirable, and I'm 
surprised that hashlib allows it. However, if there's some known compatibility 
reason why we have to allow it, I could remove this check.
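
For reference, the check corresponds to a test roughly like this (a paraphrase 
of `test_int_array_fails`, assuming a pytest-style suite; the exact exception 
type in the real code may differ):

    import array

    import pytest
    from blake3 import blake3

    def test_int_array_fails():
        # Buffers whose itemsize isn't 1 (here, an array of C ints) are
        # rejected instead of being hashed as raw bytes.
        a = array.array("i", [255])
        with pytest.raises(Exception):
            blake3(a)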

----------
versions: +Python 3.11 -Python 3.10

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39298>
_______________________________________