Jack O'Connor <oconnor...@gmail.com> added the comment:
I was about to say the only missing feature was docstrings, but then I realized I hadn't included releasing the GIL. I've added that and pushed an update just now. Fingers crossed there's nothing else I've missed. I think it's in reasonably good shape, and I'd like to propose it for inclusion in 3.11. However, I'm not very experienced with setup.py or the Python C API, so I may not be the best judge of that.

Here are some highlights for reviewers, where I think the implementation (mostly the build) could be shaky:

- Platform detection. In setup.py I assume that the target platform of the build is the same as the current interpreter's platform. For example, if the current interpreter has sys.maxsize == 2**31 - 1, I assume we're compiling for 32-bit. This clearly doesn't allow for any sort of cross-compilation, and in general I'd like feedback on whether there's a more correct way to obtain the target platform. (A simplified sketch of this check follows the list below.)

- Compiling assembly files. On Unix this is easy, because we can supply *.S files as `extra_objects` and GCC/Clang will do the right thing. On Windows it isn't. Currently I shell out to vswhere.exe, ask it for the path to the latest version of the ml64.exe assembler, and then shell out to that to build .obj files, which I then pass as `extra_objects`. This feels awfully manual, and there's a good chance I've missed some better-supported way to do it. I assume we don't want to check in the .obj files?

- Does Python support the GNU ABI on Windows? We have assembly files for it in vendor/, but I'm not currently building them.

- Compiling intrinsics files for 32-bit x86. Here I create a `ccompiler.new_compiler()` for each intrinsics file, so that I can set the appropriate flags for each. This is relatively clean, but it means those files are rebuilt every single time rather than participating in `setup.py build` caching. Maybe nobody cares about this, but it makes me suspect there's a better-supported way to do it. (A sketch of this pattern also follows the list below.)

- blake3module.c contains an awful lot of gotos to handle allocation-failure cases. Is this still considered a best practice? These paths are bug-prone, and I'm not sure how to test them.

- Existing hashlib implementations include an optimization where they avoid allocating an internal mutex until they see a long input and want to release the GIL. As a quirky side effect of this, they handle allocation failure for that mutex by falling back to the do-not-release-the-GIL code path. That feels kind of complicated to me, so in my code I always allocate the mutex during initialization. This doesn't seem to make a measurable performance difference when I benchmark it, but there may be use cases I haven't considered.
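To make the platform-detection concern above concrete, here is roughly the shape of the check (simplified and illustrative, not the exact code in setup.py):

    import sys

    def target_is_64_bit():
        # Infer the target word size from the *running* interpreter: a 32-bit
        # interpreter has sys.maxsize == 2**31 - 1. This is exactly why the
        # current approach can't handle cross-compilation.
        return sys.maxsize > 2**32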
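Similarly, here is the rough per-file pattern I use for the 32-bit x86 intrinsics files (simplified; the file path and the GCC/Clang-style flags are just examples):

    from distutils.ccompiler import new_compiler
    from distutils.sysconfig import customize_compiler

    def compile_with_flags(source, flags, output_dir):
        cc = new_compiler()
        customize_compiler(cc)  # pick up CC/CFLAGS the way build_ext would
        # Returns the .o/.obj paths, which then get passed as extra_objects.
        return cc.compile([source], output_dir=output_dir, extra_postargs=flags)

    sse41_objects = compile_with_flags("vendor/blake3_sse41.c", ["-msse4.1"], "build")

Running each file through its own compiler object like this is what bypasses the `setup.py build` caching mentioned above.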
Here are some other API details that might be worth bikeshedding:

- The `max_threads` parameter is currently defined to take a special value, `blake3.AUTO`, to indicate that the implementation may use as many threads as it likes. (The C implementation doesn't include multithreading support, but it's API-compatible with the Rust implementation.) `blake3.AUTO` is currently a class attribute equal to -1. We may want to bikeshed this name or propose some other representation.

- BLAKE3 has three modes: regular hashing, keyed hashing, and key derivation. The keyed hashing mode takes a 32-byte key, and the key derivation mode takes a context string. Calling the 32-byte key `key` seems good. Calling the context string `context` seems less good: Larry has pointed out before that lots of unrelated things are called `context`, and readers might not understand what they're seeing. I currently call it `derive_key_context` instead, but we could bikeshed this too.

- I check `itemsize` on input Py_buffers and raise an exception if it's anything other than 1. My test suite exercises this; see `test_int_array_fails`. However, some (all?) of the standard hashes don't do this check. For example:

    >>> from hashlib import sha256
    >>> import array
    >>> a = array.array("i", [255])
    >>> sha256(a).hexdigest()
    '81ff65efc4487853bdb4625559e69ab44f19e0f5efbd6d5b2af5e3ab267c8e06'
    >>> sha256(bytes([0xff, 0, 0, 0])).hexdigest()
    '81ff65efc4487853bdb4625559e69ab44f19e0f5efbd6d5b2af5e3ab267c8e06'

  Here we can see sha256() hashing an array of ints. On my machine an int is 4 little-endian bytes, but of course this isn't portable: the same array will produce a different SHA-256 output on a big-endian machine, or on a machine with ints of a different size. This seems undesirable, and I'm surprised that hashlib allows it. However, if there's some known compatibility reason why we have to allow it, I could remove this check.

----------
versions: +Python 3.11 -Python 3.10

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39298>
_______________________________________