[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-31 Thread STINNER Victor
STINNER Victor added the comment:
> My use case for these low-level APIs is to write tests for low-level string/encoding handling in my custom use of the PyPreConfig and PyConfig structs. I wanted to verify that exact byte sequences were turned into specific representations inside of P

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-30 Thread Gregory Szorc
Gregory Szorc added the comment: My use case for these low-level APIs is to write tests for low-level string/encoding handling in my custom use of the PyPreConfig and PyConfig structs. I wanted to verify that exact byte sequences were turned into specific representations inside of Python str

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-30 Thread STINNER Victor
STINNER Victor added the comment:
> PyUnicode_KIND does *not* expose the implementation details to the programmer.
PyUnicode_KIND() is very specific to the exact PEP 393 implementation. Documentation of this field:
---
/* Character size:
   - PyUnicode_WCHAR_KIND (0):
     * character type

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-30 Thread Petr Viktorin
Petr Viktorin added the comment: PyUnicode_KIND does *not* expose the implementation details to the programmer. If the internal representation of strings is switched to use masks and shifts instead of bit fields, PyUnicode_KIND (and others) can be adapted to the new details without breaking A
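
To illustrate the "masks and shifts instead of bit fields" idea mentioned above, here is a minimal sketch in C. It is not CPython's actual layout: the type demo_str_state, the DEMO_* constants, and the demo_* accessors are all hypothetical names chosen for this example. The point is only that the position of each flag inside a fixed-width integer is spelled out in the source rather than left to the compiler's bit field allocation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state word, NOT CPython's real structure: all flags are
   packed into one fixed-width integer instead of a group of bit fields. */
typedef struct {
    uint32_t state;
} demo_str_state;

/* Assumed positions, picked only for illustration. */
#define DEMO_KIND_SHIFT   2u
#define DEMO_KIND_MASK    (0x7u << DEMO_KIND_SHIFT)   /* 3 bits */
#define DEMO_ASCII_SHIFT  6u
#define DEMO_ASCII_MASK   (0x1u << DEMO_ASCII_SHIFT)  /* 1 bit  */

/* Read the 3-bit "kind" value. */
static inline unsigned demo_get_kind(const demo_str_state *s)
{
    return (unsigned)((s->state & DEMO_KIND_MASK) >> DEMO_KIND_SHIFT);
}

/* Overwrite the "kind" value, leaving every other bit untouched. */
static inline void demo_set_kind(demo_str_state *s, unsigned kind)
{
    assert(kind <= 7u);  /* must fit in 3 bits */
    s->state = (s->state & ~DEMO_KIND_MASK)
             | ((uint32_t)kind << DEMO_KIND_SHIFT);
}

/* Read the 1-bit "ascii" flag as 0 or 1. */
static inline int demo_is_ascii(const demo_str_state *s)
{
    return (s->state & DEMO_ASCII_MASK) != 0;
}

/* Set or clear the "ascii" flag. */
static inline void demo_set_ascii(demo_str_state *s, int flag)
{
    if (flag) {
        s->state |= DEMO_ASCII_MASK;
    } else {
        s->state &= ~DEMO_ASCII_MASK;
    }
}
```

Callers that only go through accessors such as demo_get_kind() would not need to change if the shift or mask values were later revised, which is the property the comment attributes to PyUnicode_KIND.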

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-30 Thread STINNER Victor
STINNER Victor added the comment:
> In order to avoid undefined behavior, Python's C API should avoid all use of bit fields.
See also PEP 620. IMO more generally, the C API should not expose structures, but provide ways to access them through getter and setter functions. See bpo-40120 "
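
The getter/setter approach described above might look roughly like the following C sketch. Everything here is hypothetical (demo_obj and the demo_obj_* functions are not part of Python's C API). The "public" part declares only an incomplete type and functions, so the bit fields in the "private" part never appear in anything a caller compiles against.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- What a public header would expose (hypothetical API) ---- */
typedef struct demo_obj demo_obj;           /* incomplete type: layout hidden */

demo_obj *demo_obj_new(void);
void      demo_obj_free(demo_obj *o);
unsigned  demo_obj_get_kind(const demo_obj *o);
void      demo_obj_set_kind(demo_obj *o, unsigned kind);

/* ---- What stays private to the implementation file ---- */
struct demo_obj {
    /* Bit fields are fine here: only this translation unit sees them,
       so their compiler-specific layout never leaks to callers. */
    unsigned int kind  : 3;
    unsigned int ascii : 1;
};

demo_obj *demo_obj_new(void)
{
    /* calloc zero-initializes, so kind and ascii start at 0. */
    return (demo_obj *)calloc(1, sizeof(demo_obj));
}

void demo_obj_free(demo_obj *o)
{
    free(o);
}

unsigned demo_obj_get_kind(const demo_obj *o)
{
    return o->kind;
}

void demo_obj_set_kind(demo_obj *o, unsigned kind)
{
    assert(kind <= 7u);  /* must fit in the 3-bit field */
    o->kind = kind;
}
```

In a real project the two parts would live in separate files (a header and a .c file); they are shown together here only to keep the sketch self-contained.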

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-30 Thread STINNER Victor
STINNER Victor added the comment:
> The macro PyUnicode_KIND is part of the documented public C API.
IMO it was a mistake to expose it as part of the public C API. This is an implementation detail which should not be exposed. The C API should not expose *directly* how characters are stored i

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-30 Thread Petr Viktorin
Petr Viktorin added the comment: The macro PyUnicode_KIND is part of the documented public C API. It accesses the bit field "state.kind" directly.

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-30 Thread STINNER Victor
STINNER Victor added the comment:
> At least the PyASCIIObject struct in Include/cpython/unicodeobject.h uses bit fields. Various preprocessor macros like PyUnicode_IS_ASCII() and PyUnicode_KIND() access this struct's bit field.
What is your use case? Which functions do you need? You sh

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-30 Thread Georg Brandl
Change by Georg Brandl: nosy: +georg.brandl

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-30 Thread Georg Brandl
Change by Georg Brandl: nosy: +vstinner

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-27 Thread Erlend E. Aasland
Change by Erlend E. Aasland: nosy: +petr.viktorin

[issue45025] Reliance on C bit fields in C API is undefined behavior

2021-08-26 Thread Gregory Szorc
Change by Gregory Szorc: title: Reliance on C bit fields is C API is undefined behavior -> Reliance on C bit fields in C API is undefined behavior