[issue39295] usage of bitfields in ctypes structures changed between 3.7.5 and 3.7.6
New submission from Matthew Newville : We have a library (https://github.com/pyepics/pyepics) that wraps several C structures for a communication protocol library that involves many C->Python callbacks. One of the simpler structures we wrap with ctypes is defined with typedef struct ca_access_rights { unsignedread_access:1; unsignedwrite_access:1; } caar; struct access_rights_handler_args { long chanId; /* channel id */ caar access; /* access rights state */ }; which we had wrapped (perhaps naively) as class access_rights_handler_args(ctypes.Structure): "access rights arguments" _fields_ = [('chid', ctypes.c_long), ('read_access', ctypes.c_uint, 1), ('write_access', ctypes.c_uint, 1)] which we would then this structure as the function argument of a callback function that the underlying library would call, using _Callback = ctypes.CFUNCTYPE(None, ctypes.POINTER(access_rights_handler_args))(access_rights_handler) and the python function `access_righte_handler` would be able to unpack and use this structure. This worked for Python 2.7, 3.3 - 3.7.5 on 64-bit Linux, Windows, and MacOS. This code was well-tested and was used in production code on very many systems. It did not cause segfaults. With Python 3.7.6 this raises an exception at the ctypes.CFUNCTYPE() call with ./lib/python3.7/ctypes/__init__.py", line 99, in CFUNCTYPE class CFunctionType(_CFuncPtr): TypeError: item 1 in _argtypes_ passes a struct/union with a bitfield by value, which is unsupported. We were able to find a quick work-around this by changing the structure definition to be class access_rights_handler_args(ctypes.Structure): "access rights arguments" _fields_ = [('chid', ctypes.c_long), ('access', ctypes.c_ubyte)] and then explicitly extract the 2 desired bits from the byte. Of course, that byte is more data than is being sent in the structure, so there is trailing garbage. This change seems to have been related to https://bugs.python.org/issue16576. Is there any way to restore the no-really-I'm-not-making-it-up-it-was-most-definitely-working-for-us behavior of Python 3.7.5 and earlier? If this is not possible, what would be the right way to wrap this sort of structure? Thanks -- components: ctypes messages: 359763 nosy: Matthew Newville priority: normal severity: normal status: open title: usage of bitfields in ctypes structures changed between 3.7.5 and 3.7.6 type: behavior versions: Python 3.7 ___ Python tracker <https://bugs.python.org/issue39295> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39295] usage of bitfields in ctypes structures changed between 3.7.5 and 3.7.6
Matthew Newville added the comment: Thanks for the reply and the fix -- I have not tried the master branch, but will try to do that soon. If I understand correctly, we will have to stick with our kludgy "workaround" version in order to work with Python 3.7.6 and 3.8.1. Or is there a better approach than our workaround of using class access_rights_handler_args(ctypes.Structure): "access rights arguments" _fields_ = [('chid', ctypes.c_long), ('access', ctypes.c_ubyte)] ? As a long-time (20 years) Python user and first-time reporter of a bug to main Python, I'm both very appreciative of the effort and slightly alarmed by reading the messages related to #16575. From far outside the Python dev world, it appears that an old, poorly verified bug report inspired a change that was actually not well tested and so incorrectly broke valid code without deprecation. Trying to be as polite as possible, this appears to indicate a poor testing process, if not a poor understanding of the actual code in question. Trust is an important aspect of open source software, and much easier to lose than gain. I strongly encourage you and other Python devs to carefully assess what went wrong here and to work out (and write down) what will be done going forward to avoid such problems. Simply rolling this change back and saying "sorry, but we're overworked volunteers and stuff happens" is not going to regain lost trust. In fact, it's pretty close to a promise that this sort of issue will happen again. I think that you may want to make sure that it is not the take-away message here. Sorry if that sounds in any way unappreciative. Thanks. -- ___ Python tracker <https://bugs.python.org/issue39295> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39295] usage of bitfields in ctypes structures changed between 3.7.5 and 3.7.6
Matthew Newville added the comment: So, again, I'm trying to understand what the best workaround for this change is. I asked "can this workaround be improved" twice and got no reply, while getting plenty of responses to questions about the development process. I take this to mean that the workaround we have is the best available. That's unfortunate. >> it appears that an old, poorly verified bug report > > Why do you say the bug report is poorly verified? The libffi maintainers > accept it is an issue. > > https://github.com/libffi/libffi/issues/33 Well, it is more than six years old. It did not appear that lots of people were saying "yeah me too, please fix!". Maybe I missed those calls of urgency? >> inspired a change that was actually not well tested and so incorrectly >> broke valid code without deprecation. Trying to be as polite as possible, >> >> this appears to indicate a poor testing process, if not a poor >> understanding of the actual code in question. > Well, the original developer of ctypes is no longer involved in maintaining > > it. Those of us who try to address ctypes issues are perhaps not as well- > versed in the code as the original developer and maintainer, but we do our > best. This of course applies to other areas in CPython, and many other > projects besides. I think you are agreeing with me ;). That worries me. > The change was accompanied by tests, which have been no doubt found > wanting, It appears that the change was *intended* to fix a problem with Unions, but had the unintended consequence of not allowing any structures with bitfields. That suggests that the ctypes tests don't include structures with bitfields, These seem sort of common to me, so it appears that testing of ctypes is far less comprehensive than I had imagined. > but do *you* write software which guarantees no bugs ever in released > versions? Of course not, and that is not the expectation. It's a bit alarming to hear Python devs be so defensive and using such straw man arguments. What is expected is that working code does not break without warning or deprecation and that testing is sort of complete. It is expected that when changes unintentionally break working code that the devs take a step back and say "wait, how did that happen? what are we not testing that our users are expecting to work?". It is also expected that problems are acknowledged and fixed in a timely manner. And yes, to the extent possible, we try to do those things. With Py 3.7.6 available and installed with `conda update`, this break was a very urgent problem (library failed to import!) for us and our downstream users. Within 36 hours of the first report of the problem, we had a workaround verified and pushed to PyPI. The response of Python team to the original problem and to the unintended breakage were much slower. > Using your example above, I will look into what was missed > and try to improve the checking. The underlying libffi issue is a real > one. The change wasn't introduced in a cavalier manner, as you seem to > be suggesting. Well, the change did wait six years from the first report and yet did not include a deprecation cycle. If that TypeError had been a Warning for a few releases, you would have no doubt heard questions like "why is this working code going to be deprecated" instead of "why did Python break by library". > Do you regularly test your code with Python alpha and beta versions? > I ask because I may reinstate the check for Python 3.9 after seeing what > false-positive cases were missed here. Python 3.9 is at alpha 2 level > right now. Continued feedback could help to minimise the chances of > future surprises. I have never tested an `alpha` versions, and not used many `beta` version either (code development is not actually my full-time job) -- certainly not in the Python 3 series. I have trusted the Python devels. This has worked well for many years and Python versions. But that trust, especially concerning ctypes, is diminished (a structure with bitfields was unexpected usage??) and we probably should be testing beta versions regularly. -- ___ Python tracker <https://bugs.python.org/issue39295> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39295] usage of bitfields in ctypes structures changed between 3.7.5 and 3.7.6
Matthew Newville added the comment: @eryksun Sorry for the imprecision -- I was mixing what we do on Linux and Windows. A minimum verifiable example for Linux/MacOS would be import ctypes class bitstruct(ctypes.Structure): _fields_ = [('addr', ctypes.c_long), ('rbit', ctypes.c_uint, 1), ('wbit', ctypes.c_uint, 1)] def handler(args): print("handler: ", args.addr, args.rbit, args.wbit) callback = ctypes.CFUNCTYPE(None, bitstruct)(handler) This works with 3.7.5 but raises a TypeError with 3.7.6. For Windows (or, well, 64-bit Windows, the only kind we bother to support), we find that we have to wrap the function and use a POINTER to the struct, so what we really use is more like import os, functools def make_callback(args, func): """ make callback function""" @functools.wraps(func) def wrapped(arg, **kwargs): if hasattr(arg, 'contents'): return func(arg.contents, **kwargs) return func(arg, **kwargs) if os.name =='nt': # also check for 64-bit cb = ctypes.CFUNCTYPE(None, ctypes.POINTER(args))(wrapped) else: cb = ctypes.CFUNCTYPE(None, bitstruct)(handler) return cb callback = make_callback(bitstruct, handler) > ... > This seems rights to me. There is no problem passing a pointer > as a function parameter. The problem here is that code that worked with 3.7.5 raises a TypeError with 3.7.6. I don't know that the solution we came up with is actually the best approach. I've asked for such guidance a few times now. I don't know why using a pointer would be required for a structure containing a "u_int, 1", but not for other structures, but any guidance would be much appreciated. -- ___ Python tracker <https://bugs.python.org/issue39295> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com