[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
New submission from hydroflask : _CallPythonObject() in Modules/_ctypes/callbacks.c still uses the old API (PyObject_CallObject()) to invoke methods. It should use _PyObject_Vectorcall(). Creating an issue for this since this method is a little more performance sensitive than normal Python library code. -- components: ctypes messages: 410185 nosy: hydroflask priority: normal severity: normal status: open title: Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c type: performance versions: Python 3.10, Python 3.11, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: @corona10 Thank you for looking into it. The benchmark you did is not reflective of my use-case. I have a codebase that I have completely converted into a C extension using mypyc. So the ctypes callback does not call into interpreted Python code but actually another C-extension. I have annotated types aggressively to avoid invoking malloc(). In this use-case the extra overhead from the tuple allocation would be significant. Additionally your benchmark may not be as micro as intended. You call some math in the callback function which may have made the savings from avoiding the tuple allocation seem more like noise. Since you already did the work I still think it's worth putting it in since PyObject_Vectorcall is an established API and it would have been used anyway if this were written from scratch. -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: Okay I put together a benchmark that better exemplifies my use case and results are in favor of adding patch. When bench_callback_v2.py is not compiled with mypyc I see a ~10% reduction in runtime: $ unpatched-env/bin/python bench_callback_v2.py 1.1262263769749552 $ patched-env/bin/python bench_callback_v2.py 1.0174838998354971 When bench_callback_v2.py is compiled with mypyc, I am getting ~6% reduction in runtime: $ unpatched-env/bin/python -c "import bench_callback_v2" 1.0056699379347265 $ patched-env/bin/python -c "import bench_callback_v2" 0.9415687420405447 bench_callback_v2.py is attached. -- Added file: https://bugs.python.org/file50609/bench_callback_v2.py ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: Vanilla bench_callback_v2.py, not compiled with mypyc: $ env/bin/python bench_callback_v2.py 1.114047016017139 $ env2/bin/python bench_callback_v2.py 0.9644024870358407 ~13% reduction in runtime. Nice job! -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: > A benchmark on calling C functions using ctypes sounds better than a > benchmark calling Python functions. Calling C functions from Python is not the code path handled by _CallPythonObject() so no difference in run-time would theoretically be observed by that benchmark for this patch. This bug report pertains to code paths where a C function calls back into a Python function. A practical example is using Python with an event loop library written in C. -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: Just to clarify further, the original benchmark by corona10 did indeed lead to `_CallPythonObject` being invoked but it was not quite the normal use-case. In the original benchmark it invoked the generated ctypes thunk via Python so as vstinner said it was doing this: Python -> converters -> thunk-> _CallPythonObject -> converters-> Python Namely using `PyCFuncPtr_call` to invoke the thunk generated by `_ctypes_alloc_callback`. Normally when invoking C functions via `PyCFuncPtr_call` it looks like this: Python -> converters -> C_function In the original benchmark setup no significant reduction in runtime was observed by this patch. I noticed that it was not a typical use of `_CallPythonObject`, where instead it would be a top-level C function calling back into Python. Something like this: C -> thunk -> _CallPythonObject() -> Python The benchmark I provided exercises that use case and the >=10% reduction in runtime was observed. Thanks to both corona10 and vstinner, I appreciate their effort in this issue. -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: Two things, one is a nit pick the other is more serious. I think vstinner mentioned this in his review of your patch but on this line: https://github.com/python/cpython/commit/b5527688aae11d0b5af58176267a9943576e71e5#diff-706e65ee28911740bf638707e19578b8182e57c6a8a9a4a91105d825f95a139dR180 Instead of using PySequence_Fast_ITEMS(), you can just use PyTuple_GET_ITEM() since you know the converters are a tuple. In terms of runtime efficiency it doesn't change anything but is consistent with this line: https://github.com/python/cpython/commit/b5527688aae11d0b5af58176267a9943576e71e5#diff-706e65ee28911740bf638707e19578b8182e57c6a8a9a4a91105d825f95a139dR157 Though on second thought I think PySequence_Fast_ITEMS() is probably a better API overall in terms of efficiency if PyTuple_GET_ITEM() would eventually become a real function call given the long term push toward a stable API. The other issue is more serious, you are always allocating an array of CTYPES_MAX_ARGCOUNT pointers on the stack on every C callback. This could cause stack overflows in situations where a relatively deep set of callbacks are invoked. This usage of CTYPES_MAX_ARGCOUNT differs its usage in callproc.c, where in that case `alloca()` is used to allocate the specific number of array entries on the stack. To avoid an unintended stack overflow I would use alloca() or if you don't like alloca() I would only allocate a relatively small constant number on the stack. -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: I don't have github so I can't comment there but just as a nitpick, alloca() never returns NULL, so you don't have to check that. I know that is inconsistent with its use in callproc.c so for consistency's sake the NULL check should stay but I would remove both checks in a future change. More importantly you can completely avoid the args_stack variable since alloca() has the same end affect. I would also add an assert that the number of args is <= CTYPES_MAX_ARGCOUNT, that should be the case since GH 31188 -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: @corona10 I really hope I am not being annoying at this point :) One final nit, in this line: https://github.com/python/cpython/pull/31224/files#diff-706e65ee28911740bf638707e19578b8182e57c6a8a9a4a91105d825f95a139dR168 You do not have to check if nargs > 0. alloca() can handle the case when nargs == 0. The usage of alloca() in callproc.c does not check for nargs > 0 either. Otherwise thanks for this enhancement! -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: For reference, here are MSDN, Linux, OpenBSD, and GCC docs on alloca: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/alloca?view=msvc-170 https://www.man7.org/linux/man-pages/man3/alloca.3.html https://man.openbsd.org/alloca.3 https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_002a_005f_005fbuiltin_005falloca -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46697] _ctypes_simple_instance returns inverted logic
New submission from hydroflask : `_ctypes_simple_instance` in _ctypes.c returns the opposite logic of what its documentation claims. It is supposed to return true when the argument (a type object) is a direct subclass of `Simple_Type` (`_SimpleCData` in Python code). However it returns false instead. No bugs have manifested from this because all of the call sites ( `callproc.c::GetResult`, `callbacks.c::_CallPythonObject`,_`ctypes.c::PyCData_get`, `_ctypes.c::Simple_from_outparm`) invert the return value of this function. The last example, `ctypes.c::Simple_from_outparm` only calls `Simple_get_value()` when `_ctypes_simple_instance` returns false, which makes sense because otherwise the invocation of `_ctypes.c::Simple_from_outparm()` could trigger an assertion error. This is not just simply an issue of inverted logic because the logic isn't inverted in all cases. In `_ctypes_simple_instance` in the case when `PyCSimpleTypeObject_Check(type)` returns false, if this were supposed to be perfect inverted logic then the whole routine would return 1 (True) not 0. Fortunately, due to the way the code is structured, I don't think there is a case when `PyCSimpleTypeObject_Check(type)` returns false so the incorrect case where it returns a constant 0 is effectively dead code. I have compiled a version of Python with the attached patch and run "make test" with no issues. -- components: ctypes files: _ctypes_simple_instance_inverted.patch keywords: patch messages: 412947 nosy: hydroflask priority: normal severity: normal status: open title: _ctypes_simple_instance returns inverted logic type: enhancement versions: Python 3.10, Python 3.11, Python 3.7, Python 3.8, Python 3.9 Added file: https://bugs.python.org/file50612/_ctypes_simple_instance_inverted.patch ___ Python tracker <https://bugs.python.org/issue46697> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46697] _ctypes_simple_instance returns inverted logic
hydroflask added the comment: I place that patch into the public domain, I claim no ownership over that patch. The patch is attached purely for demonstration purposes. -- ___ Python tracker <https://bugs.python.org/issue46697> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: Thanks again everyone, very much appreciated. -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: Was reviewing the code and noticed that a double-free was introduced in the recent changes: https://github.com/python/cpython/blob/dd76b3f7d332dd6eced5cbc2ad2adfc397700b3d/Modules/_ctypes/callbacks.c#L192 That line should have been removed in https://github.com/python/cpython/commit/b5527688aae11d0b5af58176267a9943576e71e5 but it was missed! -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: Another bug, the array returned by alloca() should be zero-initialized. If an early return happens because of an intermediate error, Py_DECREF() will be called on uninitialized memory. Also it should be Py_XDECREF() I think. -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: Ignore the previous comment :) -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46323] Use _PyObject_Vectorcall in Modules/_ctypes/callbacks.c
hydroflask added the comment: Easy one to make, might be worth adding a test that would have exercised that code path. -- ___ Python tracker <https://bugs.python.org/issue46323> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
hydroflask added the comment: Any update on this? I know you wanted to repro but even in the absence of the repro, I think calling sqlite3_close() without releasing the GIL is error-prone. If there is no immediate plan to make this change you may close the issue :) -- ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47019] Fatal Python Error in sqlite3 Python 3.10
New submission from hydroflask : _destructor in connection.c in Python 3.10+ now calls `PyGILState_Ensure()`, this is a problem because if the destructor is being called while the thread is being torn down it will cause an unbalanced/erroneous call to "PyEval_RestoreThread" in PyGILState_Ensure which will eventually trigger a Fatal Python Error. A perfect repro has been attached, should be run on Linux. My recommended fix is to call sqlite3_close() within Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS, and manually Py_DECREF all connection-related functions afterward. -- components: Extension Modules, Library (Lib) files: sqlite3_fatal_python_error.py messages: 415212 nosy: erlendaasland, hydroflask priority: normal severity: normal status: open title: Fatal Python Error in sqlite3 Python 3.10 type: crash versions: Python 3.10, Python 3.11 Added file: https://bugs.python.org/file50676/sqlite3_fatal_python_error.py ___ Python tracker <https://bugs.python.org/issue47019> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47019] Fatal Python Error in sqlite3 Python 3.10
hydroflask added the comment: If you connect to the sqlite3 database in the example without using the factory, it will simply segfault. -- ___ Python tracker <https://bugs.python.org/issue47019> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47019] Fatal Python Error in sqlite3 Python 3.10
hydroflask added the comment: If you connect with check_same_thread=False, I believe the issue may still present itself on 3.11+ -- ___ Python tracker <https://bugs.python.org/issue47019> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47019] Fatal Python Error in sqlite3 Python 3.10
hydroflask added the comment: I don't see it immediately but I think it's still possible to happen since all the same offending code is in place. There are two reasosn why it probably doesn't happen in 3.11+: 1) because something is deferring calling the finalizer for the zero-ref object to another thread instead on immediately on the thread that is dying as in <=3.10 2) In 3.11+ it is fixed so that PyGILState_Ensure can be called in a sequence started by PyGILState_Release. @vstinner is this correct? If the reason it doesn't happen in 3.11+ is because of 1) then I don't think that is specified to happen anywhere and changing the GC in future versions could theoretically trigger the same bug. If it is 2) then it is fixed in a more robust location. IMO, the most robust fix is to destroy the function callbacks in connection_close() and avoid using the destructor_callback and calling PyGILState_Ensure() altogether. -- nosy: +vstinner ___ Python tracker <https://bugs.python.org/issue47019> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47019] Fatal Python Error in sqlite3 Python 3.10
hydroflask added the comment: > All callbacks triggered from external libraries must protect Python C API > calls. You may call it offending, but that is the reality; a callback may > trigger at any point in time, so we need to make sure the GIL is held before > calling any Py API. That is also the case for the destructor. Sure but I'm suggesting sidestepping and not using the destructor_callback entirely. You can DECREF the callbacks manually in connection_close() after calling sqlite3_close_v2(). Doing this makes sense to me since the destructor may be called under special conditions. -- ___ Python tracker <https://bugs.python.org/issue47019> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47019] Fatal Python Error in sqlite3 Python 3.10
hydroflask added the comment: If PyGILState_Ensure() has been fixed to become re-entrant during PyGILState_Release() in 3.11+ then I believe that change should be backported to 3.10. @vstinner would know -- ___ Python tracker <https://bugs.python.org/issue47019> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47019] Fatal Python Error in sqlite3 Python 3.10
hydroflask added the comment: So what is causing this bug in 3.10 is the following stack: thread_entry_point() PyGILState_Ensure() PyGILState_Release() ... PyGILState_Ensure() The core issue is that PyGILState_Ensure is getting called while the tstate itself is in the process of being destroyed. This causes an invalid state and eventually results in a segfault or "Fatal Python Error." I think the most robust fix is to allow re-entrancy to PyGILState_Release/PyGILState_Ensure. If that is prohibitively complex and/or is not specified to be allowed, I believe the most robust fix is to avoid using sqlite3 destructor callbacks to DECREF Python objects. -- ___ Python tracker <https://bugs.python.org/issue47019> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
hydroflask added the comment: Closing this issue since sqlite3_close{,_v2}() is no longer called with the GIL held in 3.11+ -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
hydroflask added the comment: Unfortunately I am currently unable to write a repro for this bug but I am 100% certain it exists from analyzing the code. In our application (which is a large multithreaded application) after I made the changes consistent with this report the deadlock vanished completely. When the deadlock occurred, the thread stacks were consistent with the information in this report. To get the deadlock to trigger, SQLite itself must take a lock on sqlite3_close() but I don't know very much about SQLite's internals to consistently provoke that behavior. I've attached my attempt at a repro which is similar enough to the pattern of our application but it was not working. -- Added file: https://bugs.python.org/file50192/test.py ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
hydroflask added the comment: You did the right thing in terms of editing the code. I was not able to get a deadlock with this code which I think should be representative of the error. In production the error was happening after 30 minutes of so. The major problem is that I don't exactly know how to provoke SQLite to acquire an internal lock. If we assume that it does acquire internal locks and other threads release the GIL before calling into SQLite functions that acquire an internal lock, then you can reason that if sqlite3_close() acquires that lock without first releasing the GIL a deadlock will certainly occur. This is the deadlock that my application was running into. -- ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
hydroflask added the comment: >> The major problem is that I don't exactly know how to provoke SQLite to >> acquire an internal lock. > IIRC, you can provoke the internal SQLite lock simply by using transaction > control: BEGIN (lock) => COMMIT / ROLLBACK (unlock). Ah okay, so the sequence would have to be this: thread1: pysqlite.some_operation() thread1: release gil thread1: sqlite3_some_procedure() thread1: acquire sqlite lock thread2: acquire gil thread2: pysqlite.close() thread2: sqlite3_close() thread2: acquire sqlite lock > I'll see if I can come up with a compact repro. It should be possible, good luck! -- ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
New submission from hydroflask : pysqlite_connection_dealloc() calls sqlite3_close{,_v2}(). This can cause a deadlock if sqlite3_close() tries to acquire a lock that another thread holds, due to a deadlock between the GIL and an internal sqlite3 lock. This is especially common with sqlite3's "shared cache mode." Since the GIL should not be released during a tp_dealloc function and python has no control over the behavior of sqlite3_close(), it is incorrect to call sqlite3_close() in pysqlite_connection_dealloc(). -- components: Library (Lib) messages: 383471 nosy: hydroflask priority: normal severity: normal status: open title: Deadlock in pysqlite_connection_dealloc() type: behavior versions: Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
Change by hydroflask : -- components: +Extension Modules ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
hydroflask added the comment: This is also a problem in pysqlite_connection_close() as currently implemented. -- ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
hydroflask added the comment: Nevermind it seems that it's legal to call Py_BEGIN_ALLOW_THREADS in tp_dealloc. The fix is then to allow threads around sqlite3_close(). -- ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42698] Deadlock in pysqlite_connection_dealloc()
hydroflask added the comment: Another comment: if calling sqlite3_close() outside of GIL, then the associated SQL function destructors must take the GIL before calling Py_DECREF -- ___ Python tracker <https://bugs.python.org/issue42698> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com