New submission from STINNER Victor:

Since issue #26814 proved that avoiding the creation of temporary tuples to call Python and C functions makes Python faster (between 2% and 29% depending on the benchmark), I extracted a first "minimal" patch to start merging this work.
The first patch adds new functions:

* PyObject_CallNoArg(func) and PyObject_CallArg1(func, arg): public functions
* _PyObject_FastCall(func, args, nargs, kwargs): private function

I hesitate between the C types "int" and "Py_ssize_t" for nargs. I read once that using "int" can cause performance issues in a loop using "i++" and "data[i]" because the compiler has to handle integer overflow of the int type. The "int" type is also annoying on 64-bit Windows: it causes compiler warnings on downcasts, like PyTuple_GET_SIZE(co->co_argcount) stored into a C int.

_PyObject_FastCall() avoids the creation of a tuple for:

* All Python functions (PyFunction_Check)
* C functions using METH_NOARGS or METH_O

The patch removes the "cache tuple" optimization from property_descr_get(); it uses PyObject_CallArg1() instead. This means that the optimization is (currently) missed in some cases compared to the current code, but the code is safer and simpler.

The patch adds Python/pystack.c, which currently only contains _PyStack_AsTuple() but will contain more code later.

I tried to write the smallest possible patch, but I started to use PyObject_CallNoArg() and PyObject_CallArg1() where the code already created a tuple at each call: PyObject_CallObject(), call_function_tail() and PyEval_CallObjectWithKeywords().

In the patch, keywords are not used in fast calls, but they will be used later. I prefer to start directly with keywords rather than change the calling convention once again later.

--

Later, I will propose other patches to:

* add a METH_FASTCALL calling convention for C functions
* modify Argument Clinic to use METH_FASTCALL

So the fast call will be taken in more cases.

--

The long-term plan is to slowly use the new FASTCALL calling convention "everywhere". The tricky points are the tp_new, tp_init and tp_call attributes of type objects.
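Editorial illustration (not part of fastcall.patch): the difference between the two calling conventions described above can be sketched with a toy model in plain C. Nothing here is CPython code. The names toy_tuple, call_with_tuple(), call_fast(), call_no_arg() and call_arg1() are hypothetical stand-ins, ptrdiff_t stands in for Py_ssize_t, and plain long values stand in for Python objects. The sketch assumes only what the message states: the old path builds a temporary tuple for each call, while the fast path passes a C array and an argument count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: these are NOT the CPython types or functions. */
typedef ptrdiff_t ssize;              /* stands in for Py_ssize_t */

typedef struct {                      /* stands in for a tuple object */
    ssize size;
    long *items;
} toy_tuple;

int tuple_allocs = 0;                 /* number of temporary tuples built */

/* Callee signature in the tuple-based convention. */
typedef long (*tuple_func)(const toy_tuple *args);
/* Callee signature in the array-based ("fast") convention. */
typedef long (*fast_func)(const long *args, ssize nargs);

/* Example callee: sums its arguments (fast convention). */
long sum_fast(const long *args, ssize nargs)
{
    long total = 0;
    for (ssize i = 0; i < nargs; i++)
        total += args[i];
    return total;
}

/* Same callee exposed through the tuple convention. */
long sum_tuple(const toy_tuple *args)
{
    return sum_fast(args->items, args->size);
}

/* Old style: copy the arguments into a temporary tuple, call, free it. */
long call_with_tuple(tuple_func func, const long *args, ssize nargs)
{
    toy_tuple t;
    long result;

    t.size = nargs;
    t.items = malloc((nargs > 0 ? (size_t)nargs : 1) * sizeof(long));
    if (t.items == NULL)
        abort();
    tuple_allocs++;
    if (nargs > 0)
        memcpy(t.items, args, (size_t)nargs * sizeof(long));
    result = func(&t);
    free(t.items);
    return result;
}

/* Fast style: hand the caller's array straight to the callee. */
long call_fast(fast_func func, const long *args, ssize nargs)
{
    return func(args, nargs);
}

/* Convenience wrappers for zero and one argument. */
long call_no_arg(fast_func func)
{
    return call_fast(func, NULL, 0);
}

long call_arg1(fast_func func, long arg)
{
    return call_fast(func, &arg, 1);
}

/* Both conventions give the same result; only the first one allocates. */
void demo(void)
{
    long args[] = {1, 2, 3};

    assert(call_with_tuple(sum_tuple, args, 3) == 6);
    assert(call_fast(sum_fast, args, 3) == 6);
    assert(call_arg1(sum_fast, 7) == 7);
    assert(call_no_arg(sum_fast) == 0);
}
```

In this model the one-argument wrapper passes the address of a local variable, so no heap allocation happens on the fast path. That is the kind of saving the message attributes to skipping the temporary tuple.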
In the issue #26814, I wrote a patch adding Py_TPFLAGS_FASTNEW, Py_TPFLAGS_FASTINIT and Py_TPFLAGS_FASTCALL flags to use the FASTCALL calling convention for tp_new, tp_init and tp_call. The problem is that calling these methods directly looks common. If we change the calling convention of these methods, it will break the C API. I propose to discuss that later ;-)

An alternative is to add a tp_fastcall method to PyTypeObject and use a wrapper for tp_call for backward compatibility. This option also has drawbacks. Again, I propose to discuss this later, and to focus first on the changes that don't break anything ;-)

----------
files: fastcall.patch
keywords: patch
messages: 266422
nosy: haypo, serhiy.storchaka, yselivanov
priority: normal
severity: normal
status: open
title: Add _PyObject_FastCall()
type: performance
versions: Python 3.6
Added file: http://bugs.python.org/file43011/fastcall.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27128>
_______________________________________