STINNER Victor added the comment:

Changes in my current implementation, ad4a53ed1fbf.diff.

The good thing is that all changes are internal (really?). Even if you don't 
modify your C extensions (nor your Python code), you should benefit from the 
new fast call in *a lot* of cases.

IMHO the trickiest part is the changes to PyTypeObject. Is it ok to add a 
new tp_fastcall slot? Should we add even more slots using the fast call 
convention like tp_fastnew and tp_fastinit? How should we handle the 
inheritance of types with that?


(*) Add 2 new public functions:

PyObject* PyObject_CallNoArg(PyObject *func);
PyObject* PyObject_CallArg1(PyObject *func, PyObject *arg);


(*) Add 1 new private function:

PyObject* _PyObject_FastCall(PyObject *func, PyObject **stack, int na, int nk);

_PyObject_FastCall() is the root of the new feature.
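
To make the (stack, na, nk) signature concrete, here is a minimal 
self-contained sketch. It is not code from the patch. It assumes the stack 
holds na positional arguments followed by nk (name, value) pairs, as the 
bytecode evaluation loop lays out call arguments; toy_obj, toy_fastfunc, 
sum_args and toy_fastcall are hypothetical names, and a small struct stands 
in for PyObject*.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for PyObject*: hypothetical, for illustration only. */
typedef struct {
    const char *name;   /* set when the slot holds a keyword name */
    long value;         /* set when the slot holds a value */
} toy_obj;

/* Hypothetical callee signature following the (stack, na, nk) convention. */
typedef long (*toy_fastfunc)(toy_obj **stack, int na, int nk);

/* Assumed layout: stack[0..na-1] are positional arguments, followed by
   nk (name, value) pairs, so na + 2*nk slots in total. */
static long sum_args(toy_obj **stack, int na, int nk)
{
    long total = 0;
    for (int i = 0; i < na; i++)
        total += stack[i]->value;
    for (int i = 0; i < nk; i++)
        total += stack[na + 2 * i + 1]->value;   /* value of each pair */
    return total;
}

/* Toy analogue of _PyObject_FastCall(): no tuple or dict is built, the
   callee reads its arguments directly from the caller's array. */
static long toy_fastcall(toy_fastfunc func, toy_obj **stack, int na, int nk)
{
    assert(func != NULL && na >= 0 && nk >= 0);
    assert(stack != NULL || na + nk == 0);
    return func(stack, na, nk);
}
```

The point of the convention is that the caller already has the arguments in 
a contiguous array, so handing over a pointer and two counts costs nothing.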


(*) type: add a new "tp_fastcall" field to the PyTypeObject structure.

It's unclear to me how inheritance is handled here. Maybe it's simply broken, 
but it's strange because it looks like it works :-) Maybe it's very rare that 
tp_call is overridden in a child class?

TODO: maybe reuse the "tp_call" field? (risk of major backward 
incompatibility...)
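
The inheritance concern can be shown with a toy model. This is a sketch 
under assumptions, not the patch's logic: toy_type and every function below 
are hypothetical, and the "paired" rule is only one possible answer to the 
question. If a child overrides tp_call but blindly inherits the parent's 
tp_fastcall, a dispatcher that prefers tp_fastcall silently ignores the 
override.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*toy_call)(void);

/* Hypothetical miniature of a type object with both call slots. */
typedef struct toy_type {
    const struct toy_type *base;
    toy_call tp_call;       /* classic slot */
    toy_call tp_fastcall;   /* new slot */
} toy_type;

static int parent_call(void)     { return 1; }
static int parent_fastcall(void) { return 1; }
static int child_call(void)      { return 2; }

/* Naive inheritance: copy every slot the child left empty. */
static void toy_inherit_naive(toy_type *t)
{
    if (t->base == NULL)
        return;
    if (t->tp_call == NULL)
        t->tp_call = t->base->tp_call;
    if (t->tp_fastcall == NULL)
        t->tp_fastcall = t->base->tp_fastcall;
}

/* One possible rule (an assumption): inherit tp_fastcall only when
   tp_call is inherited as well, so the two slots never disagree. */
static void toy_inherit_paired(toy_type *t)
{
    if (t->base == NULL)
        return;
    if (t->tp_call == NULL) {
        t->tp_call = t->base->tp_call;
        if (t->tp_fastcall == NULL)
            t->tp_fastcall = t->base->tp_fastcall;
    }
}

/* Dispatcher that prefers tp_fastcall over tp_call. */
static int toy_dispatch(const toy_type *t)
{
    assert(t != NULL);
    if (t->tp_fastcall != NULL)
        return t->tp_fastcall();
    assert(t->tp_call != NULL);
    return t->tp_call();
}
```

With the naive rule, a child that sets only tp_call dispatches to the 
parent's implementation; with the paired rule it reaches its own tp_call.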


(*) slots: add a new "fastwrapper" field to the wrapperbase structure. Add a 
fast wrapper to all slots (really all? I should check).

I don't think that consumers of the C API are affected by this change, or 
maybe only a few projects.

TODO: maybe remove "fastwrapper" and reuse the "wrapper" field? (low risk of 
backward incompatibility?)


(*) Implement fast call for Python functions (_PyFunction_FastCall) and C 
functions (PyCFunction_FastCall)


(*) Add a new METH_FASTCALL calling convention for C functions. Right now, it 
is used for 4 builtin functions: sorted(), getattr(), iter(), next().

Argument Clinic should be modified to emit C code using this new fast calling 
convention.
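
As a rough illustration of what a calling-convention flag buys, the sketch 
below dispatches on a flag stored next to the function pointer. Everything 
here is hypothetical (TOY_VARARGS, TOY_FASTCALL, toy_methoddef, toy_tuple); 
it only models the idea that a function using the fast convention receives 
the caller's array directly, while a classic function needs a temporary 
container built first.

```c
#include <assert.h>
#include <stdlib.h>

enum { TOY_VARARGS = 1, TOY_FASTCALL = 2 };   /* hypothetical flags */

/* Heap-allocated stand-in for the temporary argument tuple. */
typedef struct { int size; long *items; } toy_tuple;

typedef long (*toy_varargs_func)(const toy_tuple *args);
typedef long (*toy_fast_func)(const long *args, int nargs);

typedef struct {
    int flags;
    toy_varargs_func varargs;   /* used when flags has TOY_VARARGS */
    toy_fast_func fast;         /* used when flags has TOY_FASTCALL */
} toy_methoddef;

static int toy_tuples_built = 0;   /* counts temporary tuples, demo only */

static long sum_varargs(const toy_tuple *args)
{
    long total = 0;
    for (int i = 0; i < args->size; i++)
        total += args->items[i];
    return total;
}

static long sum_fast(const long *args, int nargs)
{
    long total = 0;
    for (int i = 0; i < nargs; i++)
        total += args[i];
    return total;
}

/* Returns -1 on allocation failure (a toy error signal only). */
static long toy_call_method(const toy_methoddef *def,
                            const long *args, int nargs)
{
    assert(def != NULL && nargs >= 0);
    if (def->flags & TOY_FASTCALL)
        return def->fast(args, nargs);   /* no allocation at all */

    /* Classic path: pack the arguments into a temporary tuple first. */
    toy_tuple tmp;
    tmp.size = nargs;
    tmp.items = malloc(sizeof(long) * (size_t)(nargs > 0 ? nargs : 1));
    if (tmp.items == NULL)
        return -1;
    for (int i = 0; i < nargs; i++)
        tmp.items[i] = args[i];
    toy_tuples_built++;
    long result = def->varargs(&tmp);
    free(tmp.items);
    return result;
}
```

Both paths compute the same result; only the classic one pays for an 
allocation and a copy on every call.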


(*) Implement fast call in the following functions (types):

- method()
- method_descriptor()
- wrapper_descriptor()
- method_wrapper()
- operator.itemgetter => used by collections.namedtuple to get an item by its 
name


(*) Modify PyObject_Call*() functions to reuse internally the fast call. 
"tp_fastcall" is preferred over "tp_call" (FIXME: is it really useful to do 
that?).

The following functions are able to avoid temporary tuple/dict without having 
to modify the code calling them:

- PyObject_CallFunction()
- PyObject_CallMethod(), _PyObject_CallMethodId()
- PyObject_CallFunctionObjArgs(), PyObject_CallMethodObjArgs()

It's not required to modify code using these functions to use the 3 new shiny 
functions (PyObject_CallNoArg, PyObject_CallArg1, _PyObject_FastCall). For 
example, replacing PyObject_CallFunctionObjArgs(func, NULL) with 
PyObject_CallNoArg(func) is just a micro-optimization: the tuple is already 
avoided. But PyObject_CallNoArg() should use less C stack memory and be 
a "little bit" faster.
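
A toy model of why the variadic helpers can skip the tuple, and why a 
dedicated no-argument entry point is still slightly cheaper. This is a 
sketch, not the patch: toy_call_objargs, toy_call_noarg and TOY_SMALL_STACK 
are hypothetical, and pointers to long stand in for PyObject*. The variadic 
version copies its NULL-terminated arguments into a small array on the C 
stack before the fast call; the no-argument version needs no array at all.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

#define TOY_SMALL_STACK 5   /* hypothetical size of the on-stack buffer */

typedef long (*toy_fast_func)(const long **stack, int na);

static long sum_fast(const long **stack, int na)
{
    long total = 0;
    for (int i = 0; i < na; i++)
        total += *stack[i];
    return total;
}

/* Toy analogue of PyObject_CallFunctionObjArgs(func, ..., NULL):
   arguments go into a C array instead of a temporary tuple.
   Returns -1 if more than TOY_SMALL_STACK arguments are passed; a real
   implementation would presumably fall back to a heap buffer. */
static long toy_call_objargs(toy_fast_func func, ...)
{
    const long *stack[TOY_SMALL_STACK];
    int na = 0;
    va_list vargs;

    assert(func != NULL);
    va_start(vargs, func);
    for (;;) {
        const long *arg = va_arg(vargs, const long *);
        if (arg == NULL)
            break;                  /* NULL sentinel ends the list */
        if (na == TOY_SMALL_STACK) {
            va_end(vargs);
            return -1;
        }
        stack[na++] = arg;
    }
    va_end(vargs);
    return func(stack, na);
}

/* Toy analogue of PyObject_CallNoArg(func): no va_list, no buffer. */
static long toy_call_noarg(toy_fast_func func)
{
    assert(func != NULL);
    return func(NULL, 0);
}
```

Calling toy_call_objargs(func, (const long *)NULL) and toy_call_noarg(func) 
gives the same result; the second simply reserves less stack and does less 
work.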


(*) Add new helpers: new Include/pystack.h file, Py_VaBuildStack(), etc.


Please ignore unrelated changes.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26814>
_______________________________________