STINNER Victor <vstin...@python.org> added the comment:

I decided to merge my PR to address the initial issue of 
https://bugs.python.org/issue45439: "[C API] Move usage of 
tp_vectorcall_offset from public headers to the internal C API".

In recent years, I added `tstate` parameters to internal C functions. The 
agreement was that only internal functions should use it and, indirectly, that 
this `tstate` parameter should stay hidden. I'm not sure exactly when, but 
`tstate` started to pop up in `Include/cpython/abstract.h` around the "call" 
functions. This PR fixes this issue.

About the impact on performance: well, it's really hard to draw a clear 
conclusion. Inlining, LTO and PGO give different results on runtime performance 
and stack memory usage.

IMO the fact that public C API functions are now regular functions should not 
prevent us from continuing to (micro) optimize Python. We can always add a 
variant to the internal C API with a slightly different API (e.g. adding a 
`tstate` parameter) or defined as a static inline function rather than a 
regular function.
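
To make the idea concrete, here is a minimal standalone sketch of that split. 
It is not CPython code: every `toy_*` name is a hypothetical stand-in invented 
for illustration, and the "thread state" is reduced to a single counter. It 
only shows the shape: a regular public function that hides `tstate`, wrapping a 
static inline internal variant that takes it explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT the real
 * CPython types or functions. */
typedef struct { int recursion_depth; } toy_tstate;
typedef int (*toy_func)(int);

static toy_tstate toy_current_tstate = { 0 };

/* Stand-in for looking up the current thread state. */
static toy_tstate *
toy_tstate_get(void)
{
    return &toy_current_tstate;
}

/* "Internal" variant: static inline, and the caller passes tstate
 * explicitly, so a caller that already has it skips the lookup. */
static inline int
toy_internal_call_one_arg(toy_tstate *tstate, toy_func func, int arg)
{
    assert(tstate != NULL);
    tstate->recursion_depth++;      /* bookkeeping that needs tstate */
    int result = func(arg);
    tstate->recursion_depth--;
    return result;
}

/* "Public" variant: a regular (non-inline) function whose signature
 * does not mention tstate at all. */
int
toy_public_call_one_arg(toy_func func, int arg)
{
    return toy_internal_call_one_arg(toy_tstate_get(), func, arg);
}

/* Example callee used to exercise both variants. */
int
toy_double(int x)
{
    return 2 * x;
}
```

In this sketch, external callers would only ever see 
`toy_public_call_one_arg()`, while internal code that already holds a thread 
state could call the inline variant directly.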

What remains unclear is whether PyObject_CallOneArg() (regular function call) 
is faster than _PyObject_CallOneArg() (static inline function, inlined). The 
performance may depend on whether it's called from the Python executable or 
from a dynamic library (PLT indirection, which may be avoided by 
`gcc -fno-semantic-interposition`).

Well, happy hacking and let's continue *continuous* benchmarking Python!

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45439>
_______________________________________