Marc-Andre Lemburg <m...@egenix.com> added the comment:

On 15.10.2021 11:43, STINNER Victor wrote:
> Again, I'm not aware of any performance issue caused by short static inline 
> functions like Py_TYPE() or the proposed PyFloat_AS_DOUBLE(). If there is a 
> problem, it should be addressed, since Python uses more and more static 
> inline functions.
> 
> static inline functions is a common feature of C language. I'm not sure where 
> your doubts of bad performance come from.

Inlining is completely under the control of the compiler in use.
Compilers are free not to inline functions marked for inlining, which
can result in significant slowdowns on platforms which are, e.g.,
restricted in RAM and thus emphasize small code size, or where the CPUs
have small caches or not enough registers (think micro-controllers).

The reason we have those macros is that we want developers to be able
to make a conscious decision: "please inline this code unconditionally
and regardless of platform or compiler". The developer will know better
what to do than the compiler.

If the developer wants to pass control over to the compiler s/he can use
the corresponding C function, which is usually available (and then, in many
cases, also provides error handling).
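
To make the three options concrete, here is a minimal sketch in plain C.
All names (demo_float, DEMO_FLOAT_AS_DOUBLE, and so on) are hypothetical
stand-ins and not the real CPython definitions; the struct layout is an
assumption made only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a float object; not CPython's real layout. */
typedef struct {
    int is_float;   /* type tag, checked only by the function variant */
    double value;
} demo_float;

/* Macro: always expanded at the call site, no checking of any kind. */
#define DEMO_FLOAT_AS_DOUBLE(op) (((demo_float *)(op))->value)

/* static inline: same operation, but the compiler decides whether the
   body is inlined or emitted as a real call. */
static inline double demo_float_as_double_unchecked(const demo_float *op)
{
    return op->value;
}

/* Regular function with error handling: returns -1.0 and sets *err
   to 1 on failure, otherwise sets *err to 0. */
double demo_float_as_double(const demo_float *op, int *err)
{
    if (op == NULL || !op->is_float) {
        *err = 1;
        return -1.0;
    }
    *err = 0;
    return op->value;
}
```

With the macro, the expansion at the call site is guaranteed; with the
other two forms the final decision rests with the compiler.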

> Using static inline functions has other advantages. It helps debugging and 
> profiling, since the function name can be retrieved by debuggers and 
> profilers when analysing the machine code. It also avoids macro pitfalls 
> (like abusing a macro to use it as an l-value ;-)).
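
For reference, the l-value pitfall mentioned in the quoted text can be
sketched as follows. The names (demo_object, DEMO_TYPE, demo_type,
demo_set_type) are hypothetical and chosen only for this example; they
are not the actual CPython definitions.

```c
/* Hypothetical object header; names are illustrative only. */
typedef struct {
    int type_id;
} demo_object;

/* Macro form expands to an l-value, so callers can assign through it,
   e.g. DEMO_TYPE(ob) = 7; */
#define DEMO_TYPE(ob) (((demo_object *)(ob))->type_id)

/* Function form returns a value; "demo_type(ob) = 7" does not compile. */
static inline int demo_type(const demo_object *ob)
{
    return ob->type_id;
}

/* An explicit setter makes the write visible in the API. */
static inline void demo_set_type(demo_object *ob, int type_id)
{
    ob->type_id = type_id;
}
```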

Perhaps, but then I never had to profile macro use in the past. Instead,
what I typically found was that using macros results in faster code when
used in inner loops, so profiling usually guided me to use macros instead
of functions.

That said, the macros you have inlined so far were all really trivial,
so a compiler will most likely always inline them (the number of machine
code instructions for the call would be more than needed for
the actual operation).

Perhaps we ought to have a threshold for making such decisions, e.g.
number of machine code instructions generated for the macro or so, to
not get into discussions every time :-)

A blanket "static inline is always better than a macro" is not good
enough as an argument, though.

Especially with PGO-driven (profile-guided) optimization, the compiler
could opt for a function call rather than inlining if it finds that the
code in question is not used much and it needs to save space so that
loops fit into CPU caches.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45476>
_______________________________________