[Bug c++/118330] New: Improving calling convention for 32-bit ARM target

david at westcontrol dot com via Gcc-bugs Tue, 07 Jan 2025 07:09:11 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118330


            Bug ID: 118330
           Summary: Improving calling convention for 32-bit ARM target
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: david at westcontrol dot com
  Target Milestone: ---

My main focus here is for small-systems embedded 32-bit ARM targets (Cortex-M
devices), but the points here could apply to many targets.

The calling conventions used for C are often based on common usage and compiler
limitations from decades ago - from the days when C functions were regularly
used without declarations, and return types were usually "int", or occasionally
a pointer or a floating point type.

Modern programming languages - C++, D, Rust, etc. - and modern usage are very
different.  Struct return types are increasingly common.  (They are used in C
too, such as for the standard library functions "div" and "clock".)  With
C++23, we can expect to see steadily more usage of std::tuple<>,
std::optional<>, std::variant<> and std::expected<>, all of which should be
fine for targets like this.  But old-fashioned and limited calling conventions
cripple their efficiency.

The key problem is that only one register - r0 - is used for aggregate returns.
The pair r0:r1 is used for 64-bit scalar returns (long long int, and doubles
if there are no floating point registers), but otherwise structs bigger than
32 bits are returned by the caller allocating space for the returned struct on
the stack and passing a pointer to that in r0.  This is a very significant
overhead for smaller functions passing around small structs, such as the
aforementioned C++ types or even just strong types wrapping 64-bit integers in
a class.
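
As a rough illustration of the overhead (the names Ticks, add_raw and
add_ticks are made up for this sketch), compare a plain 64-bit integer return
with the same value wrapped in a trivial struct.  Under the current 32-bit ARM
procedure call standard the first comes back in r0:r1, while the second is
written to caller-allocated memory through a hidden pointer passed in r0, even
though both carry exactly 64 bits:

```cpp
#include <cstdint>

// Hypothetical strong type wrapping a 64-bit integer.
struct Ticks {
    std::int64_t value;
};

// Scalar 64-bit return: uses the register pair r0:r1 on 32-bit ARM.
std::int64_t add_raw(std::int64_t a, std::int64_t b) {
    return a + b;
}

// Aggregate larger than 32 bits: the caller reserves stack space and
// passes its address in r0; the result is stored there, not in registers.
Ticks add_ticks(Ticks a, Ticks b) {
    return Ticks{a.value + b.value};
}

// The wrapper adds no size, only a different return path.
static_assert(sizeof(Ticks) == sizeof(std::int64_t),
              "wrapper should be the same size as the raw integer");
```

Compiling both functions for a Cortex-M target with optimisation enabled and
comparing the generated assembly should show the extra store and pointer
handling in the second one.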

Ideally, this is something that ARM should deal with - making a calling
convention that suits modern usage of modern cores.  However, I don't think
this is the only gcc target that could be improved in this way.  gcc already
has a variety of ad-hoc target-specific methods of picking calling conventions
for different targets, with compiler flags and a mix of function attributes. 
Perhaps, in the spirit of <https://xkcd.com/927/>, it would be better to make a
common choice that could be used on multiple targets?

As a suggestion, "abiopt" could be the attribute name to indicate that this is
an option adding to the normal abi, rather than replacing it (so that things
like the use of floating point registers for parameters or returns are not
affected).  Then "reg-aggregate-return" could say that the same set of
registers (gprs, floating point, vector, etc.) are used for the returning
aggregate as would be used for a set of parameters of the same types as the
aggregate members.
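
To sketch what "the same registers as for parameters" would mean in practice
(QuotRem, divide and recombine are hypothetical names, and the register
assignments in the comments describe the suggested behaviour, not anything gcc
does today):

```cpp
// Hypothetical aggregate with two 32-bit integer members.
struct QuotRem {
    int quot;
    int rem;
};

// Today: the result is returned in memory via a hidden pointer in r0.
// With the suggested reg-aggregate-return option: quot would come back
// in r0 and rem in r1, mirroring how two int parameters are passed.
QuotRem divide(int num, int den) {
    return QuotRem{num / den, num % den};
}

// The parameter analogue: quot, rem and den are already passed in
// r0, r1 and r2 under the existing calling convention.
int recombine(int quot, int rem, int den) {
    return quot * den + rem;
}
```

With a hard-float ABI the same rule would put a float member in a floating
point register, just as a float parameter would be.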

This could then be used with:

1. Compiler flag -mabiopt=reg-aggregate-return

2. #pragma GCC abiopt reg-aggregate-return

3. Function __attribute__((abiopt(reg-aggregate-return)))

4. Type __attribute__((abiopt(reg-aggregate-return)))

The type attribute would mean that when that type is used as a return type for
a function, the function would gain the corresponding abi option.  (If other
abi options are added in the future that apply to parameter passing, then use
of the type attribute here would mean that the type is always passed in this
way.)  I am not sure if this type attribute is practical or not.
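
A sketch of where the function and type forms might appear in source code.
None of this exists in any compiler: the macro ABIOPT_REG_AGGREGATE_RETURN is
a placeholder that expands to nothing so the example still builds, and the
types and functions are invented for illustration:

```cpp
// Placeholder for the suggested attribute.  It expands to nothing here;
// if the suggestion were implemented it might expand to something like
// __attribute__((abiopt(reg-aggregate-return))).
#define ABIOPT_REG_AGGREGATE_RETURN

// Form 4 (type attribute): any function returning Point would gain
// the register-return option automatically.
struct ABIOPT_REG_AGGREGATE_RETURN Point {
    short x;
    short y;
    int z;
};

// Ordinary aggregate with no type-level marking.
struct Range {
    int lo;
    int hi;
};

// Form 3 (function attribute): only this function opts in.
ABIOPT_REG_AGGREGATE_RETURN Range make_range(int a, int b) {
    return a < b ? Range{a, b} : Range{b, a};
}

// No marking on the function; the option would come from the type.
Point origin() {
    return Point{0, 0, 0};
}
```

The compiler flag and pragma forms would apply the same option to a whole
translation unit or region instead of to individual declarations.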

Of course this kind of calling convention change can conflict with code
compiled with different options - the user has to be careful here.
