https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Consider e.g.
void bar (int *);
int
foo (int *a, int *b, int *c, int *d)
{
  for (int i = 0; i < 1024; i++)
    a[i] = a[i] * b[i] + (c[i] - d[i]);
  bar (a);
  return 42;
}
with -m32 -O3 -mavx -mstackrealign.
It needs to dynamically realign the stack because the user asked for it (so
that it can use 256-bit aligned stack slots, as callers don't guarantee that
alignment), so it needs a frame pointer (%ebp), a stack pointer (%esp) and a
DRAP (%ecx in this case).  That is especially so if you also use e.g. VLAs or
alloca in the function.
%ebp based addressing is for the automatic vars in the function's stack frame;
the stack pointer can be at a variable offset from it and is used for outgoing
arguments to function calls, for push/pop and for alloca/VLAs; and the DRAP is
used to access function arguments, which aren't at a fixed offset from the
frame pointer either.
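
To make the three roles concrete, here is a small sketch of a function that
touches each kind of stack data.  The function name and its parameters are
made up for illustration, and the comments describe what one would expect with
-m32 and dynamic stack realignment, not what any particular GCC version is
guaranteed to emit.

```c
/* Hypothetical function; not taken from the bug report. */
int
stack_roles (int n, int last_arg)
{
  int len = n > 0 ? n : 1;

  /* Over-aligned automatic variable: it lives in the realigned frame,
     so it would be addressed relative to the frame pointer (%ebp).  */
  _Alignas (32) int slot[8] = { 0 };

  /* VLA: allocated by moving the stack pointer (%esp) by an amount
     known only at run time, so %esp ends up at a variable offset
     from %ebp.  */
  int vla[len];

  slot[0] = n;
  for (int i = 0; i < len; i++)
    vla[i] = i;

  /* Incoming stack argument: after realignment it is no longer at a
     fixed offset from %ebp, which is what the DRAP is for.  */
  return slot[0] + vla[len - 1] + last_arg;
}
```

Compiling something like this with -m32 -O2 -mstackrealign -S and looking at
the prologue is one way to see which registers end up in which role.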
Anyway, with the "b" etc. constraints (which are a good idea on x86, as it has
single register constraints for those registers, but which can't be used on
other arches that do not have such constraints) you just trigger a slightly
different path in the RA, but the problem remains roughly the same: you force
the use of 6 registers as inputs plus one memory address, while esp is the
stack pointer and ebp could be the frame pointer, and it is a question whether
you don't need another register for the address of the memory input.

A way to free one input would be to store 2 arguments into an array and use the
whole array as one memory input and only inside of the inline asm load it into
the right registers.
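
A minimal sketch of that idea follows.  It is not the asm from this bug: the
function name, the choice of %esi/%edi as "the right registers" and the
arithmetic are all made up for illustration.  Two values are packed into an
array, the array is passed as a single "m" input, and the asm itself loads
them into the registers it wants, which are listed as clobbers instead of
inputs.

```c
/* Hypothetical example: add four ints, passing the last two through
   one memory operand instead of two register inputs.  */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# define HAVE_X86_GNU_ASM 1
# if defined(__x86_64__)
#  define PACKED_LEA  "leaq %[p], %%rsi\n\t"   /* address is 64-bit */
#  define PACKED_BASE "%%rsi"
# else
#  define PACKED_LEA  "leal %[p], %%esi\n\t"   /* address is 32-bit */
#  define PACKED_BASE "%%esi"
# endif
#endif

int
sum4_packed (int a, int b, int c, int d)
{
#ifdef HAVE_X86_GNU_ASM
  int packed[2] = { c, d };   /* two former register inputs */
  int result = a;

  __asm__ (PACKED_LEA                            /* base = &packed[0] */
           "movl 4(" PACKED_BASE "), %%edi\n\t"  /* edi = packed[1] */
           "movl (" PACKED_BASE "), %%esi\n\t"   /* esi = packed[0] */
           "addl %[b], %[r]\n\t"
           "addl %%esi, %[r]\n\t"
           "addl %%edi, %[r]"
           : [r] "+&r" (result)
           : [b] "r" (b), [p] "m" (packed)       /* whole array, one input */
           : "esi", "edi", "cc");
  return result;
#else
  /* Portable fallback so the sketch still builds on other targets.  */
  return a + b + c + d;
#endif
}
```

With this shape the compiler only has to provide an address for packed, which
can be based on the frame or stack pointer, and the two loaded registers are
clobbers, not inputs.  For example, sum4_packed (1, 2, 3, 4) returns 10.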
