https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Consider e.g.

void bar (int *);
int
foo (int *a, int *b, int *c, int *d)
{
  for (int i = 0; i < 1024; i++)
    a[i] = a[i] * b[i] + (c[i] - d[i]);
  bar (a);
  return 42;
}

with -m32 -O3 -mavx -mstackrealign.

It needs to dynamically realign the stack because the user asked for it (so that it can use 256-bit aligned stack slots, and callers don't guarantee that alignment). It therefore needs a frame pointer (%ebp), a stack pointer (%esp) and a DRAP (%ecx in this case), especially if you also use e.g. VLAs or alloca in the function. %ebp-based addressing is used for the automatic vars in the function's stack frame. The stack pointer can be at a variable offset from the frame pointer and is used for outgoing arguments to function calls, for push/pop, and for alloca/VLAs. The DRAP is used to access the function's arguments, which aren't at a fixed offset from the frame pointer either.

Anyway, with the "b" etc. constraints (which are a good idea to use on x86, since it has single-register constraints for those registers, but which can't be used on other arches that do not have such constraints) you just trigger a slightly different path in the RA, and the problem remains roughly the same: you force the use of 6 registers as inputs plus one memory address, %esp is the stack pointer, %ebp could be a frame pointer, and it is a question whether you need yet another register for the address of the memory input.

A way to free one input would be to store 2 arguments into an array, use the whole array as one memory input, and load them into the right registers only inside of the inline asm.
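
A minimal sketch of that last suggestion, not taken from the bug report: the helper name add_packed and the array name args are hypothetical, and the asm body is a trivial add rather than the reporter's code. It assumes GCC-style extended asm in the default AT&T syntax on x86 and falls back to plain C elsewhere. The two values reach the asm as a single "m" operand and are loaded from it inside the asm, so only one output register is requested from the RA instead of two input registers.

```c
/* Hypothetical helper: passes two ints to the asm as ONE memory operand.  */
static int
add_packed (int x, int y)
{
  int args[2] = { x, y };   /* both inputs packed into one array */
  int result;

#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
  /* %1 is the whole array; 4+%1 addresses its second element.
     The output is earlyclobber ("=&r") because it is written before
     the last read of %1, whose address may itself live in a register.  */
  __asm__ ("movl %1, %0\n\t"
           "addl 4+%1, %0"
           : "=&r" (result)
           : "m" (args));
#else
  /* Portable fallback so the sketch still builds on other targets.  */
  result = args[0] + args[1];
#endif
  return result;
}
```

In the real case the asm would load the array elements into whichever fixed registers it needs and list those registers as clobbers or outputs; whether that is enough to satisfy the RA with -mstackrealign depends on how many registers remain.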