On Tue, Apr 29, 2025 at 3:53 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Tue, Apr 29, 2025 at 9:34 PM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Tue, Apr 29, 2025 at 2:33 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Tue, Apr 29, 2025 at 6:46 PM Richard Biener
> > > <richard.guent...@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 29, 2025 at 12:32 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > >
> > > > > On Tue, Apr 29, 2025 at 5:56 PM Richard Biener
> > > > > <richard.guent...@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Apr 29, 2025 at 10:48 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Apr 29, 2025 at 4:25 PM Richard Biener
> > > > > > > <richard.guent...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Apr 29, 2025 at 9:39 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > For targets, like x86, which define TARGET_PROMOTE_PROTOTYPES to
> > > > > > > > > return true, all integer arguments smaller than int are passed as int:
> > > > > > > > >
> > > > > > > > > [hjl@gnu-tgl-3 pr14907]$ cat x.c
> > > > > > > > > extern int baz (char c1);
> > > > > > > > >
> > > > > > > > > int
> > > > > > > > > foo (char c1)
> > > > > > > > > {
> > > > > > > > >   return baz (c1);
> > > > > > > > > }
> > > > > > > > > [hjl@gnu-tgl-3 pr14907]$ gcc -S -O2 -m32 x.c
> > > > > > > > > [hjl@gnu-tgl-3 pr14907]$ cat x.s
> > > > > > > > >     .file   "x.c"
> > > > > > > > >     .text
> > > > > > > > >     .p2align 4
> > > > > > > > >     .globl  foo
> > > > > > > > >     .type   foo, @function
> > > > > > > > > foo:
> > > > > > > > > .LFB0:
> > > > > > > > >     .cfi_startproc
> > > > > > > > >     movsbl  4(%esp), %eax
> > > > > > > > >     movl    %eax, 4(%esp)
> > > > > > > > >     jmp     baz
> > > > > > > > >     .cfi_endproc
> > > > > > > > > .LFE0:
> > > > > > > > >     .size   foo, .-foo
> > > > > > > > >     .ident  "GCC: (GNU) 14.2.1 20240912 (Red Hat 14.2.1-3)"
> > > > > > > > >     .section .note.GNU-stack,"",@progbits
> > > > > > > > > [hjl@gnu-tgl-3 pr14907]$
> > > > > > > > >
> > > > > > > > > But integer promotion:
> > > > > > > > >
> > > > > > > > >     movsbl  4(%esp), %eax
> > > > > > > > >     movl    %eax, 4(%esp)
> > > > > > > > >
> > > > > > > > > isn't necessary if incoming arguments are copied to outgoing
> > > > > > > > > arguments directly.
> > > > > > > > >
> > > > > > > > > Add a new target hook, TARGET_GET_SMALL_INTEGER_ARGUMENT_VALUE,
> > > > > > > > > defaulting to return nullptr. If the new target hook returns
> > > > > > > > > non-nullptr, use it to get the outgoing small integer argument.
> > > > > > > > > The x86 target hook returns the value of the corresponding
> > > > > > > > > incoming argument as int if it can be used as the outgoing
> > > > > > > > > argument. If callee is a global function, we always properly
> > > > > > > > > extend the incoming small integer arguments in callee. If callee
> > > > > > > > > is a local function, since DECL_ARG_TYPE has the original small
> > > > > > > > > integer type, we will extend the incoming small integer arguments
> > > > > > > > > in callee if needed. It is safe only if
> > > > > > > > >
> > > > > > > > > 1. Caller and callee are not nested functions.
> > > > > > > > > 2. Caller and callee use the same ABI.
> > > > > > > >
> > > > > > > > How do these influence the value? TARGET_PROMOTE_PROTOTYPES
> > > > > > > > should apply to all of them, no?
> > > > > > >
> > > > > > > When the arguments are passed in different registers in different
> > > > > > > ABIs, we have to copy them anyway.
> > > > > >
> > > > > > But optimization can elide copies easily, but not easily elide
> > > > > > sign-/zero-extensions.
> > > > >
> > > > > What I meant was that caller and callee have different ABIs.
> > > > > Optimizer can't elide copies since incoming arguments and outgoing
> > > > > arguments are in different registers. They have to be moved.
> > > > >
> > > > > > > > >
> > > > > > > > > 3. The incoming argument and the outgoing argument are in the
> > > > > > > > > same location.
> > > > > > > >
> > > > > > > > Why's that? Can't we move them but still elide the
> > > > > > > > sign-/zero-extension?
> > > > > > >
> > > > > > > If they aren't in the same locations, we have to move them anyway.
> > > > > > > This patch tries to avoid unnecessary moves of incoming arguments to
> > > > > > > outgoing arguments.
> > > > > >
> > > > > > That's not exactly how you presented it, but you conveniently used
> > > > > > x86 stack argument passing. That might be difficult to elide, but is
> > > > > > also uncommon for "small integer types" - does the same issue not
> > > > > > apply to other arguments passed on the stack as well?
> > > > >
> > > > > It applies to both passing in registers and on stack. It is an
> > > > > issue only for small integer types due to sign-/zero-extensions at
> > > > > call sites. My patch elides sign-/zero-extensions when incoming
> > > > > arguments and outgoing arguments are unchanged in exactly the same
> > > > > location, in register or on stack.
> > > >
> > > > Is it possible to dissect this from TARGET_PROMOTE_PROTOTYPES then?
> > > > That is, this should also work for the case prototypes are not
> > > > promoted and for modes larger than SImode, even BLKmode.
> > > >
> > > > Richard.
> > >
> > > Arguments which don't need promotion, including large arguments, are
> > > already working today. The only issue is sign-/zero-extension of small
> > > outgoing integer arguments on x86. My patch removes unnecessary
> > > sign-/zero-extensions.
> > > See:
> >
> > So we're back to square one ... why restrict this sign-/zero-extension
> > elimination to the case where you can also elide the copy?
>
> There is no copy in other cases. The only copy case is sign-/zero-extension.
> There is no copy to eliminate except for sign-/zero-extension.

So why not reduce the sign-/zero-extension to a plain copy when the
location is not the same? It seems to me there's existing code (without
a target hook) that can eliminate useless copies but that fails when an
extension is required.

TARGET_PROMOTE_PROTOTYPES says incoming args are promoted for
GCC-controlled calls. I fail to see where "location" or "ABI" or
anything else should be relevant here. It seems there's just a tiny bit
of information missing in the generic mechanism, and thus I fail to see
why a full-blown target hook is required just for the special case of
unnecessary sign-/zero-extensions where the copy can be elided as well.

Richard.

>
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14907
> > >
> > > for more discussions.
> > >
> > > --
> > > H.J.
> >
>
> --
> H.J.
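
To make the argument above concrete, here is a minimal C sketch. It is
not GCC code: the helper names (promote_signed, reextend_signed,
count_extension_mismatches, and so on) are hypothetical and only model
an int-sized argument slot. The assumption being illustrated is that
once the caller has promoted a small integer argument, redoing the
extension is a no-op regardless of which location the value is copied
to, so a plain copy would be enough when forwarding it.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical model, not GCC internals: the caller widens a small
   integer argument into a full int-sized slot before the call.  */
int32_t promote_signed (signed char c) { return (int32_t) c; }
int32_t promote_unsigned (unsigned char c) { return (int32_t) c; }

/* The extension a forwarding function would redo before its own call:
   keep the low 8 bits of the slot and sign-extend them again.  */
int32_t
reextend_signed (int32_t slot)
{
  int32_t low = slot & 0xff;
  return (low ^ 0x80) - 0x80;
}

/* Same for an unsigned small integer: keep the low 8 bits only.  */
int32_t
reextend_unsigned (int32_t slot)
{
  return slot & 0xff;
}

/* Forward every possible char value through a plain copy into a
   different slot and count the cases where redoing the extension
   would change the value.  */
int
count_extension_mismatches (void)
{
  int mismatches = 0;

  for (int c = SCHAR_MIN; c <= SCHAR_MAX; c++)
    {
      int32_t incoming = promote_signed ((signed char) c);
      int32_t outgoing = incoming;   /* plain copy, new location */
      if (reextend_signed (outgoing) != outgoing)
        mismatches++;
    }

  for (int c = 0; c <= UCHAR_MAX; c++)
    {
      int32_t incoming = promote_unsigned ((unsigned char) c);
      int32_t outgoing = incoming;   /* plain copy, new location */
      if (reextend_unsigned (outgoing) != outgoing)
        mismatches++;
    }

  return mismatches;
}
```

Under this model count_extension_mismatches () is expected to return 0.
A slot that was never promoted, such as 0x1ff, does change under
reextend_signed, which is why the promotion guarantee on incoming
arguments is the piece of information that matters here, rather than
the location of the slot.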