On Tue, Apr 29, 2025 at 3:53 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Tue, Apr 29, 2025 at 9:34 PM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Tue, Apr 29, 2025 at 2:33 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Tue, Apr 29, 2025 at 6:46 PM Richard Biener
> > > <richard.guent...@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 29, 2025 at 12:32 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > >
> > > > > On Tue, Apr 29, 2025 at 5:56 PM Richard Biener
> > > > > <richard.guent...@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Apr 29, 2025 at 10:48 AM H.J. Lu <hjl.to...@gmail.com> 
> > > > > > wrote:
> > > > > > >
> > > > > > > On Tue, Apr 29, 2025 at 4:25 PM Richard Biener
> > > > > > > <richard.guent...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Apr 29, 2025 at 9:39 AM H.J. Lu <hjl.to...@gmail.com> 
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > For targets, like x86, which define TARGET_PROMOTE_PROTOTYPES to return
> > > > > > > > > > true, all integer arguments smaller than int are passed as int:
> > > > > > > > >
> > > > > > > > > [hjl@gnu-tgl-3 pr14907]$ cat x.c
> > > > > > > > > extern int baz (char c1);
> > > > > > > > >
> > > > > > > > > int
> > > > > > > > > foo (char c1)
> > > > > > > > > {
> > > > > > > > >   return baz (c1);
> > > > > > > > > }
> > > > > > > > > [hjl@gnu-tgl-3 pr14907]$ gcc -S -O2 -m32 x.c
> > > > > > > > > [hjl@gnu-tgl-3 pr14907]$ cat x.s
> > > > > > > > > .file "x.c"
> > > > > > > > > .text
> > > > > > > > > .p2align 4
> > > > > > > > > .globl foo
> > > > > > > > > .type foo, @function
> > > > > > > > > foo:
> > > > > > > > > .LFB0:
> > > > > > > > > .cfi_startproc
> > > > > > > > > movsbl 4(%esp), %eax
> > > > > > > > > movl %eax, 4(%esp)
> > > > > > > > > jmp baz
> > > > > > > > > .cfi_endproc
> > > > > > > > > .LFE0:
> > > > > > > > > .size foo, .-foo
> > > > > > > > > .ident "GCC: (GNU) 14.2.1 20240912 (Red Hat 14.2.1-3)"
> > > > > > > > > .section .note.GNU-stack,"",@progbits
> > > > > > > > > [hjl@gnu-tgl-3 pr14907]$
> > > > > > > > >
> > > > > > > > > But integer promotion:
> > > > > > > > >
> > > > > > > > > movsbl 4(%esp), %eax
> > > > > > > > > movl %eax, 4(%esp)
> > > > > > > > >
> > > > > > > > > > isn't necessary if incoming arguments are copied to outgoing arguments
> > > > > > > > > > directly.
> > > > > > > > >
> > > > > > > > > > Add a new target hook, TARGET_GET_SMALL_INTEGER_ARGUMENT_VALUE, defaulting
> > > > > > > > > > to return nullptr.  If the new target hook returns non-nullptr, use it to
> > > > > > > > > > get the outgoing small integer argument.  The x86 target hook returns the
> > > > > > > > > > value of the corresponding incoming argument as int if it can be used as
> > > > > > > > > > the outgoing argument.  If callee is a global function, we always properly
> > > > > > > > > > extend the incoming small integer arguments in callee.  If callee is a
> > > > > > > > > > local function, since DECL_ARG_TYPE has the original small integer type,
> > > > > > > > > > we will extend the incoming small integer arguments in callee if needed.
> > > > > > > > > > It is safe only if
> > > > > > > > >
> > > > > > > > > 1. Caller and callee are not nested functions.
> > > > > > > > > 2. Caller and callee use the same ABI.
> > > > > > > >
> > > > > > > > How do these influence the value?  TARGET_PROMOTE_PROTOTYPES
> > > > > > > > should apply to all of them, no?
> > > > > > >
> > > > > > > > When the arguments are passed in different registers in different ABIs,
> > > > > > > > we have to copy them anyway.
> > > > > >
> > > > > > > But optimization can easily elide copies, whereas it cannot easily elide
> > > > > > > sign-/zero-extensions.
> > > > >
> > > > > What I meant was that caller and callee have different ABIs.
> > > > > Optimizer can't elide copies since incoming arguments and outgoing
> > > > > arguments are in different registers.  They have to be moved.
> > > > >
> > > > > > > >
> > > > > > > > > > 3. The incoming argument and the outgoing argument are in the same
> > > > > > > > > > location.
> > > > > > > >
> > > > > > > > > Why's that?  Can't we move them but still elide the sign-/zero-extension?
> > > > > > >
> > > > > > > If they aren't in the same locations, we have to move them anyway.
> > > > > > > > This patch tries to avoid unnecessary moves of incoming arguments to
> > > > > > > outgoing arguments.
> > > > > >
> > > > > > > That's not exactly how you presented it, but you conveniently used
> > > > > > > x86 stack argument passing.  That might be difficult to elide, but is
> > > > > > > also uncommon for "small integer types" - does the same issue not
> > > > > > > apply to other arguments passed on the stack as well?
> > > > >
> > > > > > It applies to both passing in registers and on stack.  It is an issue only
> > > > > > for small integer types due to sign-/zero-extensions at call sites.  My
> > > > > > patch elides sign-/zero-extensions when incoming arguments and outgoing
> > > > > > arguments are unchanged in exactly the same location, in register or
> > > > > > on stack.
> > > >
> > > > > Is it possible to dissect this from TARGET_PROMOTE_PROTOTYPES then?
> > > > > That is, this should also work for the case where prototypes are not
> > > > > promoted and for modes larger than SImode, even BLKmode.
> > > >
> > > > Richard.
> > >
> > > > Arguments which don't need promotion, including large arguments, are already
> > > > working today.  The only issue is sign-/zero-extension of small outgoing
> > > > integer arguments on x86.  My patch removes unnecessary sign-/zero-extensions.
> > > > See:
> >
> > > So we're back to square one ... why restrict this sign-/zero-extension
> > > elimination to the case where you can also elide the copy?
>
> There is no copy in other cases.  The only copy case is sign-/zero-extension.
> There is no copy to eliminate except for sign-/zero-extension.

So why not eliminate the sign-/zero-extension to a copy when the location is
not equivalent?

It seems to me there's existing code (w/o a target hook) that can eliminate
useless copies but that fails when an extension is required.
TARGET_PROMOTE_PROTOTYPES says incoming args are promoted for GCC controlled
calls.  I fail to see where "location" or "ABI" or anything else should be
relevant here.  It seems there's just a tiny bit of information missing in the
generic mechanism and thus I fail to see why a full blown target hook is
required just for the special case of the copy case of sign-/zero-extensions
that are not necessary.
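
To make the "location is not equivalent" case concrete, here is a minimal
sketch.  It is a hypothetical test case, not taken from the patch or from
PR 14907; the extra `pad` parameter and the body of `baz` are assumptions
added only so the example is self-contained.  The incoming c1 sits in the
first argument slot of foo and must end up in the second argument slot of
baz, so a move is unavoidable.  If the incoming value can be assumed to be
already promoted to int, that move could in principle be a plain copy
rather than a fresh sign-/zero-extension.

```c
/* Hypothetical test case; baz, foo and pad are illustrative names only.
   noinline asks the compiler not to inline baz, so the call and its
   argument setup stay visible in "gcc -S -O2 -m32" output.  */
__attribute__ ((noinline)) int
baz (int pad, char c1)
{
  return pad + c1;
}

int
foo (char c1)
{
  /* c1 arrives in the first argument slot and leaves in the second,
     so the incoming and outgoing locations differ.  */
  return baz (0, c1);
}
```

Comparing the code generated for this with the code for the original x.c
should show whether the extension at the call site is re-done or reduced
to a copy.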

Richard.

>
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14907
> > >
> > > for more discussions.
> > >
> > > --
> > > H.J.
>
>
>
> --
> H.J.
