On Tue, Nov 5, 2024 at 10:09 AM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Tue, Nov 5, 2024 at 5:23 AM Jeff Law <jeffreya...@gmail.com> wrote:
> >
> >
> >
> > On 11/4/24 8:13 PM, H.J. Lu wrote:
> > > On Tue, Nov 5, 2024 at 10:57 AM Jeff Law <jeffreya...@gmail.com> wrote:
> > >>
> > >>
> > >>
> > >> On 11/4/24 7:52 PM, H.J. Lu wrote:
> > >>> On Tue, Nov 5, 2024 at 8:48 AM Jeff Law <jeffreya...@gmail.com> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 11/4/24 5:42 PM, H.J. Lu wrote:
> > >>>>> On Tue, Nov 5, 2024 at 8:07 AM Jeff Law <jeffreya...@gmail.com> wrote:
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On 11/1/24 4:32 PM, H.J. Lu wrote:
> > >>>>>>> For targets, like x86, which define TARGET_PROMOTE_PROTOTYPES to 
> > >>>>>>> return
> > >>>>>>> true, all integer arguments smaller than int are passed as int:
> > >>>>>>>
> > >>>>>>> [hjl@gnu-tgl-3 pr14907]$ cat x.c
> > >>>>>>> extern int baz (char c1);
> > >>>>>>>
> > >>>>>>> int
> > >>>>>>> foo (char c1)
> > >>>>>>> {
> > >>>>>>>       return baz (c1);
> > >>>>>>> }
> > >>>>>>> [hjl@gnu-tgl-3 pr14907]$ gcc -S -O2 -m32 x.c
> > >>>>>>> [hjl@gnu-tgl-3 pr14907]$ cat x.s
> > >>>>>>>          .file   "x.c"
> > >>>>>>>          .text
> > >>>>>>>          .p2align 4
> > >>>>>>>          .globl  foo
> > >>>>>>>          .type   foo, @function
> > >>>>>>> foo:
> > >>>>>>> .LFB0:
> > >>>>>>>          .cfi_startproc
> > >>>>>>>          movsbl  4(%esp), %eax
> > >>>>>>>          movl    %eax, 4(%esp)
> > >>>>>>>          jmp     baz
> > >>>>>>>          .cfi_endproc
> > >>>>>>> .LFE0:
> > >>>>>>>          .size   foo, .-foo
> > >>>>>>>          .ident  "GCC: (GNU) 14.2.1 20240912 (Red Hat 14.2.1-3)"
> > >>>>>>>          .section        .note.GNU-stack,"",@progbits
> > >>>>>>> [hjl@gnu-tgl-3 pr14907]$
> > >>>>>>>
> > >>>>>>> But integer promotion:
> > >>>>>>>
> > >>>>>>>          movsbl  4(%esp), %eax
> > >>>>>>>          movl    %eax, 4(%esp)
> > >>>>>>>
> > >>>>>>> isn't necessary if incoming arguments and outgoing arguments are the
> > >>>>>>> same.  Use unpromoted incoming integer arguments as outgoing 
> > >>>>>>> arguments
> > >>>>>>> if incoming integer arguments are the same as outgoing arguments to
> > >>>>>>> avoid unnecessary integer promotion.
> > >>>>>> Is there a particular reason x86 can't use the same mechanisms that
> > >>>>>
> > >>>>> Other targets define TARGET_PROMOTE_PROTOTYPES to return false
> > >>>>> to avoid this issue.   Changing x86 TARGET_PROMOTE_PROTOTYPES
> > >>>>> to return false will break LLVM which assumes that incoming char/short
> > >>>>> arguments on x86 are always extended to int.   The following targets
> > >>>> Then my suggestion would be to cover this in REE somehow.  We started
> > >>>> looking at that a couple years ago and set it aside.   But the basic
> > >>>> idea was to expose the ABI guarantees to REE, then let REE do its 
> > >>>> thing.
> > >>>>
> > >>>> Jeff
> > >>>>
> > >>>
> > >>> For
> > >>>
> > >>> extern int baz (char c1);
> > >>>
> > >>> int
> > >>> foo (char c1)
> > >>> {
> > >>>     return baz (c1);
> > >>> }
> > >>>
> > >>> on i386, we get these
> > >>>
> > >>> (insn 7 4 8 2 (set (reg:SI 0 ax [orig:102 c1 ] [102])
> > >>>           (sign_extend:SI (mem/c:QI (plus:SI (reg/f:SI 7 sp)
> > >>>                       (const_int 4 [0x4])) [0 c1+0 S1 A32]))) "x.c":6:10
> > >>> 190 {extendqisi2}
> > >>>        (expr_list:REG_EQUIV (mem:SI (reg/f:SI 16 argp) [0  S4 A32])
> > >>>           (nil)))
> > >>> (insn 8 7 9 2 (set (mem:SI (plus:SI (reg/f:SI 7 sp)
> > >>>                   (const_int 4 [0x4])) [0  S4 A32])
> > >>>           (reg:SI 0 ax [orig:102 c1 ] [102])) "x.c":6:10 95 
> > >>> {*movsi_internal}
> > >>>        (nil))
> > >>> (call_insn/j 9 8 10 2 (set (reg:SI 0 ax)
> > >>>           (call (mem:QI (symbol_ref:SI ("baz") [flags 0x41]
> > >>> <function_decl 0x7f27347aae00
> > >>> baz>) [0 baz S1 A8])
> > >>>               (const_int 4 [0x4]))) "x.c":6:10 1420 {*sibcall_value}
> > >>>        (expr_list:REG_CALL_DECL (symbol_ref:SI ("baz") [flags 0x41]
> > >>> <function_decl 0x7f273
> > >>> 47aae00 baz>)
> > >>>           (nil))
> > >>>       (expr_list:SI (use (mem:SI (reg/f:SI 16 argp) [0  S4 A32]))
> > >>>           (nil)))
> > >>>
> > >>> before REE.  How should REE work to elimate
> > >>>
> > >>> (insn 7 4 8 2 (set (reg:SI 0 ax [orig:102 c1 ] [102])
> > >>>           (sign_extend:SI (mem/c:QI (plus:SI (reg/f:SI 7 sp)
> > >>>                       (const_int 4 [0x4])) [0 c1+0 S1 A32]))) "x.c":6:10
> > >>> 190 {extendqisi2}
> > >>>        (expr_list:REG_EQUIV (mem:SI (reg/f:SI 16 argp) [0  S4 A32])
> > >>>           (nil)))
> > >>> (insn 8 7 9 2 (set (mem:SI (plus:SI (reg/f:SI 7 sp)
> > >>>                   (const_int 4 [0x4])) [0  S4 A32])
> > >>>           (reg:SI 0 ax [orig:102 c1 ] [102])) "x.c":6:10 95 
> > >>> {*movsi_internal}
> > >>>        (nil))
> > >> You'll have to write code to describe the ABI how the values in question
> > >> are already extended to REE.  It's not going to "just work", you'll have
> > >> to do some development.  I'm not inclined to ACK the expansion patch at
> > >> this time given we've got multiple ways to handle extension removal.
> > >>
> > >
> > > For char and short parameters, x86 ABI leaves the upper bits in
> > > 32 bit fields undefined.  If the outgoing arguments are the same
> > > as the incoming arguments, there is no need to extend outgoing
> > > arguments.   Also ABI info isn't available in REE.  I am not sure if
> > > REE is the best fit here.
> > The whole point is to make it available and utilize it.
>
> I think most of the complication is due to the promotion being performed
> on GIMPLE but GIMPLE not taking the incoming argument as promoted.
>
> On my wishlist is to defer "applying" promote_prototypes to call RTL
> expansion and keeping GIMPLE "type correct", aka _not_ promoting
> arguments in the C/C++/Ada frontends.  I'll note that not all frontends
> perform this promotion (sic!), and thus HJs pattern matching of the
> sibling call site is required for correctness even!
>
> I'll also note that there is DECL_ARG_TYPE on PARM_DECLs which
> should reflect the incoming promotion state - using the target hook in
> RTL expansion is "wrong" (see above - not applied consistently), but
> DECL_ARG_TYPE should tell you where it was applied and where not.
> With LTO and calling C from fortran this optimization might go wrong
> (fortran doesn't promote, C sets DECL_ARG_TYPE to promoted type).
>
> I'd say we should either
>
>  a) drop targetm.promote_prototypes (and hope nobody uses it to
>      implement an ABI requirement)
>  b) apply targetm.promote_prototypes during RTL expansion
>
> So the patch as-is isn't good I think (if you don't succeed with
> Fortran due to lack of small integer types you can also try D
> or Modula for the LTO wrong-code testcase).
>
> ISTR we do rely on DECL_ARG_TYPE in combine but wrongly so
> as said.

program test
    use iso_c_binding, only: c_short
    interface
      subroutine foo(a) bind(c)
        import c_short
        integer(kind=c_short), intent(in), value :: a
      end subroutine foo
    end interface
    integer(kind=c_short) a(5);
    call foo (a(3))
end

shows

;; foo (_7);

(insn 13 12 14 (set (reg:HI 102)
        (mem/c:HI (plus:DI (reg/f:DI 93 virtual-stack-vars)
                (const_int -6 [0xfffffffffffffffa])) [1 a[2]+0 S2
A16])) "t.f90":10:19 -1
     (nil))

(insn 14 13 15 (set (reg:HI 5 di)
        (reg:HI 102)) "t.f90":10:19 -1
     (nil))

(call_insn 15 14 0 (call (mem:QI (symbol_ref:DI ("foo") [flags 0x41]
<function_decl 0x7f7fcc430800 foo>) [0 foo S1 A8])
        (const_int 0 [0])) "t.f90":10:19 -1
     (expr_list:REG_CALL_DECL (symbol_ref:DI ("foo") [flags 0x41]
<function_decl 0x7f7fcc430800 foo>)
        (nil))
    (expr_list:HI (use (reg:HI 5 di))
        (nil)))

and with -Os generates

        movw    10(%rsp), %di
        call    foo

but a C TU would assume 'foo' receives promoted arguments.

Richard.



> Richard.
>
> > jeff

Reply via email to