Hi!

On 2024-06-27T18:49:17+0200, I wrote:
> On 2023-10-24T19:49:10+0100, Richard Sandiford <richard.sandif...@arm.com> 
> wrote:
>> This patch adds a combine pass that runs late in the pipeline.

[After sending, I realized I replied to a previous thread of this work.]

> I've been looking a bit through recent nvptx target code generation
> changes for GCC target libraries, and thought I'd also share here my
> findings for the "late-combine" changes in isolation, for nvptx target.
> 
> First the unexpected thing:

So much for "unexpected thing" -- next level of unexpected here...
Appreciated if anyone feels like helping me find my way through this, but
I totally understand if you've got other things to do.

> there are a few cases where we now see unused
> registers get declared, for example (random) in
> 'nvptx-none/newlib/libc/libm_a-s_modf.o:modf'

I first looked into a simpler case: newlib 'libc/locale/lnumeric.c'.

Here we get the following 'diff' for '*.s' for
'-fno-late-combine-instructions' vs. (default)
'-flate-combine-instructions':

     .visible .func (.param.u32 %value_out) __numeric_load_locale (.param.u64 %in_ar0, .param.u64 %in_ar1, .param.u64 %in_ar2, .param.u64 %in_ar3)
     {
            .reg.u32 %value;
            .reg.u64 %ar0;
            ld.param.u64 %ar0, [%in_ar0];
            .reg.u64 %ar1;
            ld.param.u64 %ar1, [%in_ar1];
            .reg.u64 %ar2;
            ld.param.u64 %ar2, [%in_ar2];
            .reg.u64 %ar3;
            ld.param.u64 %ar3, [%in_ar3];
    +       .reg.u32 %r22;
            .file 2 "../../../source-gcc/newlib/libc/locale/lnumeric.c"
            .loc 2 89 1
                    mov.u32 %value, 0;
            st.param.u32    [%value_out], %value;
            ret;
     }

Clearly, '%r22' is unused.  However, looking at the source code (manually
trimmed):

    int
    __numeric_load_locale (struct __locale_t *locale, const char *name ,
                           void *f_wctomb, const char *charset)
    {
      int ret;
      struct lc_numeric_T nm;
      char *bufp = NULL;
    
    #ifdef __CYGWIN__
      [...]
    #else
      /* TODO */
    #endif
      return ret;
    }

..., and adding '-Wall' (why isn't the top-level/newlib build system doing
that...):

    [...]
    ../../../source-gcc/newlib/libc/locale/lnumeric.c:88:10: warning: ‘ret’ is used uninitialized [-Wuninitialized]
       88 |   return ret;
          |          ^~~
    ../../../source-gcc/newlib/libc/locale/lnumeric.c:48:7: note: ‘ret’ was declared here
       48 |   int ret;
          |       ^~~

Uh.  Given nothing else is going on in that function, I suppose '%r22'
relates to the uninitialized 'ret' -- and given undefined behavior, GCC
of course is fine to emit an unused 'reg' in that case...
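
For reference, a reduced test case for this pattern might look as
follows.  This is only a sketch: the function names are made up and not
taken from newlib, and it has not been checked whether this reduction
reproduces the extra '.reg' declaration on nvptx.

```c
/* Hypothetical reduction of the '__numeric_load_locale' pattern.
   'reduced_uninit' and 'reduced_init' are illustrative names only.  */

/* Returns 'ret' without ever assigning it.  Reading it is undefined
   behavior; with '-Wall', GCC is expected to diagnose this via
   '-Wuninitialized'.  Do not call this function.  */
int
reduced_uninit (void)
{
  int ret;
  return ret;
}

/* Same shape, but with 'ret' initialized, for comparing the generated
   code against the variant above.  */
int
reduced_init (void)
{
  int ret = 0;
  return ret;
}
```

Comparing the assembly for the two functions, with and without
'-fno-late-combine-instructions', would show whether the unused register
is tied to the uninitialized read.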

But: should we expect '-fno-late-combine-instructions' vs.
'-flate-combine-instructions' to behave in the same way?  (After all,
'%r22' remains unused also with '-flate-combine-instructions', and
doesn't need to be emitted.)  This could, of course, also be an nvptx back
end issue?

I'm happy to supply any dump files etc.  Also, 'tmp-libc_a-lnumeric.i.xz'
is attached if you'd like to reproduce this with your own nvptx target
'cc1':

    $ [...]/configure --target=nvptx-none --enable-languages=c
    $ make -j12 all-gcc
    $ gcc/cc1 -fpreprocessed tmp-libc_a-lnumeric.i -quiet -dumpbase tmp-libc_a-lnumeric.c -dumpbase-ext .c -misa=sm_30 -g -O2 -fno-builtin -o tmp-libc_a-lnumeric.s -fdump-rtl-all # -fno-late-combine-instructions


Grüße
 Thomas


Attachment: tmp-libc_a-lnumeric.i.xz
Description: application/xz
