Hi!

On 2024-06-27T18:49:17+0200, I wrote:
> On 2023-10-24T19:49:10+0100, Richard Sandiford <richard.sandif...@arm.com> wrote:
>> This patch adds a combine pass that runs late in the pipeline.

[After sending, I realized I replied to a previous thread of this work.]

> I've been looking a bit through recent nvptx target code generation
> changes for GCC target libraries, and thought I'd also share here my
> findings for the "late-combine" changes in isolation, for nvptx target.
>
> First the unexpected thing:

So much for "unexpected thing" -- next level of unexpected here...
Appreciated if anyone feels like helping me find my way through this, but
I totally understand if you've got other things to do.

> there are a few cases where we now see unused
> registers get declared, for example (random) in
> 'nvptx-none/newlib/libc/libm_a-s_modf.o:modf'

I first looked into a simpler case: newlib 'libc/locale/lnumeric.c'.
Here we get the following 'diff' for '*.s' for
'-fno-late-combine-instructions' vs. (default)
'-flate-combine-instructions':

     .visible .func (.param.u32 %value_out) __numeric_load_locale (.param.u64 %in_ar0, .param.u64 %in_ar1, .param.u64 %in_ar2, .param.u64 %in_ar3)
     {
             .reg.u32 %value;
             .reg.u64 %ar0;
             ld.param.u64 %ar0, [%in_ar0];
             .reg.u64 %ar1;
             ld.param.u64 %ar1, [%in_ar1];
             .reg.u64 %ar2;
             ld.param.u64 %ar2, [%in_ar2];
             .reg.u64 %ar3;
             ld.param.u64 %ar3, [%in_ar3];
    +        .reg.u32 %r22;
             .file 2 "../../../source-gcc/newlib/libc/locale/lnumeric.c"
             .loc 2 89 1
                     mov.u32 %value, 0;
                     st.param.u32 [%value_out], %value;
             ret;
     }

Clearly, '%r22' is unused. However, looking at the source code (manually
trimmed):

    int
    __numeric_load_locale (struct __locale_t *locale, const char *name ,
                           void *f_wctomb, const char *charset)
    {
      int ret;
      struct lc_numeric_T nm;
      char *bufp = NULL;

    #ifdef __CYGWIN__
      [...]
    #else
      /* TODO */
    #endif
      return ret;
    }

..., and adding '-Wall' (why isn't top-level/newlib build system doing
that...):

    [...]
    ../../../source-gcc/newlib/libc/locale/lnumeric.c:88:10: warning: ‘ret’ is used uninitialized [-Wuninitialized]
       88 |   return ret;
          |          ^~~
    ../../../source-gcc/newlib/libc/locale/lnumeric.c:48:7: note: ‘ret’ was declared here
       48 |   int ret;
          |       ^~~

Uh.
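
To check mechanically, rather than by eye, that a register such as '%r22' is declared but never referenced, a small script can scan the '*.s' output. The following is only a minimal sketch and not part of GCC or any nvptx tooling: the function name 'unused_ptx_regs' and the shortened sample text are made up for illustration, and it assumes the one-declaration-per-statement '.reg.<type> %name;' form seen in the listing above.

```python
import re

# Matches a single-register declaration such as '.reg.u32 %r22;'.
# Assumption: one register per '.reg' statement, as in the listing above.
DECL = re.compile(r"\.reg\.\w+\s+(%\w+)\s*;")


def unused_ptx_regs(ptx: str) -> list[str]:
    """Return registers that are declared with '.reg' but used nowhere else."""
    declared = DECL.findall(ptx)
    # Drop the declarations themselves so that only real uses remain.
    body = DECL.sub("", ptx)
    used = set(re.findall(r"%\w+", body))
    return [reg for reg in declared if reg not in used]


# Shortened excerpt of the function body shown above (illustrative only).
sample = """
.reg.u32 %value;
.reg.u64 %ar0;
ld.param.u64 %ar0, [%in_ar0];
.reg.u32 %r22;
mov.u32 %value, 0;
st.param.u32 [%value_out], %value;
ret;
"""

print(unused_ptx_regs(sample))
```

This is a purely textual heuristic: it does not parse PTX and treats the whole input as one scope, so for a file with several functions it would have to be run on each function body separately.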

Given nothing else is going on in that function, I suppose '%r22' relates
to the uninitialized 'ret' -- and given undefined behavior, GCC of course
is fine to emit an unused 'reg' in that case... But: should we expect
'-fno-late-combine-instructions' vs. '-flate-combine-instructions' to
behave in the same way? (After all, '%r22' remains unused also with
'-flate-combine-instructions', and doesn't need to be emitted.) This
could, of course, also be a nvptx back end issue?

I'm happy to supply any dump files etc. Also,
'tmp-libc_a-lnumeric.i.xz' is attached if you'd like to reproduce this
with your own nvptx target 'cc1':

    $ [...]/configure --target=nvptx-none --enable-languages=c
    $ make -j12 all-gcc
    $ gcc/cc1 -fpreprocessed tmp-libc_a-lnumeric.i -quiet -dumpbase tmp-libc_a-lnumeric.c -dumpbase-ext .c -misa=sm_30 -g -O2 -fno-builtin -o tmp-libc_a-lnumeric.s -fdump-rtl-all # -fno-late-combine-instructions


Grüße
 Thomas