On Sun, 2023-11-19 at 17:47 -0700, Jeff Law wrote: > This is work originally started by Joern @ Embecosm. > > There's been a long standing sense that we're generating too many > sign/zero extensions on the RISC-V port. REE is useful, but it's really > focused on a relatively narrow part of the extension problem. > > What Joern's patch does is introduce a new pass which tracks liveness of > chunks of pseudo regs. Specifically it tracks bits 0..7, 8..15, 16..31 > and 32..63. > > If it encounters a sign/zero extend that sets bits that are never read, > then it replaces the sign/zero extension with a narrowing subreg. The > narrowing subreg usually gets eliminated by subsequent passes (it's just > a copy after all). > > Jivan has done some analysis and found that it eliminates roughly 1% of > the dynamic instruction stream for x264 as well as some redundant > extensions in the coremark benchmark (both on rv64). In my own testing > as I worked through issues on other architectures I clearly saw it > helping in various places within GCC itself or in the testsuite. > > The basic structure is to first do a fairly standard liveness analysis > on the chunks, seeding original state with the liveness data from DF. > Once that's stable, we do a final pass to identify the useless > extensions and transform them into narrowing subregs. > > A few key points to remember. > > For destination processing it is always safe to ignore a destination. > Ignoring a destination merely means that whatever was live after the > given insn will continue to be live before the insn. What is not safe > is to clear a bit in the LIVENOW bitmap for a destination chunk that is > not set. This comes into play with things like STRICT_LOW_PART. > > For source processing the safe thing to do is to set all the chunks in a > register as live. It is never safe to fail to process a source operand. > > When a destination object is not fully live, we try to transfer that > limited liveness to the source operands. So for example if bits 16..63 > are dead in a destination of a PLUS, we need not mark bits 16..63 as > live for the source operands. We have to be careful -- consider a shift > count on a target without SHIFT_COUNT_TRUNCATED set. So we have both a > list of RTL codes where we can transfer liveness and a few codes where > one of the operands may need to be fully live (ex, a shift count) while > the other input may not need to be fully live (value left shifted). > > Locally we have had this enabled at -O1 and above to encourage testing, > but I'm thinking that for the trunk enabling at -O2 and above is the > right thing to do. > > This has (of course) been tested on rv64. It's also been bootstrapped > and regression tested on x86. Bootstrap and regression tested (C only) > for m68k, sh4, sh4eb, alpha. Earlier versions were also bootstrapped > and regression tested on ppc, hppa and s390x (C only for those as well). > It's also been tested on the various crosses in my tester. So we've > got reasonable coverage of 16, 32 and 64 bit targets, big and little > endian, with and without SHIFT_COUNT_TRUNCATED and all kinds of other > oddities. > > The included tests are for RISC-V only because not all targets are going > to have extraneous extensions. There's tests from coremark, x264 and > GCC's bz database. It probably wouldn't be hard to add aarch64 > testscases. The BZs listed are improved by this patch for aarch64. > > Given the amount of work Jivan and I have done, I'm not comfortable > self-approving at this time. I'd much rather have another set of eyes > on the code. Hopefully the code is documented well enough for that to > be useful exercise. > > So, no need to work from Pago Pago for this patch. I may make another > attempt at the eswin conditional move work while working virtually in > Pago Pago though. > > Thoughts, comments, recommendations?
Unfortunately, I get some ICE building stage 1 libgcc with this patch on loongarch64-linux-gnu: during RTL pass: ext_dce ../../../gcc/libgcc/libgcc2.c: In function ‘__absvdi2’: ../../../gcc/libgcc/libgcc2.c:224:1: internal compiler error: Segmentation fault 224 | } | ^ 0x120baa477 crash_signal ../../gcc/gcc/toplev.cc:316 0x1216aeeb4 ext_dce_process_sets ../../gcc/gcc/ext-dce.cc:128 0x1216afbaf ext_dce_process_bb ../../gcc/gcc/ext-dce.cc:647 0x1216afbaf ext_dce ../../gcc/gcc/ext-dce.cc:802 0x1216afbaf execute ../../gcc/gcc/ext-dce.cc:868 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. -- Xi Ruoyao <xry...@xry111.site> School of Aerospace Science and Technology, Xidian University