On Fri, May 27, 2022 at 11:52 PM m <m...@bitsnbites.eu> wrote:
>
> Hello!
>
> I maintain a fork of GCC which adds support for my custom CPU ISA,
> MRISC32 (the machine description can be found here:
> https://github.com/mrisc32/gcc-mrisc32/tree/mbitsnbites/mrisc32/gcc/config/mrisc32
> ).
>
> I recently discovered that scaled index addressing (i.e. MEM[base +
> index * scale]) does not work inside loops, but I have not been able to
> figure out why.
>
> I believe that I have all the plumbing in the MD that's required
> (MAX_REGS_PER_ADDRESS, REGNO_OK_FOR_BASE_P, REGNO_OK_FOR_INDEX_P, etc),
> and I have verified that scaled index addressing is used in trivial
> cases like this:
>
> char carray[100];
> short sarray[100];
> int iarray[100];
> void single_element(int idx, int value) {
>   carray[idx] = value; // OK
>   sarray[idx] = value; // OK
>   iarray[idx] = value; // OK
> }
>
> ...which produces the expected machine code similar to this:
>
>   stb r2, [r3, r1]   // OK
>   sth r2, [r3, r1*2] // OK
>   stw r2, [r3, r1*4] // OK
>
> However, when the array assignment happens inside a loop, only the char
> version uses index addressing. The other sizes (short and int) will be
> transformed into code where the addresses are stored in registers that
> are incremented by +2 and +4 respectively.
>
> void loop(void) {
>   for (int idx = 0; idx < 100; ++idx) {
>     carray[idx] = idx; // OK
>     sarray[idx] = idx; // BAD
>     iarray[idx] = idx; // BAD
>   }
> }
>
> ...which produces:
>
> .L4:
>   sth r1, [r3]     // BAD
>   stw r1, [r2]     // BAD
>   stb r1, [r5, r1] // OK
>   add r1, r1, #1
>   sne r4, r1, #100
>   add r3, r3, #2   // (BAD)
>   add r2, r2, #4   // (BAD)
>   bs  r4, .L4
>
> I would expect scaled index addressing to be used in loops too, just as
> is done for AArch64 for instance. I have dug around in the machine
> description, but I can't really figure out what's wrong.
>
> For reference, here is the same code in Compiler Explorer, including the
> code generated for AArch64 for comparison: https://godbolt.org/z/drzfjsxf7
>
> Passing -da (dump RTL all) to gcc, I can see that the decision to not
> use index addressing has been made already in *.253r.expand.

The problem is that your cost model for indexed addressing is incorrect.
IV-OPTs uses TARGET_ADDRESS_COST to work out the cost of each addressing
form, so if you have not implemented that hook, the default one is used,
and it will be incorrect in many cases.

You can inspect the IV-OPTs costs and related details by using the ivopts
dump: -fdump-tree-ivopts-details

Thanks,
Andrew Pinski

>
> Does anyone have any hints about what could be wrong and where I should
> start looking?
>
> Regards,
>
> Marcus
>