On 2022-06-22, Andrew Pinski wrote:
On Fri, May 27, 2022 at 11:52 PM m <m...@bitsnbites.eu> wrote:
Hello!
I maintain a fork of GCC which adds support for my custom CPU ISA,
MRISC32 (the machine description can be found here:
https://github.com/mrisc32/gcc-mrisc32/tree/mbitsnbites/mrisc32/gcc/config/mrisc32
).
I recently discovered that scaled index addressing (i.e. MEM[base +
index * scale]) does not work inside loops, but I have not been able to
figure out why.
I believe that I have all the plumbing in the MD that's required
(MAX_REGS_PER_ADDRESS, REGNO_OK_FOR_BASE_P, REGNO_OK_FOR_INDEX_P, etc.),
and I have verified that scaled index addressing is used in trivial
cases like this:
char carray[100];
short sarray[100];
int iarray[100];
void single_element(int idx, int value) {
  carray[idx] = value; // OK
  sarray[idx] = value; // OK
  iarray[idx] = value; // OK
}
...which produces the expected machine code similar to this:
stb r2, [r3, r1] // OK
sth r2, [r3, r1*2] // OK
stw r2, [r3, r1*4] // OK
However, when the array assignment happens inside a loop, only the char
version uses index addressing. The other sizes (short and int) will be
transformed into code where the addresses are stored in registers that
are incremented by +2 and +4 respectively.
void loop(void) {
  for (int idx = 0; idx < 100; ++idx) {
    carray[idx] = idx; // OK
    sarray[idx] = idx; // BAD
    iarray[idx] = idx; // BAD
  }
}
...which produces:
.L4:
  sth r1, [r3] // BAD
  stw r1, [r2] // BAD
  stb r1, [r5, r1] // OK
  add r1, r1, #1
  sne r4, r1, #100
  add r3, r3, #2 // (BAD)
  add r2, r2, #4 // (BAD)
  bs r4, .L4
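For readers following along, the difference between the two loop shapes can be written out at the source level. This is only an illustrative sketch: the function and array names are made up for the example, and the second function is a hand-written approximation of what the generated code above does for the short array, not actual compiler output.

```c
#include <stddef.h>

enum { N = 100 };

/* Hypothetical arrays, one per loop form, so the results can be compared. */
short sarray_indexed[N];
short sarray_bumped[N];

/* Form 1: a single induction variable (idx). Each store address is
   base + idx * sizeof(short), which maps onto [base, idx*2] addressing. */
void fill_indexed(void) {
  for (int idx = 0; idx < N; ++idx)
    sarray_indexed[idx] = (short)idx;
}

/* Form 2: a second induction variable (p) that is advanced by one element
   (2 bytes) per iteration, so each store uses a plain [p] address and the
   loop needs an extra add per array. */
void fill_bumped(void) {
  short *p = sarray_bumped;
  for (int idx = 0; idx < N; ++idx, ++p)
    *p = (short)idx;
}
```

Both functions store the same values; they differ only in how many registers are kept live and how many add instructions run per iteration.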
I would expect scaled index addressing to be used in loops too, just as
is done for AArch64 for instance. I have dug around in the machine
description, but I can't really figure out what's wrong.
For reference, here is the same code in Compiler Explorer, including the
code generated for AArch64 for comparison: https://godbolt.org/z/drzfjsxf7
Passing -da (dump RTL all) to gcc, I can see that the decision to not
use index addressing has been made already in *.253r.expand.
The problem is your cost model for the indexing is incorrect; IV-OPTs
uses TARGET_ADDRESS_COST to figure out the cost of each case.
So if you don't have that implemented, then the default one is used
and that will be incorrect in many cases.
You can find IV-OPTs costs and such by using the ivopts dump:
-fdump-tree-ivopts-details
Thanks,
Andrew Pinski
Thank you Andrew!
I added a TARGET_ADDRESS_COST implementation that just returns zero,
as a test, and sure enough scaled indexed addressing was used.
Now I will just have to figure out a more accurate implementation for
my architecture.
Regards,
Marcus
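As a rough illustration of what "a more accurate implementation" might look like, here is a standalone toy model of an address cost function. It is not GCC code: the real hook receives an RTL expression, whereas struct toy_address, toy_address_cost and the cost numbers below are all invented for this sketch. The set of cheap scales (1, 2 and 4) is an assumption based on the addressing forms shown earlier in the thread.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of a memory address of the form
   base + index * scale + offset. This is NOT a GCC data structure. */
struct toy_address {
  bool has_index; /* true if an index register is present */
  int scale;      /* multiplier applied to the index register */
  long offset;    /* constant displacement */
};

/* Toy stand-in for the kind of decision an address cost hook makes:
   return a small number for forms assumed to fit in one instruction
   and a larger one for forms assumed to need extra instructions.
   The values 0 and 4 are arbitrary placeholders. */
int toy_address_cost(const struct toy_address *addr) {
  if (!addr->has_index)
    return 0; /* [base] or [base, #offset]: assumed cheap */

  if (addr->offset != 0)
    return 4; /* base + index + offset: assumed not directly encodable */

  switch (addr->scale) {
  case 1:
  case 2:
  case 4:
    return 0; /* [base, index*scale]: assumed as cheap as [base] */
  default:
    return 4; /* other scales: assumed to need a separate multiply/shift */
  }
}
```

The point of the sketch is only that scaled index forms should not be reported as more expensive than a plain register address if the hardware handles them in a single instruction; otherwise the optimizer has no reason to prefer them over a separate pointer increment.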
Does anyone have any hints about what could be wrong and where I should
start looking?
Regards,
Marcus