Dear all,

I am back with another oddity caused by poor code generation
on my target architecture. I have a very simplified (for the moment)
rtx_costs, and my address_cost is modeled on the i386 version.
I also patched in the whole i386_rtx_cost function, along with its
constraints and predicates, to see whether I had done something wrong,
but I get the same results.

Here is the first of my two functions:

uint64_t foo (uint64_t n, uint64_t m)
{
    uint64_t sum = 0, i;
    for (i = n; i < n + m; i++)
    {
        sum += data[i] + data[i+13];
    }
    return sum;
}

After the prologue of the loop, I get:

    mov r1, theCorrectStartAddress
    load r2,0(r1)
    load r3,104(r1)

However, if I do this:

uint64_t goo (uint64_t i)
{
    return data[i] + data[i+13];
}

I get:
    mov r1, Calculation of data[i]
    mov r2, Calculation of data[i+13]

    ldd r3,0(r1)
    ldd r4,0(r2)


It seems that inside a loop the compiler is able to perform some
optimization that makes use of the offsets, whereas without a loop
each address is calculated separately, so the address-calculation
instructions are emitted twice.
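
To make the expectation concrete, here is a hand-written C sketch of the
access pattern I would like the compiler to find for goo: one base address,
then two loads at byte offsets 0 and 104 (13 * 8). The declaration of data
(64 elements of uint64_t) and the name goo_shared_base are assumptions made
up for this illustration only; the element size is inferred from the
104-byte offset in the loop output above.

```c
#include <stdint.h>

/* Assumed declaration: the real array is not shown in this mail.
   Eight-byte elements match the 104-byte offset (13 * 8) seen in
   the loop version. */
uint64_t data[64];

/* Hypothetical hand-written equivalent of goo: compute the address of
   data[i] once, then read both elements relative to that one base. */
uint64_t goo_shared_base (uint64_t i)
{
    const uint64_t *base = &data[i];   /* single address calculation */
    return base[0] + base[13];         /* offsets 0 and 104 bytes */
}
```

This returns the same value as goo; the only difference is that the shared
base is spelled out in the source instead of being left to the compiler.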

As I said, I replaced the cost function with the x86 version and
got the same thing, so I don't really know where to look. Could the
expansion of my movDI/SI and my instruction definitions have an
effect like this?

Thanks for your input,
Jean Christophe Beyler
