Hi,
I am facing a problem with code hoisting from a loop in our backend.
This problem seems to be hinted at in the GCC documentation for:
-fbranch-target-load-optimize
Perform branch target register load optimization before prologue /
epilogue threading. The use of target registers can typically be exposed
only during reload, thus hoisting loads out of loops and doing
inter-block scheduling needs a separate optimization pass.
However, the problem has nothing to do with this flag.
Our architecture doesn't allow something like:
store 0,(X+2) ; store 0 into mem of X+2
So we need to do
load Y,0 ; load Y with constant zero
store Y,(X+2) ; store the 0 in Y into mem of X+2
If I compile a loop:
int vec[8];
int i = 0;
while (i < 8)
vec[i++] = 0;
I get something like:
load X,#vec
?L2:
load Y,0
store Y,(X+0)
add X,1
The load Y,0, though, should be hoisted out of the loop.
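To make the goal concrete, here is a C-level sketch of the shape I would like the generated code to have, with the zero materialised once before the loop. The names clear_vec_hoisted, VEC_LEN and zero are made up for illustration, and the comments map each statement to the pseudo-assembly above. This only shows the intended result; writing the source this way is not expected to force the compiler to keep the constant in a register.

```c
#include <stddef.h>

#define VEC_LEN 8   /* illustrative name for the array length */

int vec[VEC_LEN];

/* Hypothetical helper: clears vec with the constant loaded once,
   i.e. "load Y,0" placed above the ?L2 label instead of inside it. */
void clear_vec_hoisted(void)
{
    int zero = 0;   /* load Y,0      (once, outside the loop) */
    int *x = vec;   /* load X,#vec */
    size_t n;

    for (n = 0; n < VEC_LEN; n++) {
        *x = zero;  /* store Y,(X+0) */
        x++;        /* add X,1 */
    }
}
```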
I know what the problem is: during expand, GCC has:
For ?L2 bb:
(set (mem/s:QI (reg:QI 41)) (const_int 0))
This is only transformed after the reload into:
(set (reg:QI Y) (const_int 0))
(set (mem/s:QI X) (reg:QI Y))
And so there's no opportunity for hoisting after reload, it seems...
So I thought about splitting the insn into two during expand:
(set (reg:QI Y) (const_int 0))
(set (mem/s:QI X) (reg:QI Y))
But then this is combined by cse into:
(set (mem/s:QI (reg:QI 41)) (const_int 0))
and bammm, same problem. No loop hoisting. What's the best way to handle
this? Any suggestions?
Cheers,
Paulo Matos