Hi,

I am facing a problem with code hoisting from a loop in our backend.

This problem seems to be hinted at in the documentation for:
-fbranch-target-load-optimize
Perform branch target register load optimization before prologue / epilogue threading. The use of target registers can typically be exposed only during reload, thus hoisting loads out of loops and doing inter-block scheduling needs a separate optimization pass.


However, the problem has nothing to do with this flag.
Our architecture doesn't allow something like:

store 0,(X+2) ; store 0 into mem of X+2

So we need to do
load Y,0 ; load Y with constant zero
store Y,(X+2)  ; store the 0 in Y into mem of X+2

If I compile a loop:
int vec[8];
int i = 0;
while (i < 8)      /* increment after the store so the writes stay within vec[0..7] */
  vec[i++] = 0;

I get something like:

  load X,#vec
?L2:
  load Y,0
  store Y,(X+0)
  add X,1

The load Y,0 should be outside the loop, though.
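
To make the intended shape concrete, here is a minimal C sketch of what the hoisted loop would look like if written by hand. The helper name zero_vec and the local variable zero are made up for illustration; zero only stands in for register Y, and nothing here is specific to our backend or to how GCC would actually emit it.

```c
#include <stddef.h>

/* Hypothetical helper: models the desired loop shape, not real backend output. */
static void zero_vec(int *vec, size_t n)
{
    const int zero = 0;          /* models "load Y,0", executed once before the loop */
    size_t i;

    for (i = 0; i < n; i++)
        vec[i] = zero;           /* models "store Y,(X+0)" followed by "add X,1" */
}
```

Calling zero_vec(vec, 8) corresponds to the loop above, with the constant load done once instead of on every iteration.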
I know what the problem is. During expand, GCC has the following for the ?L2 basic block:

(set (mem/s:QI (reg:QI 41)) (const_int 0))

This is only transformed after reload into:

(set (reg:QI Y) (const_int 0))
(set (mem/s:QI X) (reg:QI Y))

So it seems there is no opportunity for hoisting after reload. I thought about splitting the insn into two during expand:

(set (reg:QI Y) (const_int 0))
(set (mem/s:QI X) (reg:QI Y))

But then this is combined by cse into:

(set (mem/s:QI (reg:QI 41)) (const_int 0))

and bammm, same problem. No loop hoisting. What's the best way to handle this? Any suggestions?

Cheers,

Paulo Matos

