Hi, sorry for late response,

On 2021/3/23 16:50, Richard Biener wrote:
>>> It definitely should be before uncprop (but context stops there). And yes,
>>> re-running passes isn't the very, very best thing to do without explaining
>>> it cannot be done in other ways. Not for late stage 3 anyway.
>>>
>>> Richard.
>>>
>> Thanks.  Also tried to implement this in a seperate RTL pass, which
>> would be better?  I guess this would also be stage1 issues...
> Yes, that's also for stage1 obviously.  Can you check how the number
> of sink opportunities of your RTL pass changes if you add the
> 2nd GIMPLE sinking pass?

Number of Instructions sank out of loop when running (no sink2 -> sink2):
    1. SPEC2017 int: 371  ->  142
    2. SPEC2017 float: 949  ->  343
    3. bootstrap: 402  ->  229
    4. stage1 libraries: 115   ->  68
    5. regression tests: 4533 ->  2948

the case I used from the beginning could all be optimized by gimple sink2
instead, but there are still many instructions sunk even gimple sink2 is
added, I guess most of them are produced by expand pass.  One example(It was
after #38 in 262r.reginfo, note that new block is created between
exit->src and exit->dst to avoid other bb jumps into exit->dst cause
execution error due to r132:DI updated unexpectedly.), sometimes extra
zero extend in loop like this could cause serious performance issue:

vect-live-4.ltrans0.ltrans.263r.sink:
...
Loop 2: sinking (set (reg/v:DI 132 [ <retval> ])
    (sign_extend:DI (reg:SI 144))) from bb 7 to bb 11

...

   44: L44:
   35: NOTE_INSN_BASIC_BLOCK 7
   36: r127:DI=r127:DI+0x4
   37: r145:SI=[r127:DI]
   38: r144:SI=r145:SI+0x5
      REG_DEAD r145:SI
   40: r125:DI=r125:DI+0x4
   41: [r125:DI]=r144:SI
      REG_DEAD r144:SI
   42: r146:SI=r126:DI#0-0x1
      REG_DEAD r126:DI
   43: r126:DI=zero_extend(r146:SI)
      REG_DEAD r146:SI
   45: r147:CC=cmp(r126:DI,0)
   46: pc={(r147:CC!=0)?L65:pc}
      REG_DEAD r147:CC
      REG_BR_PROB 941032164
   68: NOTE_INSN_BASIC_BLOCK 11
   69: r132:DI=sign_extend(r144:SI)
      ; pc falls through to BB 8
   65: L65:
   64: NOTE_INSN_BASIC_BLOCK 9
      ; pc falls through to BB 7
   47: L47:
   48: NOTE_INSN_BASIC_BLOCK 8

> 
> I'll note that you are not sinking later stmts first (you only
> walk insns reverse but not BBs).  GIMPLE sinking performs a
> domwalk over post dominators (but it has an easier job because
> of PHIs).  I guess you'd want to walk starting from loop exit
> blocks (from innermost loops as you do) in reverse program order.

Yes, this rtl sink could only sink instruction from *loop header*
to every loop exit blocks in reverse order, it is a bit strict
since this is the first step to see whether it is reasonable to add
such a pass.
For example, if the instruction is in loop body block, sink it out will
cause execution error sometimes as it doesn't have information in it
whether the loop body block will be executed or not, if the loop jumps
from header to exit directly, the instructions sunk from body to exit
will change register value unexpected, seems always_reached and
always_executed in loop-invarinat.c could be reused here to determine
such circumstance?

> 
> I'll also note that you don't have to check whether stmts you
> want to sink have their uses set after it - you can emit copies
> to a new pseudo at the original insn location and use those
> after the loop (that of course comes at some cost).

Not sure whether I understood your point correctly, but the instruction
is still in loop executed loop niter times?
What I am trying to do is move r132:DI=sign_extend(r144:SI)
out of loop, if it was executed in loop 100 times, r132 is not used
in loop, and r144 is not updated after current instruction, then move
it to loop exit to execute one once. 

> 
> Also we already have a sinking pass on RTL which even computes
> a proper PRE on the reverse graph - -fgcse-sm aka store-motion.c.
> I'm not sure whether this deals with non-stores but the
> LCM machinery definitely can handle arbitrary expressions.  I wonder
> if it makes more sense to extend this rather than inventing a new
> ad-hoc sinking pass?

>From the literal, my pass doesn't handle or process store instructions
like store-motion..  Thanks, will check it.

-- 
Thanks,
Xionghu

Reply via email to