Hi,

On 2021/3/26 15:35, Xionghu Luo via Gcc-patches wrote:
>> Also we already have a sinking pass on RTL which even computes
>> a proper PRE on the reverse graph - -fgcse-sm aka store-motion.c.
>> I'm not sure whether this deals with non-stores but the
>> LCM machinery definitely can handle arbitrary expressions.  I wonder
>> if it makes more sense to extend this rather than inventing a new
>> ad-hoc sinking pass?
> From the literal, my pass doesn't handle or process store instructions
> like store-motion..  Thanks, will check it.

Store motion only processes store instructions. It sets up data flow
equations with 4 inputs (st_kill, st_avloc, st_antloc, st_transp) and
solves them globally through the Lazy Code Motion API (5 DF compute
calls), producing 2 outputs (st_delete_map, st_insert_map). Each store
location is represented independently in the input bitmap vectors. The
output says which stores should be deleted and where new ones should be
inserted. The current code does what you said, "emit copies to a new
pseudo at the original insn location and use it in followed bb", so it
is actually "store replacement" rather than "store move". Why not save
one pseudo by moving the store instruction to the target edge directly?

There are many differences between the newly added rtl-sink pass and the
store-motion pass:

1. Store motion moves only store instructions; rtl-sink ignores store
   instructions.
2. Store motion solves a global DF problem; rtl-sink only processes the
   loop header in reverse order, with a dependency check within the loop.
   Take the RTL below as an example: "#538,#235,#234,#233" will all be
   sunk from bb 35 to bb 37 by rtl-sink, but it moves #538 first, then
   #235, because there is a strong dependency between them. This seems
   unlike the LCM framework, which solves everything and does the
   delete-insert in one iteration.

However, there are still some common methods that could be shared, such
as the def-use check (though store-motion is per bb and rtl-sink is per
loop), insert_store, commit_edge_insertions, etc.
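
To make the ordering point concrete, here is a rough Python sketch of a reverse walk over bb 35 from the RTL quoted at the end of this mail. It is only a toy model, not the code in the patch: the tuple encoding, the names `sink_order`, `PINNED` and `USED_ELSEWHERE_IN_LOOP`, and the assumption that bb 34 and bb 36 are the only other in-loop users are all mine, and it ignores memory, edge insertion and every other legality check a real pass needs.

```python
# Toy model of bb 35: (insn uid, register defined, registers used).
# Hypothetical encoding for illustration; not GCC's data structures.
BB35 = [
    (232, "r142", ["r139"]),
    (233, "r371", ["r142"]),
    (234, "r243", ["r371"]),
    (235, "r452", ["r262", "r139"]),
    (538, "r194", ["r452"]),
    (236, "r372", ["r142", "r254"]),
    (237, None, ["r372"]),
]

# Assumption: the compare and the jump must stay in the block.
PINNED = {236, 237}

# Assumption: registers read by the other in-loop insns in the dump
# (bb 34 and bb 36).
USED_ELSEWHERE_IN_LOOP = {
    "r139", "r140", "r251", "r262", "r373", "r374", "r375", "r376",
}


def sink_order(insns, pinned, used_elsewhere):
    """Walk the block bottom-up and return the uids that can be sunk.

    An insn is sinkable when its result is not read by any insn that
    stays in the loop. Insns that stay add their inputs to that set, so
    an insn only becomes sinkable after all of its in-block users have
    been sunk already.
    """
    needed_in_loop = set(used_elsewhere)
    sunk = []
    for uid, dest, uses in reversed(insns):
        if uid not in pinned and dest not in needed_in_loop:
            sunk.append(uid)
        else:
            needed_in_loop.update(uses)
    return sunk


if __name__ == "__main__":
    print(sink_order(BB35, PINNED, USED_ELSEWHERE_IN_LOOP))
```

In this model #235 only becomes a candidate because #538, its sole user, was visited and sunk before it; #232 stays because the pinned compare #236 still reads r142. That order dependence is the part that is awkward to express as independent per-expression bitmaps.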

  508: L508:
  507: NOTE_INSN_BASIC_BLOCK 34
   12: r139:DI=r140:DI
      REG_DEAD r140:DI
  240: L240:
  231: NOTE_INSN_BASIC_BLOCK 35
  232: r142:DI=zero_extend(r139:DI#0)
  233: r371:SI=r142:DI#0-0x1
  234: r243:DI=zero_extend(r371:SI)
      REG_DEAD r371:SI
  235: r452:DI=r262:DI+r139:DI
  538: r194:DI=r452:DI
  236: r372:CCUNS=cmp(r142:DI#0,r254:DI#0)
  237: pc={(geu(r372:CCUNS,0))?L246:pc}
      REG_DEAD r372:CCUNS
      REG_BR_PROB 59055804
  238: NOTE_INSN_BASIC_BLOCK 36
  239: r140:DI=r139:DI+0x1
  241: r373:DI=r251:DI-0x1
  242: r374:SI=zero_extend([r262:DI+r139:DI])
      REG_DEAD r139:DI
  243: r375:SI=zero_extend([r373:DI+r140:DI])
      REG_DEAD r373:DI
  244: r376:CC=cmp(r374:SI,r375:SI)
      REG_DEAD r375:SI
      REG_DEAD r374:SI
  245: pc={(r376:CC==0)?L508:pc}
      REG_DEAD r376:CC
      REG_BR_PROB 1014686028
  246: L246:
  247: NOTE_INSN_BASIC_BLOCK 37
  248: r377:SI=r142:DI#0-0x2
      REG_DEAD r142:DI
  249: r256:DI=zero_extend(r377:SI)

--
Thanks,
Xionghu