On Wed, 21 Dec 2016, Senthil Kumar Selvaraj wrote:

> Hi,
>
>   For this C code (slightly modified from PR 30908)
>
> void wait(int i)
> {
>     while (i-- > 0)
>         asm volatile("nop" ::: "memory");
> }
>
> gcc 4.8 at -Os produces
>
>         jmp     .L2
> .L3:
>         nop
>         decl    %edi
> .L2:
>         testl   %edi, %edi
>         jg      .L3
>         ret
>
> whereas gcc trunk (and 4.9 onwards, from a quick check) produces
>
> .L2:
>         testl   %edi, %edi
>         jle     .L5
>         nop
>         decl    %edi
>         jmp     .L2
> .L5:
>         ret
>
> The code size is identical, but the trunk version executes one more
> instruction every time the loop runs (explicit jump to .L5 with trunk vs
> fallthrough with 4.8) - it's faster only if the loop never runs. This
> happens irrespective of the memory clobber inline assembler statement.
>
> Digging into the dump files, I found that the transformation occurs in
> the bb reorder pass, when it calls cfg_layout_initialize, which
> eventually calls try_redirect_by_replacing_jump with in_cfglayout set to
> true. That function then removes the jump and causes the RTL
> transformation that eventually results in slower code.
>
> Is this intentional? If not, what would be the best way to fix this?
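
To put rough numbers on the comparison quoted above, here is a minimal sketch in C that counts the dynamically executed instructions for the two layouts. It models only the instruction sequences shown in the two listings. The names count_gcc48_layout, count_trunk_layout and trunk_penalty are made up for this illustration, and an instruction count is only a crude proxy for run time, since branch prediction and fetch effects are ignored.

```c
#include <assert.h>

/* gcc 4.8 layout (loop test at the bottom):
       jmp .L2
   .L3: nop; decl %edi
   .L2: testl %edi, %edi; jg .L3
       ret                                                   */
long count_gcc48_layout(int i)
{
    long executed = 1;              /* jmp .L2 */
    for (;;) {
        executed += 2;              /* testl; jg .L3 */
        if (i <= 0)
            break;                  /* jg not taken, fall through to ret */
        executed += 2;              /* nop; decl %edi */
        i--;
    }
    return executed + 1;            /* ret */
}

/* trunk layout (loop test at the top):
   .L2: testl %edi, %edi; jle .L5
       nop; decl %edi; jmp .L2
   .L5: ret                                                  */
long count_trunk_layout(int i)
{
    long executed = 0;
    for (;;) {
        executed += 2;              /* testl; jle .L5 */
        if (i <= 0)
            break;                  /* jle taken, branch to ret */
        executed += 3;              /* nop; decl %edi; jmp .L2 */
        i--;
    }
    return executed + 1;            /* ret */
}

/* Extra instructions the trunk layout executes relative to gcc 4.8. */
long trunk_penalty(int i)
{
    long penalty = count_trunk_layout(i) - count_gcc48_layout(i);
    /* Closed form of this model: one fewer for i <= 0, else i - 1 more. */
    assert(penalty == (i > 0 ? (long)i - 1 : -1));
    return penalty;
}
```

Under this model the 4.8 layout executes 4n + 4 instructions for n iterations and the trunk layout 5n + 3, so trunk is one instruction ahead when the loop never runs, level at n = 1, and n - 1 instructions behind after that.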

I believe that doing BB reorder in CFG layout mode is fundamentally
flawed, but I guess it's wired up so that out-of-CFG layout honors
EDGE_FALLTHRU.

In any case, why does BB reorder not "fix" the "bogus" reorder that
into-CFG-layout performs?

Richard.

> Regards
> Senthil
>
> RTL before and after bbro.
>
> Before:
>
> (jump_insn 24 6 25 2 (set (pc)
>         (label_ref 15)) "pr30908.c":3 678 {jump}
>      (nil)
>  -> 15)
> (barrier 25 24 17)
> (code_label 17 25 12 3 3 "" [1 uses])
> (note 12 17 13 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 13 12 14 3 (parallel [
>             (asm_operands/v ("nop") ("") 0 []
>                  []
>                  [] pr30908.c:4)
>             (clobber (mem:BLK (scratch) [0  A8]))
>             (clobber (reg:CCFP 18 fpsr))
>             (clobber (reg:CC 17 flags))
>         ]) "pr30908.c":4 -1
>      (expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
>         (expr_list:REG_UNUSED (reg:CC 17 flags)
>             (nil))))
> (insn 14 13 15 3 (parallel [
>             (set (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
>                 (plus:SI (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
>                     (const_int -1 [0xffffffffffffffff])))
>             (clobber (reg:CC 17 flags))
>         ]) 210 {*addsi_1}
>      (expr_list:REG_UNUSED (reg:CC 17 flags)
>         (nil)))
> (code_label 15 14 16 4 2 "" [1 uses])
> (note 16 15 18 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> (insn 18 16 19 4 (set (reg:CCNO 17 flags)
>         (compare:CCNO (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
>             (const_int 0 [0]))) "pr30908.c":3 3 {*cmpsi_ccno_1}
>      (nil))
> (jump_insn 19 18 30 4 (set (pc)
>         (if_then_else (gt (reg:CCNO 17 flags)
>                 (const_int 0 [0]))
>             (label_ref 17)
>             (pc))) "pr30908.c":3 646 {*jcc_1}
>      (expr_list:REG_DEAD (reg:CCNO 17 flags)
>         (int_list:REG_BR_PROB 8500 (nil)))
>  -> 17)
> (note 30 19 28 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
> (note 28 30 29 5 NOTE_INSN_EPILOGUE_BEG)
> (jump_insn 29 28 31 5 (simple_return) "pr30908.c":5 708 {simple_return_internal}
>      (nil)
>  -> simple_return)
>
> After:
>
> <snip>
> (code_label 15 6 16 3 2 "" [1 uses])
> (note 16 15 18 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 18 16 19 3 (set (reg:CCNO 17 flags)
>         (compare:CCNO (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
>             (const_int 0 [0]))) "pr30908.c":3 3 {*cmpsi_ccno_1}
>      (nil))
> (jump_insn 19 18 12 3 (set (pc)
>         (if_then_else (le (reg:CCNO 17 flags)
>                 (const_int 0 [0]))
>             (label_ref:DI 34)
>             (pc))) "pr30908.c":3 646 {*jcc_1}
>      (expr_list:REG_DEAD (reg:CCNO 17 flags)
>         (int_list:REG_BR_PROB 1500 (nil)))
>  -> 34)
> (note 12 19 13 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> (insn 13 12 14 4 (parallel [
>             (asm_operands/v ("nop") ("") 0 []
>                  []
>                  [] pr30908.c:4)
>             (clobber (mem:BLK (scratch) [0  A8]))
>             (clobber (reg:CCFP 18 fpsr))
>             (clobber (reg:CC 17 flags))
>         ]) "pr30908.c":4 -1
>      (expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
>         (expr_list:REG_UNUSED (reg:CC 17 flags)
>             (nil))))
> (insn 14 13 35 4 (parallel [
>             (set (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
>                 (plus:SI (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
>                     (const_int -1 [0xffffffffffffffff])))
>             (clobber (reg:CC 17 flags))
>         ]) 210 {*addsi_1}
>      (expr_list:REG_UNUSED (reg:CC 17 flags)
>         (nil)))
> (jump_insn 35 14 36 4 (set (pc)
>         (label_ref 15)) -1
>      (nil)
>  -> 15)
> (barrier 36 35 34)
> (code_label 34 36 30 5 5 "" [1 uses])
> (note 30 34 28 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
> (note 28 30 29 5 NOTE_INSN_EPILOGUE_BEG)
> (jump_insn 29 28 31 5 (simple_return) "pr30908.c":5 708 {simple_return_internal}
>      (nil)
>  -> simple_return)
>

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)