On Wed, 21 Dec 2016, Senthil Kumar Selvaraj wrote:
> Hi,
>
> For this C code (slightly modified from PR 30908)
>
> void wait(int i)
> {
> while (i-- > 0)
> asm volatile("nop" ::: "memory");
> }
>
> gcc 4.8 at -Os produces
>
> jmp .L2
> .L3:
> nop
> decl %edi
> .L2:
> testl %edi, %edi
> jg .L3
> ret
>
> whereas gcc trunk (and 4.9 onwards, from a quick check) produces
>
> .L2:
> testl %edi, %edi
> jle .L5
> nop
> decl %edi
> jmp .L2
> .L5:
> ret
>
> The code size is identical, but the trunk version executes one more
> instruction everytime the loop runs (explicit jump to .L5 with trunk vs
> fallthrough with 4.8) - it's faster only if the loop never runs. This
> happens irrespective of the memory clobber inline assembler statement.
>
> Digging into the dump files, I found that the transformation occurs in
> the bb reorder pass, when it calls cfg_layout_initialize, which
> eventually calls try_redirect_by_replacing_jump with in_cfglayout set to
> true. That function then removes the jump and causes the RTL
> transformation that eventually results in slower code.
>
> Is this intentional? If not, what would be the best way to fix this?
I belive that doing BB reorder in CFG layout mode is fundamentally
flawed but I guess it's wired up so that out-of-CFG layout honors
EDGE_FALLTHRU.
In any way, why does BB reorder not "fix" the "bogus" reorder
into-CFG-layout performs?
Richard.
> Regards
> Senthil
>
> RTL before and after bbro.
>
> Before:
>
> (jump_insn 24 6 25 2 (set (pc)
> (label_ref 15)) "pr30908.c":3 678 {jump}
> (nil)
> -> 15)
> (barrier 25 24 17)
> (code_label 17 25 12 3 3 "" [1 uses])
> (note 12 17 13 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 13 12 14 3 (parallel [
> (asm_operands/v ("nop") ("") 0 []
> []
> [] pr30908.c:4)
> (clobber (mem:BLK (scratch) [0 A8]))
> (clobber (reg:CCFP 18 fpsr))
> (clobber (reg:CC 17 flags))
> ]) "pr30908.c":4 -1
> (expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
> (expr_list:REG_UNUSED (reg:CC 17 flags)
> (nil))))
> (insn 14 13 15 3 (parallel [
> (set (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
> (plus:SI (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
> (const_int -1 [0xffffffffffffffff])))
> (clobber (reg:CC 17 flags))
> ]) 210 {*addsi_1}
> (expr_list:REG_UNUSED (reg:CC 17 flags)
> (nil)))
> (code_label 15 14 16 4 2 "" [1 uses])
> (note 16 15 18 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> (insn 18 16 19 4 (set (reg:CCNO 17 flags)
> (compare:CCNO (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
> (const_int 0 [0]))) "pr30908.c":3 3 {*cmpsi_ccno_1}
> (nil))
> (jump_insn 19 18 30 4 (set (pc)
> (if_then_else (gt (reg:CCNO 17 flags)
> (const_int 0 [0]))
> (label_ref 17)
> (pc))) "pr30908.c":3 646 {*jcc_1}
> (expr_list:REG_DEAD (reg:CCNO 17 flags)
> (int_list:REG_BR_PROB 8500 (nil)))
> -> 17)
> (note 30 19 28 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
> (note 28 30 29 5 NOTE_INSN_EPILOGUE_BEG)
> (jump_insn 29 28 31 5 (simple_return) "pr30908.c":5 708
> {simple_return_internal}
> (nil)
> -> simple_return)
>
> After:
>
> <snip>
> (code_label 15 6 16 3 2 "" [1 uses])
> (note 16 15 18 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 18 16 19 3 (set (reg:CCNO 17 flags)
> (compare:CCNO (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
> (const_int 0 [0]))) "pr30908.c":3 3 {*cmpsi_ccno_1}
> (nil))
> (jump_insn 19 18 12 3 (set (pc)
> (if_then_else (le (reg:CCNO 17 flags)
> (const_int 0 [0]))
> (label_ref:DI 34)
> (pc))) "pr30908.c":3 646 {*jcc_1}
> (expr_list:REG_DEAD (reg:CCNO 17 flags)
> (int_list:REG_BR_PROB 1500 (nil)))
> -> 34)
> (note 12 19 13 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> (insn 13 12 14 4 (parallel [
> (asm_operands/v ("nop") ("") 0 []
> []
> [] pr30908.c:4)
> (clobber (mem:BLK (scratch) [0 A8]))
> (clobber (reg:CCFP 18 fpsr))
> (clobber (reg:CC 17 flags))
> ]) "pr30908.c":4 -1
> (expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
> (expr_list:REG_UNUSED (reg:CC 17 flags)
> (nil))))
> (insn 14 13 35 4 (parallel [
> (set (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
> (plus:SI (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
> (const_int -1 [0xffffffffffffffff])))
> (clobber (reg:CC 17 flags))
> ]) 210 {*addsi_1}
> (expr_list:REG_UNUSED (reg:CC 17 flags)
> (nil)))
> (jump_insn 35 14 36 4 (set (pc)
> (label_ref 15)) -1
> (nil)
> -> 15)
> (barrier 36 35 34)
> (code_label 34 36 30 5 5 "" [1 uses])
> (note 30 34 28 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
> (note 28 30 29 5 NOTE_INSN_EPILOGUE_BEG)
> (jump_insn 29 28 31 5 (simple_return) "pr30908.c":5 708
> {simple_return_internal}
> (nil)
> -> simple_return)
>
>
--
Richard Biener <[email protected]>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB
21284 (AG Nuernberg)