Hi,

  For this C code (slightly modified from PR 30908)

void wait(int i)
{
        while (i-- > 0)
                asm volatile("nop" ::: "memory");
}

  gcc 4.8 at -Os produces

        jmp     .L2
.L3:
        nop
        decl    %edi
.L2:
        testl   %edi, %edi
        jg      .L3
        ret

whereas gcc trunk (and 4.9 onwards, from a quick check) produces

.L2:
        testl   %edi, %edi
        jle     .L5
        nop
        decl    %edi
        jmp     .L2
.L5:
        ret

The code size is identical, but the trunk version executes one more
instruction everytime the loop runs (explicit jump to .L5 with trunk vs
fallthrough with 4.8) - it's faster only if the loop never runs. This
happens irrespective of the memory clobber inline assembler statement.

Digging into the dump files, I found that the transformation occurs in
the bb reorder pass, when it calls cfg_layout_initialize, which
eventually calls try_redirect_by_replacing_jump with in_cfglayout set to
true. That function then removes the jump and causes the RTL
transformation that eventually results in slower code.

Is this intentional? If not, what would be the best way to fix this?

Regards
Senthil

RTL before and after bbro.

Before:

(jump_insn 24 6 25 2 (set (pc)
        (label_ref 15)) "pr30908.c":3 678 {jump}
     (nil)
 -> 15)
(barrier 25 24 17)
(code_label 17 25 12 3 3 "" [1 uses])
(note 12 17 13 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 13 12 14 3 (parallel [
            (asm_operands/v ("nop") ("") 0 []
                 []
                 [] pr30908.c:4)
            (clobber (mem:BLK (scratch) [0  A8]))
            (clobber (reg:CCFP 18 fpsr))
            (clobber (reg:CC 17 flags))
        ]) "pr30908.c":4 -1
     (expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
(insn 14 13 15 3 (parallel [
            (set (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
                (plus:SI (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (reg:CC 17 flags))
        ]) 210 {*addsi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(code_label 15 14 16 4 2 "" [1 uses])
(note 16 15 18 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 18 16 19 4 (set (reg:CCNO 17 flags)
        (compare:CCNO (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
            (const_int 0 [0]))) "pr30908.c":3 3 {*cmpsi_ccno_1}
     (nil))
(jump_insn 19 18 30 4 (set (pc)
        (if_then_else (gt (reg:CCNO 17 flags)
                (const_int 0 [0]))
            (label_ref 17)
            (pc))) "pr30908.c":3 646 {*jcc_1}
     (expr_list:REG_DEAD (reg:CCNO 17 flags)
        (int_list:REG_BR_PROB 8500 (nil)))
 -> 17)
(note 30 19 28 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(note 28 30 29 5 NOTE_INSN_EPILOGUE_BEG)
(jump_insn 29 28 31 5 (simple_return) "pr30908.c":5 708 {simple_return_internal}
     (nil)
 -> simple_return)

After:

<snip>
(code_label 15 6 16 3 2 "" [1 uses])
(note 16 15 18 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 18 16 19 3 (set (reg:CCNO 17 flags)
        (compare:CCNO (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
            (const_int 0 [0]))) "pr30908.c":3 3 {*cmpsi_ccno_1}
     (nil))
(jump_insn 19 18 12 3 (set (pc)
        (if_then_else (le (reg:CCNO 17 flags)
                (const_int 0 [0]))
            (label_ref:DI 34)
            (pc))) "pr30908.c":3 646 {*jcc_1}
     (expr_list:REG_DEAD (reg:CCNO 17 flags)
        (int_list:REG_BR_PROB 1500 (nil)))
 -> 34)
(note 12 19 13 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 13 12 14 4 (parallel [
            (asm_operands/v ("nop") ("") 0 []
                 []
                 [] pr30908.c:4)
            (clobber (mem:BLK (scratch) [0  A8]))
            (clobber (reg:CCFP 18 fpsr))
            (clobber (reg:CC 17 flags))
        ]) "pr30908.c":4 -1
     (expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
(insn 14 13 35 4 (parallel [
            (set (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
                (plus:SI (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (reg:CC 17 flags))
        ]) 210 {*addsi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(jump_insn 35 14 36 4 (set (pc)
        (label_ref 15)) -1
     (nil)
 -> 15)
(barrier 36 35 34)
(code_label 34 36 30 5 5 "" [1 uses])
(note 30 34 28 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(note 28 30 29 5 NOTE_INSN_EPILOGUE_BEG)
(jump_insn 29 28 31 5 (simple_return) "pr30908.c":5 708 {simple_return_internal}
     (nil)
 -> simple_return)

Reply via email to