Useless conditional branches

Alain Ketterlin Tue, 02 Mar 2010 00:56:14 -0800


It looks like gcc sometimes produces "useless" conditional branches.
I've found code like this:


  xor    %edx,%edx
  ; code with no effect on edx (see full code below)
  test   %edx,%edx
  jne    <somewhere else>

The branch on the last line is never taken. Why does gcc generate such
code sequences? Is this patched at runtime, or something? Am I missing
something obvious here?

I append the function's complete code below. There is another
suspicious branch at 0xb31cd8 (never taken, for less obvious
reasons---edx is never zero at that point).

I have found hundreds of such occurrences across the CPU2006 suite. Does
anybody have any idea why this happens? Is there any specific
optimization to enable or disable to avoid such dead edges? Thanks in
advance for any remark/idea/...


This code is from 416.gamess (from SPEC CPU2006), function "formf",
compiled with "gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu9)" (from a
stock Ubuntu 9.10), with options "-g -O3 -march=native
-fno-optimize-sibling-calls". 416.gamess is compiled with gfortran,
but the same thing happens with C or C++ programs. The same also
happens at lower optimization levels (-O1), but less frequently.

uname -m gives "x86_64", and /proc/cpuinfo contains:

vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Duo CPU     P8700  @ 2.53GHz
[...]
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm ida
tpr_shadow vnmi flexpriority

Here is an objdump disassembly of the code, broken into basic-blocks:

0000000000b31c60 <formf_>:
  b31c60: push   %rbx
  b31c61: xor    %edx,%edx
  b31c63: mov    %rsi,%rbx
  b31c66: mov    $0x2,%r9d
  b31c6c: mov    $0xa,%r8d
  b31c72: mov    $0x6,%esi
  b31c77: mov    $0xe,%ecx
  b31c7c: test   %edx,%edx
  b31c7e: jne    b31cda <formf_+0x7a>

  b31c80: mov    $0x3ff8000000000000,%r11
  b31c8a: mov    $0xbfe0000000000000,%r10
  b31c94: mov    %r11,(%rdi)
  b31c97: movq   $0x0,0x40(%rdi)
  b31c9f: movq   $0x0,0x20(%rdi)
  b31ca7: mov    %r10,0x60(%rdi)
  b31cab: xor    %eax,%eax
  b31cad: nopl   (%rax)

  b31cb0: inc    %edx
  b31cb2: movq   $0x0,0x10(%rdi,%rax,8)
  b31cbb: movq   $0x0,0x50(%rdi,%rax,8)
  b31cc4: movq   $0x0,0x30(%rdi,%rax,8)
  b31ccd: movq   $0x0,0x70(%rdi,%rax,8)
  b31cd6: test   %edx,%edx
  b31cd8: je     b31c80 <formf_+0x20>

  b31cda: cmp    $0x2,%edx
  b31cdd: jne    b31d18 <formf_+0xb8>

  b31cdf: mov    $0xbfe0000000000000,%r11
  b31ce9: mov    $0xbfe0000000000000,%r10
  b31cf3: mov    %r11,(%rdi,%r9,8)
  b31cf7: mov    $0x2,%eax
  b31cfc: movq   $0x0,(%rdi,%r8,8)
  b31d04: movq   $0x0,(%rdi,%rsi,8)
  b31d0c: mov    %r10,(%rdi,%rcx,8)
  b31d10: jmp    b31cb0 <formf_+0x50>

  b31d12: nopw   0x0(%rax,%rax,1)

  b31d18: movslq %edx,%rax
  b31d1b: cmp    $0x1,%edx
  b31d1e: movq   $0x0,(%rdi,%rax,8)
  b31d26: movq   $0x0,0x40(%rdi,%rax,8)
  b31d2f: movq   $0x0,0x20(%rdi,%rax,8)
  b31d38: movq   $0x0,0x60(%rdi,%rax,8)
  b31d41: jne    b31cb0 <formf_+0x50>

  b31d47: movq   $0x0,0x50(%rdi,%rax,8)
  b31d50: movq   $0x0,0x30(%rdi,%rax,8)
  b31d59: mov    $0x3fe0000000000000,%r9
  b31d63: mov    $0xbff8000000000000,%r8
  b31d6d: mov    %r9,0x10(%rdi,%rax,8)
  b31d72: mov    %r8,0x70(%rdi,%rax,8)
  b31d77: mov    $0xd0f838,%edx
  b31d7c: mov    $0xd0f7b0,%esi
  b31d81: mov    %rbx,%rdi
  b31d84: xor    %eax,%eax
  b31d86: callq  977b20 <vclr_>
  b31d8b: mov    $0x3fe0000000000000,%rsi
  b31d95: mov    $0x3fe0000000000000,%rcx
  b31d9f: mov    %rsi,(%rbx)
  b31da2: mov    %rcx,0x60(%rbx)
  b31da6: mov    $0xbfe0000000000000,%rdx
  b31db0: mov    $0xbfe0000000000000,%rax
  b31dba: mov    %rdx,0x18(%rbx)
  b31dbe: mov    %rax,0x78(%rbx)
  b31dc2: pop    %rbx
  b31dc3: retq
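
To make the claim about the two branches easier to check, here is a
minimal C sketch that walks the basic blocks listed above while tracking
only %edx. It is a hand transcription of the disassembly, not anything gcc
produced: the names trace_formf_edx and branch_counts are made up for this
illustration, and all memory stores and the call to vclr_ are left out on
the assumption that they do not change %edx before the branches in
question.

```c
#include <assert.h>

/* How often each of the two suspicious branches is taken
   (hypothetical helper type, not part of 416.gamess). */
struct branch_counts {
    int jne_b31c7e_taken; /* test %edx,%edx ; jne b31cda, right after the xor */
    int je_b31cd8_taken;  /* test %edx,%edx ; je  b31c80, right after the inc */
};

/* Walks the basic blocks of formf_, modelling only %edx. */
struct branch_counts trace_formf_edx(void)
{
    struct branch_counts counts = {0, 0};
    unsigned int edx = 0;          /* b31c61: xor %edx,%edx */
    int steps = 0;

    if (edx != 0) {                /* b31c7c/b31c7e: test ; jne b31cda */
        counts.jne_b31c7e_taken++;
        goto block_b31cda;
    }

block_b31c80:
    /* b31c80..b31cad: stores and xor %eax,%eax; %edx is untouched */
block_b31cb0:
    steps++;
    assert(steps < 100);           /* guard: the model is expected to terminate */
    edx++;                         /* b31cb0: inc %edx */
    if (edx == 0) {                /* b31cd6/b31cd8: test ; je b31c80 */
        counts.je_b31cd8_taken++;
        goto block_b31c80;
    }

block_b31cda:
    if (edx == 2) {                /* b31cda/b31cdd: jne b31d18 not taken */
        goto block_b31cb0;         /* b31cdf..b31d10: stores, then jmp b31cb0 */
    }
    /* b31d18..b31d41: stores, then jne b31cb0 unless %edx == 1 */
    if (edx != 1) {
        goto block_b31cb0;
    }
    /* b31d47..b31dc3: stores, call to vclr_, retq */
    return counts;
}
```

Calling trace_formf_edx() from a small main() and printing the two
counters should show both at zero, which matches what I see in the
disassembly. This only models the control flow as listed; it says nothing
about why gcc keeps the tests.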

Let me know if more detail is needed.

-- Alain.
