The following program:

#include <stdio.h>

typedef unsigned char(*Calculable)(void);

unsigned char one() { return 1; }
unsigned char two() { return 2; }

static void print(Calculable calculate)
{
        printf("%d\n", calculate());
        printf("+1: %d\n", calculate() + 1);
}

int main()
{
        print(one);
        print(two);

        return 0;
}



when compiled with GCC 4.5.0 20091211 with -O3 -fwhole-program, outputs the
following relevant chunk of assembly:

00000000004002e0 <_Z3onev>:
  4002e0:       b8 01 00 00 00          mov    eax,0x1
  4002e5:       c3                      ret
  4002e6:       66 2e 0f 1f 84 00 00    nop    WORD PTR cs:[rax+rax*1+0x0]
  4002ed:       00 00 00

00000000004002f0 <_Z3twov>:
  4002f0:       b8 02 00 00 00          mov    eax,0x2
  4002f5:       c3                      ret
  4002f6:       66 2e 0f 1f 84 00 00    nop    WORD PTR cs:[rax+rax*1+0x0]
  4002fd:       00 00 00

0000000000400300 <main>:
  400300:       48 83 ec 08             sub    rsp,0x8
  400304:       be 01 00 00 00          mov    esi,0x1
  400309:       bf 34 04 40 00          mov    edi,0x400434
  40030e:       31 c0                   xor    eax,eax
  400310:       e8 73 02 00 00          call   400588 <pri...@plt>
  400315:       be 02 00 00 00          mov    esi,0x2
  40031a:       bf 2c 04 40 00          mov    edi,0x40042c
  40031f:       31 c0                   xor    eax,eax
  400321:       e8 62 02 00 00          call   400588 <pri...@plt>
  400326:       be 02 00 00 00          mov    esi,0x2
  40032b:       bf 34 04 40 00          mov    edi,0x400434
  400330:       31 c0                   xor    eax,eax
  400332:       e8 51 02 00 00          call   400588 <pri...@plt>
  400337:       be 03 00 00 00          mov    esi,0x3
  40033c:       bf 2c 04 40 00          mov    edi,0x40042c
  400341:       31 c0                   xor    eax,eax
  400343:       e8 40 02 00 00          call   400588 <pri...@plt>
  400348:       31 c0                   xor    eax,eax
  40034a:       48 83 c4 08             add    rsp,0x8
  40034e:       c3                      ret
  40034f:       90                      nop

GCC correctly folds the calls to one() and two() and does all the
compile-time math (yay!). However, the one() and two() functions are still
emitted to the binary even though they are now dead. The concern is that
this hurts code locality on cache-limited platforms where profile-guided
optimization is non-trivial. We are currently seeing some cache misses, and
this example embodies one of the roadblocks to eliminating them while
keeping the code well-encapsulated.
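
One thing that might be worth trying in the meantime is to let the linker
drop the leftovers: -ffunction-sections puts each function in its own
section, and -Wl,--gc-sections asks GNU ld to discard unreferenced sections.
The sketch below is a variant of the test case for experimenting with that.
The build line, the file name demo.c and the helper plus_one() are
assumptions for illustration, not part of the original test case, and this
has not been checked against the 4.5.0 snapshot above.

```c
/* Possible build line (an assumption, not taken from the report):
 *   gcc -O3 -fwhole-program -ffunction-sections -Wl,--gc-sections demo.c
 */
#include <stdio.h>

typedef unsigned char (*Calculable)(void);

unsigned char one(void) { return 1; }
unsigned char two(void) { return 2; }

/* Hypothetical helper, not in the report: returns the value that the
 * report's print() passes to its second printf(). */
int plus_one(Calculable calculate)
{
        return calculate() + 1;
}

/* Same shape as print() in the report, with external linkage so it can
 * be called from a separate main(). */
void print(Calculable calculate)
{
        printf("%d\n", calculate());
        printf("+1: %d\n", plus_one(calculate));
}
```

Calling print(one) and print(two) from main() reproduces the original
program. Whether one() and two() are still present can then be checked by
disassembling the result, as was done above.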


-- 
           Summary: dead code not eliminated during folding with whole-
                    program
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: matt at use dot net
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42371
