http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58863

            Bug ID: 58863
           Summary: for loop not aligned at -O2 or -O3
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ali.baharev at gmail dot com

The for loop in work() is the hotspot:

const int LOOP_BOUND = 200000000;

__attribute__((noinline))
static int add(const int& x, const int& y) {
    return x + y;
}

__attribute__((noinline))
static int work(int xval, int yval) {
    int sum(0);
    for (int i=0; i<LOOP_BOUND; ++i) {
        int x(xval+sum);
        int y(yval+sum);
        int z = add(x, y);
        sum += z;
    }
    return sum;
}

int main(int , char* argv[]) {
    int result = work(*argv[1], *argv[2]);
    return result;
}


Running 

g++ -O2 main.cpp && objdump -d a.out | c++filt

gives

  400598:       41 8d 34 1c             lea    (%r12,%rbx,1),%esi
  [...]
  4005ab:       75 eb                   jne    400598 <work(int, int)+0x18>

According to the documentation:

http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

-falign-loops   Enabled at levels -O2, -O3. 

Judging from the assembly code, gcc does align things to the next 16-byte
boundary by default on this machine in other cases.

If I pass -falign-loops=16 it becomes:

  4005a0:       41 8d 34 1c             lea    (%r12,%rbx,1),%esi
  [...]
  4005b3:       75 eb                   jne    4005a0 <work(int, int)+0x20>

I guess it is also supposed to look like this when just -O2 is passed; at least
that is what the documentation suggests to me.
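
As a quick sanity check on the addresses quoted above, here is a minimal sketch
of the alignment arithmetic. The helper names is_aligned and padding_to are
made up for illustration and are not part of gcc or binutils; the only inputs
are the addresses and offsets from the two objdump listings. Note that
0x400598 - 0x18 = 0x400580, so the entry of work() itself sits on a 16-byte
boundary while the loop head does not.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of gcc or binutils.

// True when addr is a multiple of alignment (alignment must be a power of two).
constexpr bool is_aligned(std::uint64_t addr, std::uint64_t alignment) {
    return (addr & (alignment - 1)) == 0;
}

// Padding bytes needed to move addr up to the next multiple of alignment
// (0 if addr is already aligned; alignment must be a power of two).
constexpr std::uint64_t padding_to(std::uint64_t addr, std::uint64_t alignment) {
    return (alignment - (addr & (alignment - 1))) & (alignment - 1);
}

// Loop head with plain -O2: 8 bytes past a 16-byte boundary.
static_assert(!is_aligned(0x400598, 16), "loop head at -O2 is not 16-byte aligned");
static_assert(padding_to(0x400598, 16) == 8, "8 bytes of padding would be needed");

// Loop head with -falign-loops=16: on a 16-byte boundary.
static_assert(is_aligned(0x4005a0, 16), "loop head with -falign-loops=16 is aligned");

// Entry of work(), derived from the <work(int, int)+0x18> annotation.
static_assert(is_aligned(0x400598 - 0x18, 16), "function entry is 16-byte aligned");
```

Compiling this with g++ -std=c++11 -c is enough: if it compiles, every
assertion holds. The 8 bytes of padding match the shift from 400598 to 4005a0
seen with -falign-loops=16.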
