Using gentoo gcc 3.4.3

This looks similar to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11707
(and they might be the same bug; however, I think I had the problem with 3.3.4 too).

I have also had this problem in other, older versions. In two projects I have
worked on, this has been really annoying. I think that if a loop is unrolled and
the loop variable is eliminated, each use of it should be replaced with a
constant (and any ifs that are then always false should be removed).

That is not what happens:
int test(int v)
{
  int x = 0;
  for (int u=0;u<2;u++)
  {
    if (u>v)  // v is an input argument; the compiler can't decide this at compile time
    {
      if (u%2==1) // can only be true for u==1, so the other iterations do nothing.
        x++;      // I hoped gcc would notice this when unrolling.
    }
  }  
  return x;
}

g++ -O3 -funroll-loops -S simple_test.cpp

gives me the following code:
        .text
        .align 2
        .p2align 4,,15
.globl _Z4testi
        .type   _Z4testi, @function
_Z4testi:
.LFB2:
        pushl   %ebp
.LCFI0:
        xorl    %edx, %edx
        movl    %esp, %ebp
.LCFI1:
        xorl    %eax, %eax
        incl    %eax
        cmpl    8(%ebp), %eax
        jle     .L4
        testb   $1, %al
        setne   %cl
        movzbl  %cl, %eax
        addl    %eax, %edx
.L4:
        popl    %ebp
        movl    %edx, %eax
        ret
.LFE2:
        .size   _Z4testi, .-_Z4testi
        .section        .note.GNU-stack,"",@progbits
        .ident  "GCC: (GNU) 3.4.3-20050110 (Gentoo 3.4.3.20050110-r2, ssp-3.4.3.20050110-0, pie-8.7.7)"

If I unroll manually, like this:

int test(int v)
{
  int x = 0;

  if (0>v)
  {
    if (0%2==1)
      x++;
  }
  if (1>v)
  {
    if (1%2==1)
      x++;
  }
  if (2>v)
  {
    if (2%2==1)
      x++;
  }  
  
  return x;
}

then with just -O3 I get the much nicer:
        .text
        .align 2
        .p2align 4,,15
.globl _Z4testi
        .type   _Z4testi, @function
_Z4testi:
.LFB2:
        pushl   %ebp
.LCFI0:
        xorl    %eax, %eax
        movl    %esp, %ebp
.LCFI1:
        cmpl    $0, 8(%ebp)
        popl    %ebp
        setle   %al
        ret
.LFE2:
        .size   _Z4testi, .-_Z4testi
        .section        .note.GNU-stack,"",@progbits
        .ident  "GCC: (GNU) 3.4.3-20050110 (Gentoo 3.4.3.20050110-r2, ssp-3.4.3.20050110-0, pie-8.7.7)"
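
For reference, the hand-unrolled listing amounts to a single comparison:
`cmpl $0, 8(%ebp)` followed by `setle %al` yields 1 exactly when v <= 0. A
minimal sketch that checks the loop against that folded form follows. The names
test_loop, test_folded and loop_matches_folded are made up for this
illustration and are not part of the original test case.

```cpp
// Loop form, as in the report (iterations u = 0 and u = 1).
int test_loop(int v)
{
  int x = 0;
  for (int u = 0; u < 2; u++)
  {
    if (u > v)
    {
      if (u % 2 == 1)
        x++;
    }
  }
  return x;
}

// What the -O3 output for the hand-unrolled version computes:
// only u == 1 can increment x, so the result is (1 > v), i.e. (v <= 0).
int test_folded(int v)
{
  return v <= 0 ? 1 : 0;
}

// Returns true if both forms agree for every v in [lo, hi].
bool loop_matches_folded(int lo, int hi)
{
  for (int v = lo; v <= hi; v++)
    if (test_loop(v) != test_folded(v))
      return false;
  return true;
}
```

Calling loop_matches_folded(-10, 10) is one quick way to confirm that the
requested simplification does not change the result.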

I have had two cases where this optimization is very important. One is when you
program a chessboard "from within", so to speak. The other was a raytracer I
wrote with a friend. In that situation we had to settle for a not-so-fast
switch (since we did not want to pollute our code with a manual unroll).
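
One way to get a constant loop index per iteration without writing the copies
out by hand is template recursion. The following is only a sketch: the names
Unroll and test_templated are hypothetical, and whether gcc 3.4.3 folds it as
well as the hand-written version is an assumption that has not been verified
here.

```cpp
// U is the current "loop index", N the exclusive upper bound.
// Because U is a template parameter, U % 2 == 1 is a compile-time
// constant in every instantiation.
template <int U, int N>
struct Unroll
{
  static inline int run(int v)
  {
    int x = 0;
    if (U > v)
    {
      if (U % 2 == 1)
        x++;
    }
    return x + Unroll<U + 1, N>::run(v);
  }
};

// End of the recursion: U has reached N, nothing left to do.
template <int N>
struct Unroll<N, N>
{
  static inline int run(int) { return 0; }
};

// Same iterations as the loop "for (int u=0;u<2;u++)".
int test_templated(int v)
{
  return Unroll<0, 2>::run(v);
}
```

The loop body is still written only once, which keeps the source close to the
original loop, at the cost of some template boilerplate.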

The chessboard example (a simple case here: how many knight moves does White
have? We do not consider check, pins, or pieces that are in the way):

int knight_square_count(unsigned char* board)
{
  int count = 0;
  for (int bp=0;bp<64;bp++)
  {
    if (board[bp]==WHITE_KNIGHT)
    {
      if (bp%8>1 && bp/8>0) count++;
      if (bp%8>0 && bp/8>1) count++;
      if (bp%8<6 && bp/8>0) count++;
      if (bp%8<7 && bp/8>1) count++;
      if (bp%8>1 && bp/8<7) count++;
      if (bp%8>0 && bp/8<6) count++;
      if (bp%8<6 && bp/8<7) count++;
      if (bp%8<7 && bp/8<6) count++;
    }
  }
  return count;
}

In the above situation a manual unroll (with -O3) is more than 400% faster
(I have timed it, and it is close to 500%). I thought that one of the main ideas
of unrolling loops was to let every iteration be optimized "on its own" (without
making the source code ugly).
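
Since the eight board-edge tests depend only on bp, a fully unrolled loop can
fold them into one constant per square. A sketch of the same idea done by hand
with a lookup table follows. The value of WHITE_KNIGHT and the names
knight_moves_from and knight_square_count_table are assumptions made for this
example.

```cpp
// Hypothetical piece code; the report does not give the real value.
const unsigned char WHITE_KNIGHT = 2;

// Number of on-board knight moves from square bp (0..63),
// using the same eight edge tests as the loop in the report.
int knight_moves_from(int bp)
{
  int file = bp % 8;
  int rank = bp / 8;
  int n = 0;
  if (file > 1 && rank > 0) n++;
  if (file > 0 && rank > 1) n++;
  if (file < 6 && rank > 0) n++;
  if (file < 7 && rank > 1) n++;
  if (file > 1 && rank < 7) n++;
  if (file > 0 && rank < 6) n++;
  if (file < 6 && rank < 7) n++;
  if (file < 7 && rank < 6) n++;
  return n;
}

// Same result as knight_square_count, but the per-square constants
// are computed once and then looked up.
int knight_square_count_table(const unsigned char* board)
{
  static int moves[64];
  static bool filled = false;
  if (!filled)
  {
    for (int bp = 0; bp < 64; bp++)
      moves[bp] = knight_moves_from(bp);
    filled = true;
  }

  int count = 0;
  for (int bp = 0; bp < 64; bp++)
    if (board[bp] == WHITE_KNIGHT)
      count += moves[bp];
  return count;
}
```

A knight on a corner square contributes 2 and a knight on a central square
contributes 8, just as with the original loop.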

Regards, and thanks for the best (free) compiler
BSc Computer Science
Thorbjørn Martsum

PS: There might be a reason for things being as they are. If so, I just
don't understand it; please explain.

-- 
           Summary: unroll misses simple elimination - works with manual
                    unroll
           Product: gcc
           Version: 3.4.3
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: tlm at daimi dot au dot dk
                CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21827
