I came across some source code that failed to compile with gcc 4.4.3 at -O3
because the kernel killed gcc for using too much memory.  After I minimized the
code (and converted it from C++ to C for simplicity) into the example below,
cc1 from gcc 4.4.3 still took over a minute of CPU time and a gigabyte of RAM
to build it:

$ time /net/test-hsa014/wlam/local/gcc-4.4.3/bin/gcc -Wfatal-errors -c -O3 foo-min.c

real    1m2.488s
user    1m1.213s
sys     0m1.110s

int main()                                                                      
{                                                                               
        unsigned long long table[256];                                          
        unsigned int i;
        for (i=0; i<256; ++i) {
                unsigned long long j;
                unsigned char x=i;
                for (j=0; j<5; ++j) {
                        x += x<<1;
                        x ^= x>>1;
                }
                for (j=0; j<5; ++j) {
                        x ^= x>>1;
                }
                for (j=0; j<5; ++j) {
                        x += x<<1;
                        x ^= x>>1;
                }
                table[i] ^= (((unsigned long long)x)<<16);
        }
        for (i=0; i<256; ++i) {
                if ((table[i]&0xff)==i)
                        return 1;
        }
        return 0;
}

With additional
                for (j=0; j<5; ++j) { ... }
loops, the CPU time and RAM needed to compile the example seemed to grow
disproportionately.  For example, the following longer variation consumed over
10 minutes and 15 GB of RAM before I killed it:

int main()                                   
{                                            
        unsigned long long table[256];       
        unsigned int i;                      
        for (i=0; i<256; ++i) {              
                unsigned long long j;        
                unsigned char x=i;           
                for (j=0; j<5; ++j) {        
                        x += x<<1;           
                        x ^= x>>1;           
                }                            
                for (j=0; j<5; ++j) {
                        x ^= x>>1;
                }
                for (j=0; j<5; ++j) {
                        x += x<<1;
                        x ^= x>>1;
                }
                for (j=0; j<5; ++j) {
                        x += x<<1;
                        x ^= x>>1;
                }
                for (j=0; j<5; ++j) {
                        x += x<<1;
                        x ^= x>>1;
                }
                table[i] ^= (((unsigned long long)x)<<16);
        }
        for (i=0; i<256; ++i) {
                if ((table[i]&0xff)==i)
                        return 1;
        }
        return 0;
}

(The original code I encountered was longer yet...)
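
To see how compile time grows with the number of inner loops, variants of the
test case can be generated mechanically.  The following is only a sketch: the
function name gen_testcase and the file names foo-N.c are hypothetical, and
every generated inner loop uses the two-statement body, whereas the examples
above also mix in a one-statement loop.  Whether these generated variants
trigger the same slowdown has not been verified.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): print a variant of
# the test case containing N copies of the inner "for (j=0; j<5; ++j)" loop.
gen_testcase() {
        n=$1
        # Fixed prologue, copied from the minimized example.
        cat <<'EOF'
int main()
{
        unsigned long long table[256];
        unsigned int i;
        for (i=0; i<256; ++i) {
                unsigned long long j;
                unsigned char x=i;
EOF
        # Emit N identical inner loops.
        k=0
        while [ "$k" -lt "$n" ]; do
                cat <<'EOF'
                for (j=0; j<5; ++j) {
                        x += x<<1;
                        x ^= x>>1;
                }
EOF
                k=$((k + 1))
        done
        # Fixed epilogue, copied from the minimized example.
        cat <<'EOF'
                table[i] ^= (((unsigned long long)x)<<16);
        }
        for (i=0; i<256; ++i) {
                if ((table[i]&0xff)==i)
                        return 1;
        }
        return 0;
}
EOF
}

# Example use (left commented out, since an affected compiler may run for a
# long time; the subshell with "ulimit -v" caps virtual memory, in KB, so a
# runaway cc1 fails instead of exhausting the machine):
#
#   for n in 1 2 3 4; do
#           gen_testcase "$n" > "foo-$n.c"
#           ( ulimit -v 4000000; time gcc -c -O3 "foo-$n.c" )
#   done
```

Timing each generated file in turn should show whether the cost grows roughly
linearly with the number of loops or much faster.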

I built gcc 4.4.3 from the GNU source distribution to use as the reference gcc
4.4.3 above--

$ /net/test-hsa014/wlam/local/gcc-4.4.3/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.4.3/configure
--prefix=/net/test-hsa014/wlam/local/gcc-4.4.3 --disable-multilib
Thread model: posix
gcc version 4.4.3 (GCC)

--but the Fedora variation of version 4.4.3 showed similar behavior:

$ gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.3 20100127 (Red Hat 4.4.3-4) (GCC)


By contrast, the Ubuntu distribution of gcc 4.4.1 (on a different machine with
a similar-speed CPU) completed trivially quickly, as expected--

$ time gcc -Wfatal-errors  -c -O3 -Wall foo.c

real    0m0.091s
user    0m0.050s
sys     0m0.020s

$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.1-4ubuntu9'
--with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared
--enable-multiarch --enable-linker-build-id --with-system-zlib
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls
--enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --disable-werror
--with-arch-32=i486 --with-tune=generic --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu9)

--and the original source code (from which the above examples were minimized)
previously compiled with a Red Hat build of gcc 4.3.0 without complaint:

$ gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-cpu=generic --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC)


-- 
           Summary: [4.4 regression] gcc takes unusually large amounts of
                    memory and time to compile nested for loop at -O3
           Product: gcc
           Version: 4.4.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: wlam at kosmix dot com
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43415
