2009/12/21 Shachar Shemesh <shac...@shemesh.biz>:
> Hi all,
>
> I'm trying, without success, to disable loop unrolling when compiling a
> program with -O3 with gcc (4.4, but I see the same problem with 4.3).
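
For readers following along: loop.c itself is not reproduced in this message. The sketch below is only a guess at its shape, inferred from the disassembly of func further down (eight iterations, each storing zero to a global). The name sink and the volatile qualifier are assumptions; without volatile, gcc would be free to merge the eight stores into one and there would be no loop left to look at.

```c
/* loop.c -- hypothetical reconstruction, NOT the original poster's file.
 * Inferred from the disassembly: func() runs a counter from 0 to 7 and
 * stores zero to a global on every iteration. */

/* Assumed name. Declared volatile so that each store must be emitted
 * and the compiler cannot collapse the loop into a single write. */
volatile int sink;

void func(void)
{
    int i;

    /* Small constant trip count: the kind of loop that is a candidate
     * for complete unrolling at higher optimization levels. */
    for (i = 0; i < 8; i++)
        sink = 0;
}
```

Compiling a file like this with "gcc -c" at different optimization levels and inspecting the result with "objdump -S" should allow the same kind of comparison as in the listings below, although the exact instructions will vary with compiler version and target.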

I am actually very surprised that -O3 unrolls loops. It is not supposed
to. The idea of including -funroll-loops in -O3 was raised quite a few
times and was always rejected. Maybe something changed in recent years.
The documentation certainly does not say loop unrolling is enabled with
either -O2 or -O3.

I suspect the culprit is -ftree-loop-optimize. The gcc documentation
says:

    `-ftree-loop-optimize'
         Perform loop optimizations on trees.  This flag is enabled by
         default at `-O' and higher.

However, the behaviour depends on which optimization level you use.
E.g., -O2 won't unroll no matter what:

    $ gcc -c -O2 -ftree-loop-optimize loop.c
    $ objdump -S loop.o

    loop.o:     file format elf64-x86-64

    Disassembly of section .text:

    0000000000000000 <func>:
       0:   31 c0                   xor    %eax,%eax
       2:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
       8:   83 c0 01                add    $0x1,%eax
       b:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 15 <func+0x15>
      12:   00 00 00
      15:   83 f8 08                cmp    $0x8,%eax
      18:   75 ee                   jne    8 <func+0x8>
      1a:   f3 c3                   repz retq

Try compiling with -O3 -fno-tree-loop-optimize instead, and you will
succeed:

    $ gcc -c -O3 -fno-tree-loop-optimize loop.c
    $ objdump -S loop.o

    loop.o:     file format elf64-x86-64

    Disassembly of section .text:

    0000000000000000 <func>:
       0:   31 c0                   xor    %eax,%eax
       2:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
       8:   83 c0 01                add    $0x1,%eax
       b:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # 15 <func+0x15>
      12:   00 00 00
      15:   83 f8 07                cmp    $0x7,%eax
      18:   7e ee                   jle    8 <func+0x8>
      1a:   f3 c3                   repz retq

Or, if you are primarily interested in code size, as you indicate, why
not -Os?

    $ gcc -c -Os loop.c
    $ objdump -S loop.o

    loop.o:     file format elf64-x86-64

    Disassembly of section .text:

    0000000000000000 <func>:
       0:   31 c0                   xor    %eax,%eax
       2:   ff c0                   inc    %eax
       4:   c7 05 00 00 00 00 00    movl   $0x0,0x0(%rip)        # e <func+0xe>
       b:   00 00 00
       e:   83 f8 08                cmp    $0x8,%eax
      11:   75 ef                   jne    2 <func+0x2>
      13:   c3                      retq

Hope it helps,

--
Oleg Goldshmidt | p...@goldshmidt.org

_______________________________________________
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il