https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288
Bug ID: 67288
Summary: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add)
Product: gcc
Version: 4.9.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: regression
Assignee: unassigned at gcc dot gnu.org
Reporter: christophe.le...@c-s.fr
Target Milestone: ---

The following function (from the Linux kernel, compiled with -O2) produced
good assembly with GCC 4.8.3. With GCC 4.9.3 the output contains several
unnecessary instructions.

    /* L1_CACHE_BYTES = 16 */
    /* L1_CACHE_SHIFT = 4 */

    #define mb() __asm__ __volatile__ ("sync" : : : "memory")

    static inline void dcbf(void *addr)
    {
        __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
    }

    void flush_dcache_range(unsigned long start, unsigned long stop)
    {
        void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
        unsigned int size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
        unsigned int i;

        for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
            dcbf(addr);
        if (i)
            mb();
    }

Result with GCC 4.9.3: (15 insns)

    c000d970 <flush_dcache_range>:
    c000d970:  54 63 00 36  rlwinm  r3,r3,0,0,27
    c000d974:  38 84 00 0f  addi    r4,r4,15
    c000d978:  7c 83 20 50  subf    r4,r3,r4
    c000d97c:  54 89 e1 3f  rlwinm. r9,r4,28,4,31
    c000d980:  4d 82 00 20  beqlr
    c000d984:  55 24 20 36  rlwinm  r4,r9,4,0,27
    c000d988:  39 24 ff f0  addi    r9,r4,-16
    c000d98c:  55 29 e1 3e  rlwinm  r9,r9,28,4,31
    c000d990:  39 29 00 01  addi    r9,r9,1
    c000d994:  7d 29 03 a6  mtctr   r9
    c000d998:  7c 00 18 ac  dcbf    0,r3
    c000d99c:  38 63 00 10  addi    r3,r3,16
    c000d9a0:  42 00 ff f8  bdnz    c000d998 <flush_dcache_range+0x28>
    c000d9a4:  7c 00 04 ac  sync
    c000d9a8:  4e 80 00 20  blr

The following sequence is redundant (shift left 4 bits, subtract 16, shift
right 4 bits, add 1):

    c000d984:  55 24 20 36  rlwinm  r4,r9,4,0,27
    c000d988:  39 24 ff f0  addi    r9,r4,-16
    c000d98c:  55 29 e1 3e  rlwinm  r9,r9,28,4,31
    c000d990:  39 29 00 01  addi    r9,r9,1

Result with GCC 4.8.3 was as expected: (11 insns)

    c000d894 <flush_dcache_range>:
    c000d894:  54 63 00 36  rlwinm  r3,r3,0,0,27
    c000d898:  38 84 00 0f  addi    r4,r4,15
    c000d89c:  7d 23 20 50  subf    r9,r3,r4
    c000d8a0:  55 29 e1 3f  rlwinm. r9,r9,28,4,31
    c000d8a4:  4d 82 00 20  beqlr
    c000d8a8:  7d 29 03 a6  mtctr   r9
    c000d8ac:  7c 00 18 ac  dcbf    0,r3
    c000d8b0:  38 63 00 10  addi    r3,r3,16
    c000d8b4:  42 00 ff f8  bdnz    c000d8ac <flush_dcache_range+0x18>
    c000d8b8:  7c 00 04 ac  sync
    c000d8bc:  4e 80 00 20  blr
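
To illustrate why the four extra instructions in the 4.9.3 output accomplish nothing, here is a minimal C sketch that models them on a 32-bit value. It is not part of the kernel source or of GCC; the names `trip_count_gcc49` and `count_mismatches` are made up for this illustration. The assumption is that `n` stands for r9 after the `rlwinm.`/`beqlr` pair, so it is non-zero and its top 4 bits are clear (it is `size >> 4` with a 32-bit `size`).

```c
#include <stdint.h>

/*
 * Hypothetical model of the four instructions GCC 4.9.3 emits between
 * beqlr and mtctr. n stands for r9 on entry, i.e. size >> 4.
 */
static uint32_t trip_count_gcc49(uint32_t n)
{
    uint32_t r4 = n << 4;   /* rlwinm r4,r9,4,0,27  : shift left by 4  */
    uint32_t r9 = r4 - 16;  /* addi   r9,r4,-16     : subtract 16      */
    r9 >>= 4;               /* rlwinm r9,r9,28,4,31 : shift right by 4 */
    return r9 + 1;          /* addi   r9,r9,1       : add 1            */
}

/*
 * Count the values in [lo, hi] for which the sequence changes n.
 * Assumes hi < UINT32_MAX so that the loop terminates.
 */
static uint32_t count_mismatches(uint32_t lo, uint32_t hi)
{
    uint32_t bad = 0;

    for (uint32_t n = lo; n <= hi; n++)
        if (trip_count_gcc49(n) != n)
            bad++;
    return bad;
}
```

Under those assumptions, `count_mismatches(1, 0x0FFFFFFF)` covers every value r9 can hold at that point and should return 0: for 1 <= n < 2^28 the left shift cannot overflow and the subtraction cannot wrap, so ((n << 4) - 16) >> 4 is n - 1 and the final add restores n. Only n == 0 would behave differently, and that case has already returned through `beqlr`.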